Apache TVM v0.15.0

@ysh329 ysh329 released this 19 Jan 00:56
· 1713 commits to main since this release
a340dbe

Introduction

NOTE: This is the last release before the unity branch becomes the main branch. It contains no unity features.

The TVM community has worked since the v0.14.0 release to deliver the following exciting new improvements! The main tags are listed below (bold text marks areas with a lot of progress):

  • Community, RFCs
  • Adreno, ArmComputeLibrary, Metal, cuda & cutlass & tensorrt, microNPU, Runtime
  • Frontend & Relay
  • Arith, TOPI, TIR, TVMScript
  • Docs, CI, Misc, BugFix

Please visit the full listing of commits for a complete view: v0.14.0...v0.15.0.

Community

  • #16172 - Yixin Dong -> Reviewer
  • #16162 - Shuai Yuan -> Committer
  • #16164 - Qiang Zhang -> Committer
  • #16166 - Bohan Hou -> PMC
  • #16165 - Ruihang Lai -> PMC

RFCs

  • #105 - Add a new backend language: SYCL

Adreno

  • #15991 - [CI] Enhancements to Adreno specific CI utils
  • #15786 - [TOPI] Add conv2d transpose nchw texture schedule

Arith

  • #16227 - Simplify nested if_then_else when constant is appearing in then_expr

ArmComputeLibrary

  • #15990 - [ACL] Update Compute Library to v23.08

Metal

  • #16192 - [Device] Fix metal warp size
  • #16033 - [Codegen] Disable cross-function call in Metal codegen

cuda & cutlass & tensorrt

  • #16061 - [CUDA] Add an option for profiling cuda kernels

microNPU

  • #16003 - [microNPU][ETHOSU] Fix ConcatRewriter args processing
  • #15929 - [microNPU][ETHOSU] Fix rounding mode in requantize operation

Runtime

  • #15896 - [CLML] Fix for CLML ops and enable more test case
  • #16133 - Parallel-for with threading backend
  • #16066 - Support clear global memory allocators
  • #16030 - Introduce TVM_MODULE_VTABLE Macros

BugFix

  • #16269 - Update pillow usage
  • #16272 - Fixed Inappropriate Logical Expression
  • #16216 - [TIR] Fix dynamic smem merge leaf alloc
  • #16190 - Fix the error of reloading the model library on the ROCm platform: "MIOpen Error: No invoker was registered for convolution forward."
  • #16167 - [Relay][Pytorch] Fix missing .dtype
  • #16091 - [Fix] Fix topi.rms_norm with float32 upscale
  • #16081 - [Fix] Broken Windows Build with LLVM
  • #16051 - [Fix][TIR] Fix dtype issues for match_buffer and ramp node
  • #14655 - [VTA] Fix FSIM compile error on macOS
  • #16021 - [FFI] Typo fix of IncRef to DecRef
  • #16010 - [Fix][TIR] fix mul dtype mismatch
  • #16000 - [Fix][TIR] fix symbolic strides lower
  • #15970 - [Hotfix] Mark python-FFI handling with TVM_DLL
  • #15965 - [CI] Better to pass the build folder

CI

  • #16110 - Refactor unittest folder
  • #16055 - Fix broken links about Jenkins
  • #16062 - Use LLVM 17 for tests on ci_arm
  • #16018 - [Tests] Fix work_dir location used by test_micro_tuning_with_meta_schedule
  • #16019 - [Tests] Check int8+int32 testcases in test_estimate_peak_flops_cpu
  • #16017 - [Tests] Fix str vs. int comparison in test_num_threads

Docs

  • #16282 - [Doc] Fix minor error in doc (Add an operator to Relay)
  • #16152 - [DOC] Add v0.14.0 docs to site
  • #16127 - Revert "[#15157][Rust][Doc] Re-enable the Rust documentation build (#15213)"
  • #16097 - Add missing backtick to contribute/code_guide.rst
  • #16089 - Fix error on linting by adding --rev argument
  • #16024 - Update release_process.rst about version number modification

Frontend & Relay

  • #16243 - [TFLite] Add support for quantized mirror pad
  • #15914 - [TFLite] Support quantized SQUARE
  • #16159 - [KERAS] Fix bug concat convert for NCHW
  • #16319 - [Torch] Add aten::broadcast_to
  • #16131 - [Pytorch] Add support for aten::unflatten
  • #16105 - [Pytorch] Add support for aten::bitwise_and
  • #16079 - [Pytorch] Add support for aten::swapaxes operator
  • #15502 - [Pytorch] aten::copy_ support for pytorch
  • #16180 - [Pytorch] Fix bug when converting models with torch.nn.ParameterList
  • #16143 - [Pytorch] Add support for aten::scaled_dot_product_attention
  • #16123 - [Pytorch] Add support for aten::linalg_vector_norm
  • #16171 - [Frontend] Preserve Pytorch Span Names
  • #16217 - [Frontend][QNN] fix access param_debug_name_map to node output name in fx-quantized graph node replacement
  • #16199 - [Frontend] Add support for aten::concat
  • #16151 - conv3d depthwise bug fix
  • #15928 - Expose qnn ops directly from relay.qnn module

TOPI

  • #16259 - Add support for group_conv3d_transpose_ncdhw for generic
  • #16052 - Enhance topi.nn.matmul
  • #16080 - Reduce code redundancy in conv2d weights transformation
  • #16248 - [TOPI] Add support for group_conv1d_transpose_ncw for generic
  • #16106 - [TOPI] Add conv2d NHWC hybrid schedule for arm_cpu

TIR

  • #16239 - [Schedule] TileWithTensorIntrin skip incorrect ComputeInline for input-padding
  • #16236 - ConvertSSA process entry func first
  • #16070 - [Transform] Introduce new InjectPermutedLayout pass
  • #16083 - Enhance Python Type Annotations for TIR Expr
  • #16073 - Support more mma intrinsics and get_mma_intrin_group utility
  • #16076 - Enhance Python Type Annotations for TIR stmt
  • #16074 - Fix the thread binding iter_var dtype in Bind primitive
  • #16063 - Fix pass RenewDefs error in gather/take case
  • #16027 - Fix software pipeline with dynamic loop extent

TVMScript

  • #16271 - Disable concise scoping when the scope stmt is explicitly annotated
  • #16041 - Fix mismatched dtype of IterVar in T.thread_binding
  • #15953 - [TIR] Pretty print TIR LLVM function name
  • #15972 - delete print extra info at parsing

Misc

  • #16279 - replace deprecated np.int with int to avoid crash
  • #16262 - Update conv2d.py
  • #16255 - [Support] Add Interrupt Handling in Pipe
  • #16104 - [LoopPartition] Fix a bug of LoopPartition in single point scenarios
  • #16231 - [Target] Add Jetson AGX Orin tags
  • #16221 - remove deprecated np.int in slice converter (pytorch)
  • #16214 - [Python] Fix setup.py for inplace build
  • #16174 - Bump cryptography from 37.0.2 to 41.0.6 in /docker/python
  • #16202 - Fix IRModule initialization with attrs
  • #16176 - Enable ccache to accelerate contrib compilation
  • #15968 - Add missing backtick
  • #16034 - [Packaging] Include BYOC dynamic libraries into wheel
  • #16087 - Add _ffi_api.py under script folder
  • #16039 - [Target] Support obtain l2 cache size from target
  • #16065 - [Pylint] fix pylint issues from test_random to test_tedd
  • #16031 - [TRT] fix outdated module building method in tensorrt
  • #16032 - [CMake] Use llvm-config to locate Findzstd.cmake
  • #16023 - [Pylint] fix pylint issues for thrust&tflite_runtime&util
  • #15998 - [Codegen] Add shuffle for cuda and metal
  • #16015 - [Pylint] fix pylint issues for cblas
  • #15955 - [FFI][Python] Handle error propagation when line number is missing
  • #15982 - Bump werkzeug from 2.2.3 to 3.0.1 in /apps/microtvm
  • #15966 - [CMake] Fix order of GNUInstallDirs module
  • #15952 - Update ci_arm Docker tag
  • #15940 - [Minor] Fix compilation warnings for clang
  • #15947 - Bump urllib3 from 1.26.9 to 1.26.18 in /docker/python
  • #15835 - [CodeGenC][Redo] Handle GlobalVar callee as internal function call
  • #15945 - Bump urllib3 from 1.26.15 to 1.26.18 in /apps/microtvm