merge master and resolve conflicts (#561)
* fix pose demo and windows build (#307)

* add postprocessing_masks gpu version (#276)

* add postprocessing_masks gpu version

* default device cpu

* pre-commit fix

Co-authored-by: hadoop-basecv <[email protected]>

* fixed a bug that causes text-recognizer to fail when a (non-NULL) empty bboxes list is passed (#310)

* [Fix] include missing <type_traits> for formatter.h (#313)

* fix formatter

* relax GCC version requirement

* [Fix] MMEditing cannot save results when testing (#336)

* fix show

* lint

* remove redundant codes

* resolve comment

* type hint

* docs(build): fix typo (#352)

* docs(build): add missing build option

* docs(build): add onnx install

* style(doc): trim whitespace

* docs(build): revert install onnx

* docs(build): add ncnn LD_LIBRARY_PATH

* docs(build): fix path error

* fix openvino export tmp model, add binary flag (#353)

* init circleci (#348)

* fix wrong input mat type (#362)

* fix wrong input mat type

* fix lint

* fix(docs): remove redundant doc tree (#360)

* fix missing ncnn_DIR & InferenceEngine_DIR (#364)

* Fix mmdet openvino dynamic 300x300 cfg base (#372)

* Fix: add onnxruntime building option in gpu dockerfile (#366)

* Tutorial 03: torch2onnx (#365)

* upload doc

* add images

* resolve comments

* update translation

* [Docs] fix ncnn docs (#378)

* fix ncnn docs

* update 0216

* typo-fix (#397)

* add CUDA_TOOKIT_ROOT_DIR as tensorrt detect dir (#357)

* add CUDA_TOOKIT_ROOT_DIR as tensorrt detect dir

* Update FindTENSORRT.cmake

* Fix docs (#398)

* ort_net ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL (#383)

* fix wrong buffer which will cause onnxruntime-gpu to crash with a segmentation fault (#363)

* fix wrong buffer which will cause onnxruntime-gpu to crash with a segmentation fault

* fix check

* fix build error

* remove unused header

* fix benchmark (#411)

* Add `sm_53` in cuda.cmake for Jetson Nano, which would otherwise crash during SDK prediction. (#407)

* [Fix] fix feature test for `std::source_location` (#416)

* fix feature test for `std::source_location`

* suppress msvc warnings

* fix consistency

* fix format string (#417)

* [Fix] Fix seg name (#394)

* fix seg name

* use default name

Co-authored-by: dongchunyu.vendor <[email protected]>

* [Docs] Add ipython notebook tutorial (#234)

* add ipynb file

* rename file

* add open in colab tag

* fix lint and add img show

* fix open in colab link

* fix comments

* fix pre-commit config

* fix mmpose api (#396)

* fix mmpose api

* use fmt::format instead

* fix potential nullptr access

* [Fix] support latest spdlog (#423)

* support formatting `PixelFormat` & `DataType`

* format enum for legacy spdlog

* fix format

* fix pillarencode (#331)

* fix ONNXRuntime cuda test bug (#438)

* Fix ci in master branch (#441)

* [Doc] Improve Jetson tutorial install doc (#381)

* Improve Jetson build doc

* add torchvision in the doc

* Fix lint

* Fix lint

* Fix lint

* Fix arg bug

* remove incorrect process

* Improve doc

* Add more detail on `Conda`

* Add python version detail

* Install `onnx` instead of `onnxruntime`

* Fix grammar

* Fix grammar

* Update Installation detail and fix some doc detail

* Update how_to_install_mmdeploy_on_jetsons.md

* Fix tensorrt and cudnn path

* Improve FAQ

* Improve FAQs

* pplcv: do not switch branch, since `sm_53` is missing

* Update how_to_install_mmdeploy_on_jetsons.md

* Update how_to_install_mmdeploy_on_jetsons.md

* Update how_to_install_mmdeploy_on_jetsons.md

* Update how_to_install_mmdeploy_on_jetsons.md

* Improve doc

* Update how_to_install_mmdeploy_on_jetsons.md

* export `TENSORRT_DIR`

* Using pre-build cmake to update

* Improve sentence and add jetpack version

* Improve sentence

* move TENSORRT_DIR in the `Make TensorRT env` step

* Improve CUDA detail

* Update how_to_install_mmdeploy_on_jetsons.md

* Update how_to_install_mmdeploy_on_jetsons.md

* Improve conda installation

* Improve TensorRT installation

* Fix lint

* Add pip crash detail and FAQ

* Improve pip crash

* refine the jetson installation guide

* Improve python version

* Improve doc, added some detail

* Fix lint

* Add detail for `Runtime` problem

* Fix word

* Update how_to_install_mmdeploy_on_jetsons.md

Co-authored-by: lvhan028 <[email protected]>

* Version comments added, torch install steps added. (#449)

* [Docs] Fix API documentation (#443)

* [Docs] Fix API documentation

* add onnx dependency in readthedocs.txt

* fix dependencies

* [Fix] Fix display bugs for windows (#451)

* fix issue 330 for windows

* fix code

* fix lint

* fix all platform

* [Docs] Minor fixes and translation of installation tutorial for Jetson (#415)

* minor fixes

* add Jetson installation

* updated zh_cn based on new en version

* If a cuda launch error occurs, verify if cuda device requires top_k t… (#479)

* If a cuda launch error occurs, verify if cuda device requires top_k to be reduced.

* Fixed lint

* Clang format

* Fixed lint, clang-format

* [Fix] set optional arg a default value (#483)

* optional default value

* resolve comments

Co-authored-by: dongchunyu.vendor <[email protected]>

* Update: Optimize document (#484)

* Update: Optimize document

- Minor fixes in styling and grammar
- Add support for Jetson Xavier NX (Tested and worked)
- Add hardware recommendation
- Change JetPack installation guide URL from jp5.0 to jp4.6.1
- Add a note to select "Jetson SDK Components" when using NVIDIA SDK Manager
- Change PyTorch wheel save location
- Add more dependencies needed for torchvision installation. Otherwise installation error
- Simplify torchvision git cloning branch
- Add installation times for torchvision, MMCV, versioned-hdf5, ppl.cv, model converter, SDK libraries
- Delete "snap" from cmake removal as "apt-get purge" is enough
- Add a note on which scenarios you need to append cuda path and libraries to PATH and LD_LIBRARY_PATH
- Simplify MMCV git cloning branch
- Delete "skip if you don't need MMDeploy C/C++ Inference SDK", because that is the only available inference SDK at the moment
- Add more details to object detection demo using C/C++ Inference SDK such as installing MMDetection and converting a model
- Add image of inference result
- Delete "set env for pip" in troubleshooting because this is already mentioned under "installing Archiconda"

Signed-off-by: Lakshantha Dissanayake <[email protected]>

* Fix: note style on doc

* Fix: Trim trailing whitespaces

* Update: add source image before inference

* fix: bbox_nms not onnxizing if batch size > 1 (#501)

A typo prevented NMS from being exported to ONNX correctly when the batch size is static and greater than 1.

* change separator of function marker (#499)

* [docs] Fix typo in tutorial (#509)

* Fix docstring format (#495)

* Fix doc common

* Fix bugs

* Tutorial 04: onnx custom op (#508)

* Add tutorial04

* lint

* add image

* resolve comment

* fix mmseg twice resize (#480)

* fix mmseg twice resize

* remove comment

* Fix mask test with mismatched device (#511)

* align mask output to cpu device

* align ncnn ssd output to torch.Tensor type

* --amend

* compat mmpose v0.26 (#518)

* [Docs] adding new backends when using MMDeploy as a third package (#482)

* update doc

* refine expression

* cn doc

* Tutorial 05: ONNX Model Editing (#517)

* tutorial 05

* Upload image

* resolve comments

* resolve comment

* fix pspnet torchscript conversion (#538)

* fix pspnet torchscript conversion

* resolve comment

* add IR to rewrite

* changing the onnxwrapper script for gpu issue (#532)

* changing the onnxwrapper script

* gpu_issue

* Update wrapper.py

* Update wrapper.py

* Update runtime.txt

* Update runtime.txt

* Update wrapper.py

Co-authored-by: Chen Xin <[email protected]>
Co-authored-by: Shengxi Li <[email protected]>
Co-authored-by: hadoop-basecv <[email protected]>
Co-authored-by: lzhangzz <[email protected]>
Co-authored-by: Yifan Zhou <[email protected]>
Co-authored-by: tpoisonooo <[email protected]>
Co-authored-by: HinGwenWoong <[email protected]>
Co-authored-by: Junjie <[email protected]>
Co-authored-by: hanrui1sensetime <[email protected]>
Co-authored-by: q.yao <[email protected]>
Co-authored-by: Song Lin <[email protected]>
Co-authored-by: zly19540609 <[email protected]>
Co-authored-by: RunningLeon <[email protected]>
Co-authored-by: HinGwenWoong <[email protected]>
Co-authored-by: AllentDan <[email protected]>
Co-authored-by: dongchunyu.vendor <[email protected]>
Co-authored-by: VVsssssk <[email protected]>
Co-authored-by: NagatoYuki0943 <[email protected]>
Co-authored-by: Johannes L <[email protected]>
Co-authored-by: Zaida Zhou <[email protected]>
Co-authored-by: chaoqun <[email protected]>
Co-authored-by: Lakshantha Dissanayake <[email protected]>
Co-authored-by: Yifan Gu <[email protected]>
Co-authored-by: Zhiqiang Wang <[email protected]>
Co-authored-by: sanjaypavo <[email protected]>
1 parent efb26a9 commit 916c25e
Showing 41 changed files with 2,458 additions and 192 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -39,7 +39,7 @@ repos:
     rev: v2.1.0
     hooks:
     - id: codespell
-      args: ["--skip=third_party/*,*.proto"]
+      args: ["--skip=third_party/*,*.ipynb,*.proto"]

   - repo: https://github.com/myint/docformatter
     rev: v1.4
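The effect of the new `*.ipynb` skip entry can be sketched with Python's `fnmatch` — an illustrative approximation of codespell's `--skip` glob handling, not its exact matcher:

```python
from fnmatch import fnmatch

# Skip patterns from the updated codespell hook; glob semantics here
# approximate codespell's --skip option for illustration only.
SKIP = ["third_party/*", "*.ipynb", "*.proto"]

def is_skipped(path: str) -> bool:
    """Return True if a path matches any skip pattern (approximate)."""
    return any(fnmatch(path, pattern) for pattern in SKIP)

print(is_skipped("demo/tutorials/tutorials_1.ipynb"))  # the new notebook is skipped
print(is_skipped("docs/en/01-how-to-build/android.md"))
```

This is why adding the notebook in this commit required widening the skip list: without `*.ipynb`, codespell would lint the notebook's embedded JSON.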
18 changes: 12 additions & 6 deletions csrc/backend_ops/tensorrt/common_impl/nms/allClassNMS.cu
@@ -205,6 +205,18 @@ pluginStatus_t allClassNMS_gpu(cudaStream_t stream, const int num, const int num
       (T_BBOX *)bbox_data, (T_SCORE *)beforeNMS_scores, (int *)beforeNMS_index_array,
       (T_SCORE *)afterNMS_scores, (int *)afterNMS_index_array, flipXY);

+  cudaError_t code = cudaGetLastError();
+  if (code != cudaSuccess) {
+    // Verify if cuda dev0 requires top_k to be reduced;
+    // sm_53 (Jetson Nano) and sm_62 (Jetson TX2) requires reduced top_k < 1000
+    auto __cuda_arch__ = get_cuda_arch(0);
+    if ((__cuda_arch__ == 530 || __cuda_arch__ == 620) && top_k >= 1000) {
+      printf(
+          "Warning: pre_top_k need to be reduced for devices with arch 5.3, 6.2, got "
+          "pre_top_k=%d\n",
+          top_k);
+    }
+  }
   CSC(cudaGetLastError(), STATUS_FAILURE);
   return STATUS_SUCCESS;
 }
@@ -243,13 +255,7 @@ pluginStatus_t allClassNMS(cudaStream_t stream, const int num, const int num_cla
                           const bool isNormalized, const DataType DT_SCORE, const DataType DT_BBOX,
                           void *bbox_data, void *beforeNMS_scores, void *beforeNMS_index_array,
                           void *afterNMS_scores, void *afterNMS_index_array, bool flipXY) {
-  auto __cuda_arch__ = get_cuda_arch(0);  // assume there is only one arch 7.2 device
-  if (__cuda_arch__ == 720 && top_k >= 1000) {
-    printf("Warning: pre_top_k need to be reduced for devices with arch 7.2, got pre_top_k=%d\n",
-           top_k);
-  }
   nmsLaunchConfigSSD lc(DT_SCORE, DT_BBOX);
-
   for (unsigned i = 0; i < nmsFuncVec.size(); ++i) {
     if (lc == nmsFuncVec[i]) {
       DEBUG_PRINTF("all class nms kernel %d\n", i);
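The relocated guard can be mirrored in a few lines of Python (a hypothetical helper with illustrative names, not part of the MMDeploy codebase) to make the arch/top_k relationship explicit: the warning now targets sm_53 and sm_62 instead of sm_72, and only fires after a kernel launch error:

```python
# Illustrative mirror of the check added in allClassNMS_gpu: after a failed
# kernel launch, warn when the device is sm_53 (Jetson Nano) or sm_62
# (Jetson TX2) and pre_top_k has not been reduced below 1000.
REDUCED_TOPK_ARCHS = {530, 620}

def needs_reduced_top_k(cuda_arch: int, top_k: int) -> bool:
    """True when pre_top_k should be lowered for this CUDA arch."""
    return cuda_arch in REDUCED_TOPK_ARCHS and top_k >= 1000

print(needs_reduced_top_k(530, 1000))  # Jetson Nano with a large pre_top_k
print(needs_reduced_top_k(720, 1000))  # sm_72 (Xavier) no longer warns
```

Note the design change in the diff: the check moved from the dispatch function `allClassNMS` into `allClassNMS_gpu`, so it only runs when a launch actually fails rather than on every call.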
484 changes: 484 additions & 0 deletions demo/tutorials/tutorials_1.ipynb

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion docker/GPU/Dockerfile
@@ -82,9 +82,10 @@ RUN cd /root/workspace/mmdeploy &&\
     -DCMAKE_CXX_COMPILER=g++ \
     -Dpplcv_DIR=/root/workspace/ppl.cv/cuda-build/install/lib/cmake/ppl \
     -DTENSORRT_DIR=${TENSORRT_DIR} \
+    -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} \
     -DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
     -DMMDEPLOY_TARGET_DEVICES="cuda;cpu" \
-    -DMMDEPLOY_TARGET_BACKENDS="trt" \
+    -DMMDEPLOY_TARGET_BACKENDS="ort;trt" \
     -DMMDEPLOY_CODEBASES=all &&\
     make -j$(nproc) && make install &&\
     cd install/example && mkdir -p build && cd build &&\
4 changes: 2 additions & 2 deletions docs/en/01-how-to-build/android.md
@@ -76,9 +76,9 @@ export OPENCV_ANDROID_SDK_DIR=${PWD}/OpenCV-android-sdk
 <tr>
 <td>ncnn </td>
 <td>A high-performance neural network inference computing framework supporting for android.</br>
-<b> Now, MMDeploy supports v20211208 and has to use <code>git clone</code> to download it.</b><br>
+<b> Now, MMDeploy supports v20220216 and has to use <code>git clone</code> to download it.</b><br>
 <pre><code>
-git clone -b 20211208 https://github.com/Tencent/ncnn.git
+git clone -b 20220216 https://github.com/Tencent/ncnn.git
 cd ncnn
 git submodule update --init
 export NCNN_DIR=${PWD}