Commit 3fa1582 — docs(zh_cn): add architect (open-mmlab#882), parent 91a060f
# mmdeploy Architecture

This article introduces the function of each directory in mmdeploy and how the project works, from model conversion to actual inference.

## A general look at the directory structure

The entire mmdeploy project can be seen as two fairly independent parts: model conversion and the SDK.

We introduce the whole repo's directory structure and functions here; there is no need to study the source code, just get an impression.

Peripheral directory functions:

```bash
$ cd /path/to/mmdeploy
$ tree -L 1
.
├── CMakeLists.txt # cmake configuration for compiling custom operators and the SDK
├── configs # Algorithm library configs used for model conversion
├── csrc # SDK and custom operators
├── demo # FFI interface examples in various languages, such as csharp, java, python, etc.
├── docker # docker build
├── mmdeploy # python package for model conversion
├── requirements # python requirements
├── service # Some embedded boards cannot run python, so model conversion uses a C/S mode; this directory holds the server code
├── tests # unit tests
├── third_party # 3rd-party dependencies required by the SDK and FFI
└── tools # Tools, the entry point of all functions, such as onnx2xx.py, profile.py, test.py, etc.
```

It should now be clear that:

- model conversion mainly depends on `tools`, `mmdeploy` and a small part of the `csrc` directory;
- the SDK consists of three directories: `csrc`, `third_party` and `demo`.

## Model Conversion

Here we take ViT from mmcls as the example model and ncnn as the example inference backend. Other models and backends work similarly.

Let's take a look at the mmdeploy/mmdeploy directory structure to get an impression:

```bash
.
├── apis # The apis used by tools are implemented here, such as onnx2ncnn.py
│   ├── calibration.py # Collects calibration data, dedicated to TensorRT quantization
│   ├── core # Software infrastructure
│   ├── extract_model.py # Used to export part of an onnx model
│   ├── inference.py # Abstract function that actually calls torch/ncnn-specific inference
│   ├── ncnn # ncnn wrapper
│   └── visualize.py # Also an abstract function that actually calls torch/ncnn-specific inference and visualization
..
├── backend # Backend wrappers
│   ├── base # Since there are multiple backends, a base class with an OO design is needed
│   ├── ncnn # Calls the ncnn python interface for model conversion
│   │   ├── init_plugins.py # Finds the paths of ncnn custom operators and ncnn tools
│   │   ├── onnx2ncnn.py # Wraps `mmdeploy_onnx2ncnn` into a python interface
│   │   ├── quant.py # Wraps `ncnn2int8` into a python interface
│   │   └── wrapper.py # Wraps the pyncnn forward API
..
├── codebase # Algorithm rewriters
│   ├── base # There are multiple algorithms, so a bit of OO design is needed
│   ├── mmcls # mmcls-related model rewrites
│   │   ├── deploy # mmcls implementation of the base abstract task/model/codebase
│   │   └── models # The real model rewrites
│   │       ├── backbones # Rewrites of backbone networks, such as multiheadattention
│   │       ├── heads # Such as MultiLabelClsHead
│   │       ├── necks # Such as GlobalAveragePooling
│..
├── core # Software infrastructure of the rewrite mechanism
├── mmcv # Rewrites of mmcv
├── pytorch # Rewrites of pytorch operators for ncnn, such as Gemm
..
```

Every line above is worth reading; don't skip any.

When you run `tools/deploy.py` to convert ViT, three things happen:

1. The forward of mmcls ViT is rewritten
2. ncnn does not support the `gather` operator, so a custom one is implemented and loaded together with libncnn.so
3. The exported ncnn model is run with real inference, the output is rendered, and the result is checked for correctness

### 1. Rewrite `forward`

Exporting ViT to onnx generates some operators that ncnn does not support well, so mmdeploy's solution is to hijack the forward code and change it, making the exported onnx suitable for ncnn.

For example, the sequence `conv -> shape -> concat_const -> reshape` is rewritten to `conv -> reshape`, trimming off the redundant `shape` and `concat` operators.

All mmcls algorithm rewriters are in the `mmdeploy/codebase/mmcls/models` directory.

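The hijack is driven by a rewriter registry: a decorator maps a function's dotted name to an export-friendly replacement, and the model is patched before onnx export. Below is a minimal self-contained sketch of the idea; the registry, class and function names are illustrative, not mmdeploy's actual API.

```python
# Toy sketch of a forward-rewrite registry; names are illustrative,
# not mmdeploy's real rewriter API.
REWRITERS = {}

def register_rewriter(func_name):
    """Map a dotted method name to its export-friendly replacement."""
    def decorator(rewriter):
        REWRITERS[func_name] = rewriter
        return rewriter
    return decorator

class ViT:
    def forward(self, x):
        # Original forward: dynamic shape logic that would produce
        # shape/concat operators in the exported onnx.
        return f"shape-dependent({x})"

@register_rewriter('ViT.forward')
def vit_forward_static(self, x):
    # Rewritten forward: static reshape, friendly to ncnn export.
    return f"static({x})"

def patch_model(model):
    """Hijack every registered method on the model's class before export."""
    for name, rewriter in REWRITERS.items():
        cls_name, method = name.split('.')
        if type(model).__name__ == cls_name:
            setattr(type(model), method, rewriter)
    return model

model = patch_model(ViT())
print(model.forward('x'))  # the rewritten, export-friendly forward runs
```

In mmdeploy itself the patching happens transparently inside the export context, so users only run `tools/deploy.py` and never call the rewriters directly.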
### 2. Custom Operator

Operators customized for ncnn live in the `csrc/mmdeploy/backend_ops/ncnn/` directory and are loaded together with `libncnn.so` after compilation. In essence this is hotfixing ncnn; the following operators are currently implemented:

- topk
- tensorslice
- shape
- gather
- expand
- constantofshape

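The custom layers themselves are C++, but what each one must compute is fixed by the corresponding ONNX operator. As a readable specification (not the real C++ layer code), here is a pure-Python sketch of the `gather` and `constantofshape` semantics on nested lists:

```python
def gather(data, indices, axis=0):
    """Reference semantics of ONNX Gather for axis=0 on nested lists.

    The ncnn custom layer must reproduce exactly this indexing behaviour;
    this Python version is only a specification sketch, not the C++ op.
    """
    if axis != 0:
        raise NotImplementedError('sketch covers axis=0 only')
    return [data[i] for i in indices]

def constant_of_shape(shape, value=0.0):
    """Reference semantics of ONNX ConstantOfShape: a value-filled tensor."""
    if not shape:
        return value
    return [constant_of_shape(shape[1:], value) for _ in range(shape[0])]

print(gather([[1, 2], [3, 4], [5, 6]], [2, 0]))  # [[5, 6], [1, 2]]
print(constant_of_shape([2, 3], 1.0))            # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```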
### 3. Model Conversion and Testing

We first use the modified `mmdeploy_onnx2ncnn` to convert the model, then run inference with `pyncnn` plus the custom ops.

When encountering a framework such as snpe that does not support python well, we use a C/S mode: wrap a server with a protocol such as gRPC, and forward the real inference output.

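The C/S idea can be sketched without gRPC: the host serializes the input, the device-side server runs inference and sends the output back. The following toy stand-in uses plain TCP and JSON, and fakes the inference by doubling the input; it only illustrates the pattern, not the actual `service/` code.

```python
import json
import socket
import threading

def serve_once(server_sock):
    """Device side: accept one request, run (fake) inference, reply."""
    conn, _ = server_sock.accept()
    with conn:
        request = json.loads(conn.recv(4096).decode())
        # On a real board this would run the converted model (e.g. with snpe).
        output = [2 * v for v in request['input']]
        conn.sendall(json.dumps({'output': output}).encode())

server = socket.socket()
server.bind(('127.0.0.1', 0))  # the OS picks a free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_once, args=(server,), daemon=True).start()

# Host side: forward the input, receive the real inference output back.
client = socket.create_connection(('127.0.0.1', port))
client.sendall(json.dumps({'input': [1.0, 2.0, 3.0]}).encode())
result = json.loads(client.recv(4096).decode())['output']
client.close()
print(result)  # [2.0, 4.0, 6.0]
```

In the real `service/` directory the transport is gRPC and the server executes the converted model on the device.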
For rendering, mmdeploy directly uses the rendering API of the upstream algorithm codebase and does no drawing of its own.

## SDK

After model conversion is completed, the SDK, compiled with C++, can run the model on different platforms.

Let's take a look at the csrc/mmdeploy directory structure:

```bash
.
├── apis # csharp, java, go, Rust and other FFI interfaces
├── backend_ops # Custom operators for each inference framework
├── CMakeLists.txt
├── codebase # The result types preferred by each algorithm codebase, e.g. detection tasks mostly use bbox
├── core # Abstractions of graph, operator, device and so on
├── device # Implementations of the CPU/GPU device abstraction
├── execution # Implementation of the execution abstraction
├── graph # Implementation of the graph abstraction
├── model # Implements both zip-compressed and uncompressed work directories
├── net # Implementations of net, e.g. wrapping the ncnn forward C API
├── preprocess # Preprocessing implementations
└── utils # OpenCV utilities
```

In essence, the SDK designs a set of computational-graph abstractions that schedules, for **multiple models**, the

- preprocess
- inference
- postprocess

stages, while providing FFI in multiple languages.
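That combination could be sketched as chained stages, with one chain per model and chains feeding into each other. The class, stage and label names below are illustrative, not the SDK's C++ API.

```python
# Toy sketch of the SDK pipeline idea: each model contributes a
# preprocess -> inference -> postprocess chain, and the graph schedules
# several such chains in sequence. Names are illustrative only.

class Pipeline:
    def __init__(self, preprocess, inference, postprocess):
        self.stages = [preprocess, inference, postprocess]

    def __call__(self, data):
        for stage in self.stages:
            data = stage(data)
        return data

# A detector followed by a classifier, as two chained pipelines.
detector = Pipeline(
    preprocess=lambda img: {'tensor': img.lower()},            # e.g. resize/normalize
    inference=lambda t: {'bbox': (0, 0, 8, 8), 'crop': t['tensor']},
    postprocess=lambda out: out,                               # e.g. NMS would go here
)
classifier = Pipeline(
    preprocess=lambda det: det['crop'],                        # crop the detected region
    inference=lambda crop: {'label': 'cat', 'score': 0.9},     # fake classifier output
    postprocess=lambda out: out['label'],
)

label = classifier(detector('IMG'))
print(label)  # cat
```

The real SDK expresses this scheduling through its `graph` and `execution` abstractions in C++, which is what allows the same pipeline to run across devices and be exposed through every FFI.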