forked from open-mmlab/mmdeploy
docs(project): sync en and zh docs (open-mmlab#842)
* docs(en): update file structure
* docs(zh_cn): update
* docs(structure): update
* docs(snpe): update
* docs(README): update
* fix(CI): update
* fix(CI): index.rst error
* fix(docs): update
* fix(docs): remove mermaid
* fix(docs): remove useless
* fix(docs): update link
* docs(en): update
* docs(en): update
* docs(zh_cn): remove \[
* docs(zh_cn): format
* docs(en): remove blank
* fix(CI): doc link error
* docs(project): remove "./" prefix
* docs(zh_cn): fix mdformat
* docs(en): update title
* fix(CI): update docs
1 parent 670a504, commit 127125f. Showing 74 changed files with 2,524 additions and 231 deletions.
44 changes: 0 additions & 44 deletions in docs/en/02-how-to-run/how_to_measure_performance_of_models.md (this file was deleted).
New file (67 lines):
# Quantize model

## Why quantization?

A fixed-point model has several advantages over an fp32 model:

- Smaller size: an 8-bit model reduces file size by 75%
- Better cache behavior: because the model is smaller, the cache hit rate improves and inference is faster
- Hardware support: chips tend to have dedicated fixed-point acceleration instructions, which are faster and consume less energy (int8 on a common CPU needs only about 10% of the energy of fp32)
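The 75% figure follows directly from element widths: fp32 stores four bytes per parameter, int8 one. A quick back-of-the-envelope check (the parameter count below is a rough, illustrative value, not an exact model size):

```python
# Storage cost of a weight tensor at a given element width.
def model_size_bytes(num_params: int, bytes_per_param: int) -> int:
    return num_params * bytes_per_param

num_params = 11_700_000                       # roughly ResNet-18-sized
fp32_size = model_size_bytes(num_params, 4)   # fp32 = 4 bytes per parameter
int8_size = model_size_bytes(num_params, 1)   # int8 = 1 byte per parameter
print(f"fp32: {fp32_size / 1e6:.1f} MB, int8: {int8_size / 1e6:.1f} MB")
print(f"size reduction: {1 - int8_size / fp32_size:.0%}")  # 75%
```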

Package size and heat generation are key metrics when evaluating a mobile APP.
On the server side, quantization means you can keep the same QPS while trading the saved compute for higher model accuracy.
## Post-training quantization scheme

Taking the ncnn backend as an example, the complete workflow is as follows:

<div align="center">
  <img src="../_static/image/quant_model.png"/>
</div>

mmdeploy generates a quantization table from the static graph (onnx) and uses backend tools to convert the fp32 model to fixed point.
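The quantization table records per-tensor scales derived from calibration data. A minimal sketch of the underlying idea (symmetric max-abs calibration; this is illustrative, not mmdeploy's or ppq's actual algorithm):

```python
# Symmetric post-training quantization sketch: a calibration pass finds the
# maximum absolute activation value, which fixes the fp32 -> int8 scale.

def calibrate(samples):
    """Derive a per-tensor scale from calibration batches (max-abs method)."""
    max_abs = max(abs(x) for batch in samples for x in batch)
    return max_abs / 127.0                # int8 range used: [-127, 127]

def quantize(values, scale):
    """Map fp32 values to clamped int8 codes."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """Recover approximate fp32 values from int8 codes."""
    return [q * scale for q in qvalues]

calib = [[0.1, -2.54, 1.2], [0.7, 2.0, -0.3]]   # stand-in activations
scale = calibrate(calib)                        # 2.54 / 127 = 0.02
q = quantize([1.0, -0.5], scale)                # [50, -25]
print(q, dequantize(q, scale))
```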

Currently mmdeploy supports PTQ (post-training quantization) with the ncnn backend.

## How to convert model

[After mmdeploy installation](../01-how-to-build/build_from_source.md), install ppq:

```bash
git clone https://github.com/openppl-public/ppq.git
cd ppq
git checkout edbecf4  # this revision includes the required features
pip install -r requirements.txt
python3 setup.py install
```

Back in the mmdeploy directory, enable quantization with the `--quant` option of `tools/deploy.py`:

```bash
cd /path/to/mmdeploy
export MODEL_CONFIG=/path/to/mmclassification/configs/resnet/resnet18_8xb16_cifar10.py
export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth

python3 tools/deploy.py configs/mmcls/classification_ncnn-int8_static.py ${MODEL_CONFIG} ${MODEL_PATH} /path/to/self-test.png --work-dir work_dir --device cpu --quant --quant-image-dir /path/to/images
...
```

Parameter description:

| Parameter         | Meaning                                                             |
| :---------------: | :-----------------------------------------------------------------: |
| --quant           | Enable quantization; defaults to False                              |
| --quant-image-dir | Calibration dataset; defaults to the Validation Set in MODEL_CONFIG |

## Custom calibration dataset

The calibration set is used to compute the quantization layer parameters. Some DFQ (Data-Free Quantization) methods do not even require a dataset.

- Create a new folder and just put the images in it (no directory structure, no negative examples, and no particular filename format are required)
- The images must come from a real scenario, otherwise accuracy will drop
- Do not calibrate with the test dataset

| Type  | Train dataset | Validation dataset | Test dataset  | Calibration dataset |
| ----- | ------------- | ------------------ | ------------- | ------------------- |
| Usage | QAT           | PTQ                | Test accuracy | PTQ                 |
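Assembling such a folder can be as simple as sampling images from your real-scenario data. A hypothetical helper (paths and sample count are placeholders, not part of mmdeploy):

```python
# Copy a random sample of real-scenario images into a flat folder suitable
# for --quant-image-dir. Function name and defaults are made up for this sketch.
import random
import shutil
from pathlib import Path

def build_calib_dir(src: Path, dst: Path, n: int = 200, seed: int = 0) -> int:
    """Flatten a random sample of images from src into dst; return the count."""
    images = [p for p in src.rglob("*")
              if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]
    random.Random(seed).shuffle(images)
    dst.mkdir(parents=True, exist_ok=True)
    for i, p in enumerate(images[:n]):
        # No naming scheme is required; sequential names keep it simple.
        shutil.copy(p, dst / f"{i}{p.suffix.lower()}")
    return min(n, len(images))
```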

It is highly recommended to [verify model precision](profile_model.md) after quantization. [Here](../03-benchmark/quantization.md) are some quantization test results.
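Precision verification boils down to comparing metrics such as top-1 accuracy before and after quantization. A generic sketch of that comparison (illustrative only; mmdeploy's own test tooling handles this for real deployments):

```python
# Compare top-1 accuracy of fp32 and int8 predictions on the same labels.
def top1_accuracy(predictions, labels):
    """predictions and labels are parallel lists of class ids."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

labels = [1, 2, 3, 3]                 # made-up ground truth
fp32_preds = [1, 2, 3, 1]             # made-up fp32 outputs
int8_preds = [1, 2, 0, 1]             # made-up int8 outputs
print(top1_accuracy(fp32_preds, labels),
      top1_accuracy(int8_preds, labels))   # 0.75 0.5
```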
New file (27 lines):
# Quantization test result

Currently mmdeploy supports ncnn quantization.

## Quantize with ncnn

### mmcls
| model | dataset | fp32 top-1 (%) | int8 top-1 (%) |
| :---: | :-----: | :------------: | :------------: |
| [ResNet-18](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnet/resnet18_8xb16_cifar10.py) | Cifar10 | 94.82 | 94.83 |
| [ResNeXt-32x4d-50](https://github.com/open-mmlab/mmclassification/blob/master/configs/resnext/resnext50-32x4d_8xb32_in1k.py) | ImageNet-1k | 77.90 | 78.20\* |
| [MobileNet V2](https://github.com/open-mmlab/mmclassification/blob/master/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py) | ImageNet-1k | 71.86 | 71.43\* |
| [HRNet-W18\*](https://github.com/open-mmlab/mmclassification/blob/master/configs/hrnet/hrnet-w18_4xb32_in1k.py) | ImageNet-1k | 76.75 | 76.25\* |

Note:

- Because the ImageNet-1k dataset is large and ncnn has not yet released a Vulkan int8 version, only part of the test set (4000/50000) is used.
- Accuracy varies after quantization; a shift of less than 1% for a classification model is normal.

### OCR detection

| model | dataset | fp32 hmean | int8 hmean |
| :---: | :-----: | :--------: | :--------: |
| [PANet](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) | ICDAR2015 | 0.795 | 0.792 @thr=0.9 |

Note: [mmocr](https://github.com/open-mmlab/mmocr) uses `shapely` to compute IoU, which causes a slight difference in accuracy.
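For reference, polygon IoU with `shapely` looks like the sketch below (the quadrilaterals are made-up examples, not mmocr's evaluation code):

```python
# Polygon IoU via shapely: intersection area divided by union area.
from shapely.geometry import Polygon

def poly_iou(a, b):
    """IoU of two polygons given as lists of (x, y) corner tuples."""
    pa, pb = Polygon(a), Polygon(b)
    union = pa.union(pb).area
    return pa.intersection(pb).area / union if union > 0 else 0.0

box_a = [(0, 0), (2, 0), (2, 2), (0, 2)]   # hypothetical detection
box_b = [(1, 1), (3, 1), (3, 3), (1, 3)]   # hypothetical ground truth
print(poly_iou(box_a, box_b))              # 1 / 7, about 0.143
```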