A fixed-point model has several advantages over an fp32 model:
- Smaller size: an 8-bit model reduces file size by about 75%
- Thanks to the smaller model, the cache hit rate improves and inference is faster
- Chips tend to have dedicated fixed-point acceleration instructions, which are faster and consume less energy (int8 on a common CPU requires only about 10% of the energy)
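The 75% figure follows from storage width alone: each weight shrinks from 32 bits to 8. The sketch below illustrates that arithmetic together with a minimal symmetric per-tensor int8 quantization in plain Python. The parameter count, the sample weights, and the helper names are hypothetical, and this is not necessarily the scheme that ncnn or ppq applies.

```python
def model_size_bytes(num_params, bits_per_weight):
    """Approximate weight storage, ignoring file headers and stored scales."""
    return num_params * bits_per_weight // 8


def quantize_symmetric_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


num_params = 11_000_000  # hypothetical parameter count
fp32_size = model_size_bytes(num_params, 32)
int8_size = model_size_bytes(num_params, 8)
reduction = 1 - int8_size / fp32_size  # 0.75

weights = [0.5, -1.27, 0.02, 1.0]  # hypothetical fp32 weights
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)
print(f"size reduction: {reduction:.0%}, int8 codes: {codes}")
```

The rounding error of each restored weight is bounded by half the scale, which is why a tensor with a few very large outliers quantizes poorly under this simple scheme.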
The size of the installation package and heat generation are key metrics when evaluating a mobile app. On the server side, quantization means you can deploy a larger model for better accuracy while keeping the same QPS.
Taking the ncnn backend as an example, the complete workflow is as follows:
mmdeploy generates a quantization table from the static graph (onnx) and uses the backend tools to convert the fp32 model to fixed point.
mmdeploy currently supports post-training quantization (PTQ) with the ncnn backend.
After installing mmdeploy, install ppq:
git clone https://github.com/openppl-public/ppq.git
cd ppq
git checkout edbecf4 # import some feature
pip install -r requirements.txt
python3 setup.py install
Back in mmdeploy, enable quantization by passing the '--quant' option to 'tools/deploy.py'.
cd /path/to/mmdeploy
export MODEL_CONFIG=/path/to/mmclassification/configs/resnet/resnet18_8xb16_cifar10.py
export MODEL_PATH=https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_b16x8_cifar10_20210528-bd6371c8.pth
python3 tools/deploy.py configs/mmcls/classification_ncnn-int8_static.py ${MODEL_CONFIG} ${MODEL_PATH} /path/to/self-test.png --work-dir work_dir --device cpu --quant --quant-image-dir /path/to/images
...
Description
Parameter | Meaning |
---|---|
--quant | Enable quantization, the default value is False |
--quant-image-dir | Calibration dataset directory; the validation set in MODEL_CONFIG is used by default |
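When the deployment is scripted, the same invocation can be assembled programmatically. The following is a minimal sketch, assuming the positional order deploy config, model config, checkpoint, test image; `build_deploy_command` is a hypothetical helper, not part of mmdeploy, and it only builds the argument list without running anything.

```python
def build_deploy_command(deploy_cfg, model_cfg, checkpoint, image,
                         work_dir="work_dir", device="cpu",
                         quant=False, quant_image_dir=None):
    """Return the argument list for tools/deploy.py (hypothetical helper)."""
    cmd = [
        "python3", "tools/deploy.py",
        deploy_cfg, model_cfg, checkpoint, image,
        "--work-dir", work_dir,
        "--device", device,
    ]
    if quant:
        cmd.append("--quant")
        # Only meaningful together with --quant; when omitted, the
        # validation set from the model config is used for calibration.
        if quant_image_dir is not None:
            cmd += ["--quant-image-dir", quant_image_dir]
    return cmd


cmd = build_deploy_command(
    "configs/mmcls/classification_ncnn-int8_static.py",
    "/path/to/model_config.py",   # placeholder path
    "/path/to/checkpoint.pth",    # placeholder path
    "/path/to/self-test.png",
    quant=True,
    quant_image_dir="/path/to/images",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the mmdeploy root directory.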
The calibration set is used to compute the quantization parameters of each layer. Some DFQ (Data Free Quantization) methods do not even require a dataset.
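To make the role of the calibration set concrete, here is a minimal sketch of min-max calibration: the activations observed on calibration images determine the scale that maps floats to int8. The function name and sample values are hypothetical, and production tools typically use more robust statistics than a plain maximum.

```python
def calibrate_scale(activation_batches, num_bits=8):
    """Derive a symmetric quantization scale from observed activations."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = 0.0
    for batch in activation_batches:
        for value in batch:
            max_abs = max(max_abs, abs(value))
    # Fall back to 1.0 if every observed activation was zero.
    return max_abs / qmax if max_abs > 0 else 1.0


# Hypothetical activations of one layer, collected over two calibration images.
observed = [[0.1, -2.54, 0.7], [1.0, 0.3, -0.2]]
scale = calibrate_scale(observed)
print(f"layer scale: {scale:.4f}")
```

Because the scale depends on the largest observed activation, calibration images that differ from real inputs yield a scale that either clips real activations or wastes integer range.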
- Create a new folder and put the images in it (no directory structure, negative examples, or filename format required)
- The images should come from the real deployment scenario; otherwise accuracy will drop
- Do not calibrate the model with the test dataset
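One way to follow these guidelines is to sample images into a flat folder while skipping anything that also appears in the test set. The sketch below assumes images are identified by file name, which is a simplification; `build_calibration_folder` and the demo directory layout are hypothetical.

```python
import random
import shutil
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def build_calibration_folder(src_dir, dst_dir, num_images, exclude_dir=None, seed=0):
    """Copy up to num_images randomly chosen images from src_dir into a flat dst_dir."""
    excluded = set()
    if exclude_dir is not None:
        # Assumption: a test image is recognized by its file name.
        excluded = {p.name for p in Path(exclude_dir).rglob("*") if p.is_file()}
    candidates = sorted(
        p for p in Path(src_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p.name not in excluded
    )
    random.Random(seed).shuffle(candidates)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, path in enumerate(candidates[:num_images]):
        # Rename on copy so files from different subfolders cannot collide.
        target = dst / f"{index:05d}{path.suffix.lower()}"
        shutil.copy2(path, target)
        copied.append(target)
    return copied


# Demo on a throwaway directory tree with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a/1.jpg", "a/2.png", "b/3.jpg", "b/notes.txt"):
        f = root / "val" / name
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_bytes(b"placeholder")
    (root / "test").mkdir()
    (root / "test" / "3.jpg").write_bytes(b"placeholder")
    copied = build_calibration_folder(root / "val", root / "calib", 10,
                                      exclude_dir=root / "test")
    copied_names = sorted(p.name for p in copied)
    copied_suffixes = sorted(p.suffix for p in copied)
```

The destination folder can then be passed to `--quant-image-dir`.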
Type | Train dataset | Validation dataset | Test dataset | Calibration dataset |
---|---|---|---|---|
Usage | QAT | PTQ | Test accuracy | PTQ |
It is highly recommended to verify model precision after quantization. Here are some test results for quantized models.