- Support optimization for models from different kinds of deep learning architectures, e.g. TensorFlow/Caffe/PyTorch.
- Support compiling models for different runtimes, e.g. OpenVINO IR/TensorFlow Serving/TensorFlow Lite/TensorRT.
- Simplified interfaces for the workflow.
- Prepare a model file in the specified format, such as `H5`, `Checkpoint`, `Frozen Graph` or `ONNX`, as in the sketch below.
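
  The model file can come from any framework exporter; as a hedged illustration (layer sizes and the file name are placeholders, not part of model_compiler), the snippet below builds a small tf.keras model and saves it as an H5 file:

  ```python
  import tensorflow as tf

  # Build (or load) a model with tf.keras, then export it in one of the
  # accepted formats; H5 is used here as an example.
  model = tf.keras.Sequential([
      tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
      tf.keras.layers.Dense(10, activation="softmax"),
  ])
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

  # Saving with an .h5 suffix produces an HDF5 (H5) model file.
  model.save("model.h5")
  ```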
- Create a JSON file which must match config_schema.json; see the sketch after this item.
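
  A hedged sketch of writing such a file from Python: the authoritative key names and allowed values are defined by config_schema.json in this repository, and the keys shown here (`serving_type`, `model_name`, etc.) are illustrative assumptions only.

  ```python
  import json

  # Illustrative keys only; config_schema.json defines the real schema.
  config = {
      "serving_type": "tf",                # assumed key: target runtime
      "model_name": "mnist",               # assumed key: name of the served model
      "version": 1,                        # assumed key: model version
      "max_batch_size": 128,               # assumed key: maximum serving batch size
      "input_model": "model.h5",           # assumed key: path to the prepared model file
      "export_path": "./compiled_models",  # assumed key: where the compiled model goes
  }

  with open("serving_model.json", "w") as config_file:
      json.dump(config, config_file, indent=2)
  ```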
- Install model_compiler and compile the model; you can refer to the benchmark or the examples:

  ```sh
  cd {Adlik_root_dir}/model_compiler
  python3 -m pip install .
  ```
- When you compile the model to the TensorFlow Lite runtime, you can quantize the model by setting the parameters `optimization`, `supported_types`, `supported_ops`, `inference_input_type` and `inference_output_type` in the JSON file or in environment variables; you can refer to the documentation on quantizing a TensorFlow Lite model. A hedged sketch follows this item.
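
  The sketch below writes a TensorFlow Lite config that sets the five quantization parameters named above; the surrounding keys and the concrete values (e.g. `"INT8"`, `"TFLITE_BUILTINS_INT8"`) are assumptions illustrating a full-integer setup, and config_schema.json remains the authority on accepted spellings.

  ```python
  import json

  # Quantization-related keys for a TensorFlow Lite target; values are
  # assumptions illustrating full-integer quantization.
  tflite_config = {
      "serving_type": "tflite",                   # assumed key: compile for TensorFlow Lite
      "input_model": "model.h5",                  # assumed key: the prepared model file
      "export_path": "./compiled_models",         # assumed key: output directory
      "optimization": True,                       # turn on post-training optimization
      "supported_types": ["INT8"],                # restrict weight/activation types
      "supported_ops": ["TFLITE_BUILTINS_INT8"],  # restrict the converter op set
      "inference_input_type": "INT8",             # quantized input tensor type
      "inference_output_type": "INT8",            # quantized output tensor type
  }

  with open("tflite_serving_model.json", "w") as config_file:
      json.dump(tflite_config, config_file, indent=2)
  ```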
- When you compile the model to the TensorRT runtime, you can quantize the model to FP16 by setting the parameters `enable_fp16` and `enable_strict_types` in the JSON file or in environment variables, as in the sketch below.
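
  A hedged sketch of an FP16 TensorRT config: only `enable_fp16` and `enable_strict_types` come from the item above; the remaining keys and values are illustrative assumptions.

  ```python
  import json

  # FP16 settings for a TensorRT target.
  tensorrt_config = {
      "serving_type": "tensorrt",          # assumed key: compile for TensorRT
      "input_model": "model.h5",           # assumed key: the prepared model file
      "export_path": "./compiled_models",  # assumed key: output directory
      "max_batch_size": 128,               # assumed key: maximum engine batch size
      "enable_fp16": True,                 # build the engine with FP16 precision
      "enable_strict_types": True,         # ask TensorRT to honour the requested precision
  }

  with open("tensorrt_serving_model.json", "w") as config_file:
      json.dump(tensorrt_config, config_file, indent=2)
  ```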