diff --git a/docs/IE_PLUGIN_DG/ExecutableNetwork.md b/docs/IE_PLUGIN_DG/ExecutableNetwork.md index d9fc8af11abe0a..5f703bcd88091b 100644 --- a/docs/IE_PLUGIN_DG/ExecutableNetwork.md +++ b/docs/IE_PLUGIN_DG/ExecutableNetwork.md @@ -37,7 +37,7 @@ The implementation `CompileNetwork` is fully device-specific. The function accepts a const shared pointer to `ngraph::Function` object and performs the following steps: -1. Applies ngraph passes using `TransformNetwork` function, which defines plugin-specific conversion pipeline. +1. Applies ngraph passes using `TransformNetwork` function, which defines plugin-specific conversion pipeline. To support low precision inference, the pipeline can include Low Precision Transformations. These transformations are usually hardware specific. You can find how to use and configure Low Precisions Transformations in [Low Precision Transformations](@ref openvino_docs_IE_DG_lpt) guide. 2. Maps the transformed graph to a backend specific graph representation (for example, to MKLDNN graph for Intel CPU). 3. Allocates and fills memory for graph weights, backend specific memory handles and so on. diff --git a/docs/IE_PLUGIN_DG/Intro.md b/docs/IE_PLUGIN_DG/Intro.md index 8979d4c74a96b3..5a85573e5432e3 100644 --- a/docs/IE_PLUGIN_DG/Intro.md +++ b/docs/IE_PLUGIN_DG/Intro.md @@ -52,6 +52,7 @@ Detailed guides * [Build](@ref openvino_docs_ie_plugin_dg_plugin_build) a plugin library using CMake\* * Plugin and its components [testing](@ref openvino_docs_ie_plugin_dg_plugin_testing) * [Quantized networks](@ref openvino_docs_ie_plugin_dg_quantized_networks) +* [Low precision transformations](@ref openvino_docs_IE_DG_lpt) guide * [Writing nGraph transformations](@ref ngraph_transformation) guide API References diff --git a/docs/IE_PLUGIN_DG/layout.xml b/docs/IE_PLUGIN_DG/layout.xml index 3dc629d959c13b..bba21ddd206e16 100644 --- a/docs/IE_PLUGIN_DG/layout.xml +++ b/docs/IE_PLUGIN_DG/layout.xml @@ -4,7 +4,78 @@ - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/PluginTransformationPipeline.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/PluginTransformationPipeline.md new file mode 100644 index 00000000000000..7e13077f44f98f --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/PluginTransformationPipeline.md @@ -0,0 +1,17 @@ +# Plugin Transformation Pipeline {#openvino_docs_IE_DG_plugin_transformation_pipeline} + +@sphinxdirective + +.. toctree:: + :maxdepth: 1 + :caption: Executable Network + :hidden: + + Low Precision Transformations + +@endsphinxdirective + +Typical plugin transformation pipeline includes steps: + 1. Common transformations + 2. [Low precision transformations](@ref openvino_docs_IE_DG_lpt) + 3. 
Plugin specific transformations \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/avg_pool_precision_preserved.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/avg_pool_precision_preserved.md new file mode 100644 index 00000000000000..30f7411cbd9772 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/avg_pool_precision_preserved.md @@ -0,0 +1,11 @@ +# AvgPoolPrecisionPreserved attribute {#openvino_docs_IE_DG_lpt_AvgPoolPrecisionPreserved} + +ngraph::AvgPoolPrecisionPreservedAttribute class represents the `AvgPoolPrecisionPreserved` attribute. + +Utility attribute, which is used only during `AvgPool` operation, precision preserved property definition. + +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation | +| Properties | value (boolean) | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/intervals_alignment.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/intervals_alignment.md new file mode 100644 index 00000000000000..b977fd4a325fe2 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/intervals_alignment.md @@ -0,0 +1,11 @@ +# IntervalsAlignment attribute {#openvino_docs_IE_DG_lpt_IntervalsAlignment} + +ngraph::IntervalsAlignmentAttribute class represents the `IntervalsAlignment` attribute. + +The attribute defines a subgraph with the same quantization intervals alignment. `FakeQuantize` operations are included. The attribute is used by quantization operations. + +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation | +| Properties | combined interval, minimal interval, minimal levels, preferable precisions | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/per_tensor_quantization.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/per_tensor_quantization.md new file mode 100644 index 00000000000000..03a8a6721779bf --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/per_tensor_quantization.md @@ -0,0 +1,11 @@ +# PerTensorQuantization attribute {#openvino_docs_IE_DG_lpt_PerTensorQuantization} + +ngraph::PerTensorQuantizationAttribute class represents the `PerTensorQuantization` attribute. + +The attribute defines if the operation input port requires per-tensor quantization. 
+ +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation, input ports | +| Properties | | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precision_preserved.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precision_preserved.md new file mode 100644 index 00000000000000..cf75ecc61c6bed --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precision_preserved.md @@ -0,0 +1,11 @@ +# PrecisionPreserved attribute {#openvino_docs_IE_DG_lpt_PrecisionPreserved} + +ngraph::PrecisionPreservedAttribute class represents the `PrecisionPreserved` attribute. + +The attribute defines a precision preserved operation. If the attribute is absent, then an operation is not precision preserved. + +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation | +| Properties | value (boolean) | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precisions.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precisions.md new file mode 100644 index 00000000000000..0b0c27a4801b20 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/precisions.md @@ -0,0 +1,11 @@ +# Precisions attribute {#openvino_docs_IE_DG_lpt_Precisions} + +ngraph::PrecisionsAttribute class represents the `Precisions` attribute. + +The attribute defines precision which is required for input/output port or an operation. + +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation, input port, output port | +| Properties | precisions | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/quantization_alignment.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/quantization_alignment.md new file mode 100644 index 00000000000000..66747a63ecdea0 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/attributes/quantization_alignment.md @@ -0,0 +1,11 @@ +# QuantizationAlignment attribute {#openvino_docs_IE_DG_lpt_QuantizationAlignment} + +ngraph::QuantizationAlignmentAttribute class represents the `QuantizationAlignment` attribute. + +The attribute defines a subgraph with the same quantization alignment. `FakeQuantize` operations are not included. The attribute is used by quantization operations. 
+ +| Property name | Values | +|---------------|----------------------------------------------| +| Required | Yes | +| Defined | Operation | +| Properties | value (boolean) | \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.png new file mode 100644 index 00000000000000..749a83c1015f29 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3ee64e2c942110b8dbbc7cb3d200ed7061da6a12a55c0f379378e31db9ae2180 +size 366513 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.svg new file mode 100644 index 00000000000000..9292ce92a6a052 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/low_precision_transformation_pipeline.svg @@ -0,0 +1 @@ +Step 1PrerequisitesStep 2Markup transformationsStep 3Main transformationsStep 4Cleanup transformationsPullReshapeThroughDequantizationPullTransposeThroughDequantizationngraph::pass::LinOpSequenceFusionMarkupCanBeQuantizedMarkupPrecisionsMarkupPerTensorQuantizationMarkupAvgPoolPrecisionPreservedPropagatePrecisionsAlignQuantizationInttervalsAlignQuantizationParametersAddTransformationAvgPoolTransformationClampTransformationConcatTransformationConvolutionTransformationConvolutionBackpropDataTransformationDepthToSpaceTransformationFakeQuantizeDecompositionTransformationFakeQuantizeTransformationInterpolateTransformationGroupConvolutionTransformationMatMulTransformationMaxPoolTransformationMultiplyTransformationMVNTransformationNormalizeL2TransformationPReluTransformationReduceMaxTransformationReduceMeanTransformationReduceMinTransformationReduceSumTransformationReluTransformationReshapeTransformationSqueezeTransformationShuffleChannelsTransformationSplitTransformationStridedSliceTransformationTransposeTransformationUnsqueezeTransformationVariadicSplitTransformationFoldConvertTransformationFuseConvertTransformationFuseSubtractToFakeQuantizeTransformationFuseMultiplyToFakeQuantizeTransformationMultiplyToGroupConvolutionTransformationSubtractMultiplyToMultiplyAddTransformationFoldFakeQuantizeTransformation \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.png new file mode 100644 index 00000000000000..37d7e97184a454 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b1d9a68912b2dde17c731ed31b090077e6812a84231544ce3d212c0e02b13dfb +size 204085 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.svg new file mode 100644 index 
00000000000000..af34cbfa239e1e --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.common.svg @@ -0,0 +1 @@ +FP32 Convolution with quantized weightsFakeQuantizeFakeQuantizelevels: 256{f32} {1, 3, 299, 299}Parameter{f32} {1, 3, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]ResultConstant{i8} {6, 3, 1, 1}Dequantization on weightsMultiply{f32} {6, 3, 1, 1}Convert{f32} {6, 3, 1, 1}Constant{f32} {6, 1, 1, 1}Subtract{f32} {6, 3, 1, 1}Constant{i8} {6, 1, 1, 1}Convert{f32} {6, 1, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.png new file mode 100644 index 00000000000000..07fb2213a9076e --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:79b2fd14f9ff7655e4a5abe7e71748e153a095fe1f5eb07c168f53cb12fbb406 +size 216703 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.svg new file mode 100644 index 00000000000000..f1f18e7b94ee97 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_and_convolution.transformed.svg @@ -0,0 +1 @@ +DequantizationINT8 Convolution with zero pointQuantizationFakeQuantizelevels: 256{u8} {1, 3, 299, 299}Parameter{f32} {1, 3, 299, 299}Subtract{f32} {1, 3, 299, 299}Multiply{f32} {1, 6, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{f32} {1, 1, 1, 1}value:[-12.8]Constant{f32} {1, 1, 1, 1}value:[12.7]Constant{f32} {1, 1, 1, 1}value:[0]Constant{f32} {1, 1, 1, 1}value:[255]Constant{i8} {6, 3, 1, 1}Constant{u8} {}Constant{f32} {1, 6, 1, 1}ResultSubtract{f32} {1, 3, 299, 299}Constant{i8} {6, 1, 1, 1}Zero point on activationsZero point on weights \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.png new file mode 100644 index 00000000000000..e12e47a748b30f --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4d3e9a9eddfdcd50eedb035c500848b982b9317ba23f28809a831bbe66300bec +size 167226 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.svg new file mode 100644 index 00000000000000..0505b70097fa83 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_fq_fq_and_convolution.common.svg @@ -0,0 +1 @@ +FP32 ConvolutionFakeQuantizeFakeQuantizelevels: 256{f32} {1, 
3, 299, 299}Parameter{f32} {1, 3, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]ResultFakeQuantizelevels: 255{f32} {6, 3, 299, 299}Constant{i8} {6, 3, 1, 1}Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7] \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.png new file mode 100644 index 00000000000000..e70b6f920e825c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ec31aa62c0e1da3caf1531f2d92270f321857aca3044445ec242f33ee224f91b +size 297353 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.svg new file mode 100644 index 00000000000000..76ac5325a4f8eb --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/img/model_qdq_and_convolution.common.svg @@ -0,0 +1 @@ +DequantizationFP32 Convolution with quantized weightsQuantizationFakeQuantizelevels: 256{f32} {1, 3, 299, 299}Parameter{f32} {1, 3, 299, 299}Multiply{f32} {1, 3, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{f32} {1, 1, 1, 1}Value:[-12.8]Constant{f32} {1, 1, 1, 1}Value:[12.7]Constant{f32} {1, 1, 1, 1}Value:[0]Constant{f32} {1, 1, 1, 1}Value:[255]Constant{f32} {}ResultConvert{f32} {1, 3, 299, 299}Convert{u8} {1, 3, 299, 299}Subtract{f32} {1, 3, 299, 299}Constant{u8} {}Convert{f32} {}Dequantization on weightsConstant{i8} {6, 3, 1, 1}Multiply{f32} {6, 3, 1, 1}Convert{f32} {6, 3, 1, 1}Constant{f32} {6, 1, 1, 1}Subtract{f32} {6, 3, 1, 1}Constant{i8} {6, 1, 1, 1}Convert{f32} {6, 1, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt.md new file mode 100644 index 00000000000000..0267e801004a48 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt.md @@ -0,0 +1,319 @@ +# OpenVINOâ„¢ Low Precision Transformations {#openvino_docs_IE_DG_lpt} + +@sphinxdirective + +.. toctree:: + :maxdepth: 1 + :caption: Low Precision Transformations + :hidden: + + Low Precision Transformations + + Attributes + Step 1. Prerequisites transformations + Step 2. Markup transformations + Step 3. Main transformations + Step 4. Cleanup transformations + +@endsphinxdirective + +## Introduction +Low precision transformations (known as LPT) are a set of nGraph transformations, which are combined in one library. The library is mandatory part of OpenVINO to infer quantized model in low precision with the maximum performance on Intel CPU, GPU and ARM platforms. The library includes more than 45 transformations and supports more then 30 operations. Some transformations are mandatory, some of them are optional and developed for specific device. 
+ +The goal of Low Precision Transformations (LPT) is to transform a quantized model from its original precision (FP16 or FP32) to a low precision (INT8: `signed int8` or `unsigned int8`), so that it is prepared for low precision inference in an OpenVINO™ plugin. This is achieved by two main principles: +1. `FakeQuantize` operation decomposition into two parts: + - part #1: a quantize operation - a new `FakeQuantize` operation with output quantization intervals in the low precision range (signed int8: [-128, 127] or [-127, 127], unsigned int8: [0, 255] or [0, 256]) and with low precision output (`signed int8` or `unsigned int8`), + - part #2: dequantization operations with low precision input and original precision output. +2. Propagation of the dequantization operations through the original model's operations. This is done to avoid dequantization operations before the original model operations, so that the quantize operations with low precision output remain before the original model operations. + +As a result, operation input tensor precisions are changed from the original precision to low precision, and operations can be inferred by the OpenVINO™ plugin in low precision. + +For a more detailed description of how to quantize a model, see the [Low precision tools](#low-precision-tools) section below. For more information about model quantization, refer to the **Brief History of Lower Precision in Deep Learning** section in [this whitepaper](https://software.intel.com/en-us/articles/lower-numerical-precision-deep-learning-inference-and-training). + +## Input model requirements + +LPT transformations propagate dequantization operations through the following operations: +* [Add-1](@ref openvino_docs_ops_arithmetic_Add_1) +* [AvgPool-1](@ref openvino_docs_ops_pooling_AvgPool_1) +* [Clamp-1](@ref openvino_docs_ops_activation_Clamp_1) +* [Concat-1](@ref openvino_docs_ops_movement_Concat_1) +* [Convolution-1](@ref openvino_docs_ops_convolution_Convolution_1) +* [ConvolutionBackpropData-1](@ref openvino_docs_ops_convolution_ConvolutionBackpropData_1) +* [DepthToSpace-1](@ref openvino_docs_ops_movement_DepthToSpace_1) +* [FakeQuantize-1](@ref openvino_docs_ops_quantization_FakeQuantize_1) +* [GroupConvolution-1](@ref openvino_docs_ops_convolution_GroupConvolution_1) +* [Interpolate-1](@ref openvino_docs_ops_image_Interpolate_1) +* [Interpolate-4](@ref openvino_docs_ops_image_Interpolate_4) +* [MatMul-1](@ref openvino_docs_ops_matrix_MatMul_1) +* [MaxPool-1](@ref openvino_docs_ops_pooling_MaxPool_1) +* [Multiply-1](@ref openvino_docs_ops_arithmetic_Multiply_1) +* [MVN-1](@ref openvino_docs_ops_normalization_MVN_1) +* [NormalizeL2-1](@ref openvino_docs_ops_normalization_NormalizeL2_1) +* [PRelu-1](@ref openvino_docs_ops_activation_PReLU_1) +* [ReduceMax-1](@ref openvino_docs_ops_reduction_ReduceMax_1) +* [ReduceMean-1](@ref openvino_docs_ops_reduction_ReduceMean_1) +* [ReduceMin-1](@ref openvino_docs_ops_reduction_ReduceMin_1) +* [ReduceSum-1](@ref openvino_docs_ops_reduction_ReduceSum_1) +* [Relu-1](@ref openvino_docs_ops_activation_ReLU_1) +* [Reshape-1](@ref openvino_docs_ops_shape_Reshape_1) +* [Split-1](@ref openvino_docs_ops_movement_Split_1) +* [Squeeze-1](@ref openvino_docs_ops_shape_Squeeze_1) +* [StridedSlice-1](@ref openvino_docs_ops_movement_StridedSlice_1) +* [Transpose-1](@ref openvino_docs_ops_movement_Transpose_1) +* [Unsqueeze-1](@ref openvino_docs_ops_shape_Unsqueeze_1) +* [VariadicSplit-1](@ref openvino_docs_ops_movement_VariadicSplit_1) + +If an operation is not supported by LPT, the dequantization operation will not be
propagated, the input tensor precisions will not be changed to low precision, and the operation will be executed in the original precision. + +For example, if you would like to infer a model with a `Convolution` operation in low precision, the model can look as in the picture below: + +![Quantized Convolution](img/model_fq_and_convolution.common.png) + +> There are several supported quantization approaches on activations and on weights. All supported approaches are described in the [Quantization approaches](#quantization-approaches) section below. The demonstrated model uses the [FakeQuantize operation quantization](#fakequantize-operation) approach. + +### Low precision tools +There are two tools to quantize a model: +1. [Post-Training Optimization Toolkit](@ref pot_docs_LowPrecisionOptimizationGuide) (POT) +2. [Neural Network Compression Framework](https://github.com/openvinotoolkit/nncf) (NNCF) + +Additionally, low precision transformations can handle ONNX quantized models. + +## Quantization approaches +LPT transformations support two quantization approaches: +1. `FakeQuantize` operation, +2. Quantize and dequantization operations + +Let's explore both approaches in detail on the `Convolution` operation. +### FakeQuantize operation +In this case, a `FakeQuantize` operation is used on activations and a quantized constant on weights. Original input model: + +![Original model with FakeQuantize](img/model_fq_and_convolution.common.png) + +### Quantize and dequantization operations +In this case, `FakeQuantize` and `Convert` operations are used as the quantize operation and return a quantized low precision tensor. After the quantize operation on activations, there are `Convert` and dequantization operations to compensate for the decomposition. Original input model: + +![Original model with Q/DQ](img/model_qdq_and_convolution.common.png) + +In both cases the result is the same. In the LPT result model, you can see that: +1. if necessary, `FakeQuantize` operations on activations were decomposed into two parts: + - a new `FakeQuantize` operation with updated output intervals in the low precision range and low precision output, + - dequantization operations on activations; +2. if necessary, an existing `FakeQuantize` decomposition can be reworked to get better precision; +3. dequantization operations were propagated through `Convolution`. + +LPT result model: + +![Result model](img/model_fq_and_convolution.transformed.png) + +### Low precision transformations pipeline +The LPT transformation pipeline has several steps. For each transformation inside one step, the pattern matcher is unique per transformation, but each operation can be assigned to several transformations. + +![Low precision transformations pipeline](img/low_precision_transformation_pipeline.png) + +Inside each step, LPT transformations handle the input model operation by operation, applying the matching pattern of each transformation from the step to an operation and executing the transformation if the pattern is matched. The decomposition transformation decomposes `FakeQuantize` into quantize and dequantization operations. The dequantization operations from the previous transformation result are used for the current one, and so on, until the end of the model is reached. + +As a result, usually all operations are inferred by the plugin in low precision. If the plugin doesn't support inference of an operation in low precision, the corresponding LPT transformation can be disabled, and the input tensor precisions for the operation will not be changed. In this case, the operation is inferred in the original precision.
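Since the whole pipeline is a set of nGraph passes, a plugin runs it through a regular `ngraph::pass::Manager`. The sketch below only illustrates this idea and is not the verbatim plugin code: helper names such as `OperationPrecisionRestriction::create`, `OperationPerTensorQuantizationRestriction::create`, and the pass-config based disabling of `MultiplyToGroupConvolutionTransformation` are assumptions that match the LPT API of this OpenVINO version and may differ in other releases; the authoritative wiring is in the `lpt_mkldnn_plugin.cpp` snippets referenced later in this guide.

```cpp
#include <ngraph/opsets/opset1.hpp>
#include <ngraph/pass/manager.hpp>
#include <low_precision/low_precision.hpp>
#include <low_precision/multiply_to_group_convolution.hpp>

void runLptPipeline(const std::shared_ptr<ngraph::Function>& nGraphFunc) {
    using namespace ngraph::pass::low_precision;

    // Per-port precision restrictions: u8 activations and i8 weights for Convolution.
    const auto supportedPrecisions = std::vector<OperationPrecisionRestriction>({
        OperationPrecisionRestriction::create<ngraph::opset1::Convolution>({
            {0, {ngraph::element::u8}},
            {1, {ngraph::element::i8}},
        }),
    });

    // Operations the device supports with per-tensor quantization only (input port 0).
    const auto perTensorQuantization = std::vector<OperationPerTensorQuantizationRestriction>({
        OperationPerTensorQuantizationRestriction::create<ngraph::opset1::Convolution>({0}),
    });

    ngraph::pass::Manager lptManager;
    lptManager.register_pass<LowPrecision>(supportedPrecisions, perTensorQuantization);
    // If the device cannot infer some operation in low precision, the corresponding
    // transformation can be switched off, so its input precisions stay unchanged.
    lptManager.get_pass_config()->disable<MultiplyToGroupConvolutionTransformation>();
    lptManager.run_passes(nGraphFunc);
}
```

The restriction arguments are covered in more detail in the Customization section at the end of this guide.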
+ +The low precision transformations pipeline includes four steps: +* [Step #1: Prerequisites](@ref openvino_docs_IE_DG_lpt_step1_prerequisites) +* [Step #2: Markup transformations](@ref openvino_docs_IE_DG_lpt_step2_markup) +* [Step #3: Main transformations](@ref openvino_docs_IE_DG_lpt_step3_main) +* [Step #4: Cleanup transformations](@ref openvino_docs_IE_DG_lpt_step4_cleanup) + +### Step 1. Prerequisites +This step fuses and propagates some operations in the model to prepare for the next step. It is required for OpenVINO plugins. Transformations: +* [PullReshapeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullReshapeThroughDequantization) +* [PullTransposeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullTransposeThroughDequantization) +* [LinOpSequenceFusion](@ref openvino_docs_IE_DG_lpt_LinOpSequenceFusion) + +The model is changed at this step. For more details, see the [Prerequisites transformations](@ref openvino_docs_IE_DG_lpt_step1_prerequisites) developer guide. + +### Step 2. Markup +This step creates runtime attributes for operations. These attributes are used in the next step. Transformations: +* [MarkupCanBeQuantized](@ref openvino_docs_IE_DG_lpt_MarkupCanBeQuantized) +* [MarkupPrecisions](@ref openvino_docs_IE_DG_lpt_MarkupPrecisions) +* [MarkupPerTensorQuantization](@ref openvino_docs_IE_DG_lpt_MarkupPerTensorQuantization) +* [MarkupAvgPoolPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_MarkupAvgPoolPrecisionPreserved) +* [PropagatePrecisions](@ref openvino_docs_IE_DG_lpt_PropagatePrecisions) +* [AlignQuantizationIntervals](@ref openvino_docs_IE_DG_lpt_AlignQuantizationIntervals) +* [AlignQuantizationParameters](@ref openvino_docs_IE_DG_lpt_AlignQuantizationParameters) + +The model is changed at this step: only new attributes are added to some operations. For more details, see the [Markup transformations](@ref openvino_docs_IE_DG_lpt_step2_markup) developer guide. + +### Step 3. Main transformations, FakeQuantize decomposition and dequantization operations handling +This step has the most transformations. The transformations can be separated into two groups: decomposition transformations and dequantization operations handling transformations. For more details, see the [Main transformations](@ref openvino_docs_IE_DG_lpt_step3_main) developer guide.
Transformations: +* [AddTransformation](@ref openvino_docs_IE_DG_lpt_AddTransformation) +* [AvgPoolTransformation](@ref openvino_docs_IE_DG_lpt_AvgPoolTransformation) +* [ClampTransformation](@ref openvino_docs_IE_DG_lpt_ClampTransformation) +* [ConcatTransformation](@ref openvino_docs_IE_DG_lpt_ConcatTransformation) +* [ConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionTransformation) +* [ConvolutionBackpropDataTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionBackpropDataTransformation) +* [DepthToSpaceTransformation](@ref openvino_docs_IE_DG_lpt_DepthToSpaceTransformation) +* [FakeQuantizeDecompositionTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeDecompositionTransformation) +* [FakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeTransformation) +* [InterpolateTransformation](@ref openvino_docs_IE_DG_lpt_InterpolateTransformation) +* [GroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_GroupConvolutionTransformation) +* [MatMulTransformation](@ref openvino_docs_IE_DG_lpt_MatMulTransformation) +* [MaxPoolTransformation](@ref openvino_docs_IE_DG_lpt_MaxPoolTransformation) +* [MultiplyTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyTransformation) +* [MVNTransformation](@ref openvino_docs_IE_DG_lpt_MVNTransformation) +* [NormalizeL2Transformation](@ref openvino_docs_IE_DG_lpt_NormalizeL2Transformation) +* [PReluTransformation](@ref openvino_docs_IE_DG_lpt_PReluTransformation) +* [ReduceMaxTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMaxTransformation) +* [ReduceMeanTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMeanTransformation) +* [ReduceMinTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMinTransformation) +* [ReduceSumTransformation](@ref openvino_docs_IE_DG_lpt_ReduceSumTransformation) +* [ReluTransformation](@ref openvino_docs_IE_DG_lpt_ReluTransformation) +* [ReshapeTransformation](@ref openvino_docs_IE_DG_lpt_ReshapeTransformation) +* [SqueezeTransformation](@ref openvino_docs_IE_DG_lpt_SqueezeTransformation) +* [ShuffleChannelsTransformation](@ref openvino_docs_IE_DG_lpt_ShuffleChannelsTransformation) +* [SplitTransformation](@ref openvino_docs_IE_DG_lpt_SplitTransformation) +* [StridedSliceTransformation](@ref openvino_docs_IE_DG_lpt_StridedSliceTransformation) +* [TransposeTransformation](@ref openvino_docs_IE_DG_lpt_TransposeTransformation) +* [UnsqueezeTransformation](@ref openvino_docs_IE_DG_lpt_UnsqueezeTransformation) +* [VariadicSplitTransformation](@ref openvino_docs_IE_DG_lpt_VariadicSplitTransformation) + +#### Decomposition transformations +Decomposition transformations decompose the `FakeQuantize` operation into two parts: a quantize operation (`FakeQuantize` with low precision output) and dequantization operations (the opposite of quantize, with low precision input and the original precision output). For dequantization, LPT uses three operations: `Convert`, `Subtract`, and `Multiply`. The element-wise operations `Subtract` and `Multiply` have constants on their second inputs. If the dequantization operations are not handled at the end of the LPT pipeline, they are fused back into the `FakeQuantize`.
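To make this structure concrete, the following minimal nGraph sketch (illustrative only, not LPT source code) builds the dequantization chain on `i8` weights that the figures below show: a `Convert` back to the original precision, followed by `Subtract` (zero point) and `Multiply` (scale), each with a constant on its second input. The shapes and constant values are made up for the example.

```cpp
#include <ngraph/opsets/opset1.hpp>

// Illustrative only: the dequantization subgraph that LPT produces on i8 weights.
std::shared_ptr<ngraph::Node> make_dequantized_weights() {
    using namespace ngraph;
    // Quantized weights {6, 3, 1, 1} stored in int8.
    auto weights = opset1::Constant::create(element::i8, Shape{6, 3, 1, 1}, std::vector<int8_t>(18, 1));
    // Convert: low precision input, original precision output.
    auto convert = std::make_shared<opset1::Convert>(weights, element::f32);
    // Subtract the per-output-channel zero point (constant on the second input).
    auto zero_point = opset1::Constant::create(element::f32, Shape{6, 1, 1, 1}, std::vector<float>(6, 0.f));
    auto subtract = std::make_shared<opset1::Subtract>(convert, zero_point);
    // Multiply by the per-output-channel scale (constant on the second input).
    auto scale = opset1::Constant::create(element::f32, Shape{6, 1, 1, 1}, std::vector<float>(6, 0.1f));
    return std::make_shared<opset1::Multiply>(subtract, scale);
}
```

In a real model, the zero points and scales are recovered from the original `FakeQuantize` intervals rather than hard-coded.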
+ + +Original `FakeQuantize`: +![FakeQuantize operation before LPT](quantization/img/fq.common.png) + + +`FakeQuantize` after decomposition into quantization and dequantization operations: +![FakeQuantize operation after LPT](quantization/img/fq.transformed.png) + + +#### Dequantization operations handling transformations + +In this step, LPT transformations fuse dequantization operations or move them through existing model operations as much as possible. + +Original `Convolution` operation in FP32 with dequantization operations before it: +![Convolution operation before LPT](img/model_fq_and_convolution.common.png) + +`Convolution` operation in INT8 after decomposition and dequantization operations handling: +![Convolution operation after LPT](img/model_fq_and_convolution.transformed.png) + +### Step 4: Cleanup of the result model +LPT cleanup transformations are the final stage of the LPT pipeline. In this step, LPT transformations clean up the result model to avoid dequantization operations that were not handled: they fuse dequantization operations into other model operations where possible, and fuse at least the `Convert` operations if not. Transformations: +* [FoldConvertTransformation](@ref openvino_docs_IE_DG_lpt_FoldConvertTransformation) +* [FoldFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FoldFakeQuantizeTransformation) +* [FuseConvertTransformation](@ref openvino_docs_IE_DG_lpt_FuseConvertTransformation) +* [FuseMultiplyToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseMultiplyToFakeQuantizeTransformation) +* [FuseSubtractToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseSubtractToFakeQuantizeTransformation) +* [MultiplyToGroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyToGroupConvolutionTransformation) + +For more details, see the [Cleanup transformations](@ref openvino_docs_IE_DG_lpt_step4_cleanup) developer guide. + +`FakeQuantize` operation with dequantization operations that were not handled: +![FakeQuantize operation with dequantization operations before LPT](quantization/img/fq.transformed.png) + +`FakeQuantize` operation with fused dequantization operations: +![FakeQuantize operation with fused operations after LPT](quantization/img/fq.common.png) + + + +## Low precision transformations in plugin transformation pipeline +A typical transformation pipeline is described below. + +### Step 1. Common optimizations +This step is optional for LPT but is typically present in OpenVINO™ plugins. The step doesn't use any LPT transformation. First, it disables constant folding of dequantization operations on the constant subgraph on weights to prevent the loss of dequantization information in the next plugin transformations. After that, it optimizes the nGraph function and converts operations to operation set 1. Typically, using this step is the simplest way to meet the LPT requirements for the input quantized model. If the plugin can guarantee that the LPT input requirements are met, this step can be skipped. + +@snippet snippets/lpt_mkldnn_plugin.cpp lpt_common + +### Step 2. Low precision transformations execution +This step is mandatory. It configures and runs LPT transformations. + +@snippet snippets/lpt_mkldnn_plugin.cpp lpt_execution + +### Step 3. Plugin-specific transformations +This step is optional. It modifies the nGraph function to a device-specific operation set.
+ +@snippet snippets/lpt_mkldnn_plugin.cpp lpt_device + +## Result model overview + +Let's explore the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use the [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from the [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo): +```sh +./downloader.py --name resnet-50-tf --precisions FP16-INT8 +``` +After that, quantize the model with the [Model Quantizer](@ref omz_tools_downloader) tool. +```sh +./quantizer.py --model_dir public/resnet-50-tf --dataset_dir --precisions=FP16-INT8 +``` + +### Inference + +The simplest way to infer the model and collect performance counters is the [Benchmark Application](../../../../samples/cpp/benchmark_app/README.md). +```sh +./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir +``` +If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except the last, non-quantized SoftMax) are executed in INT8 precision. + +### Results analysis + +The resulting model depends on several factors: +* Whether the original model can be quantized, and the quantization quality. For some models, some operations cannot be quantized by the POT and NNCF tools. In this case, `FakeQuantize` operations are absent before these operations, and they will be inferred in the original precision. +* LPT customization and plugin-supported operations. If the plugin doesn't support INT8 inference for some operation, the corresponding LPT transformation should be disabled, and the operation will be inferred in the original precision. + + +Information about layer precision is stored in the performance counters that are +available from the Inference Engine API. For example, part of the performance counters table for the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model inference on the CPU plugin looks as follows: + + +| layerName | execStatus | layerType | execType | realTime (ms) | cpuTime (ms) | +| --------------------------------------------------------- | ---------- | ------------ | -------------------- | ------------- | ------------ | +| resnet\_model/batch\_normalization\_15/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_1x1\_I8 | 0.377 | 0.377 | +| resnet\_model/conv2d\_16/Conv2D/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/batch\_normalization\_16/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_I8 | 0.499 | 0.499 | +| resnet\_model/conv2d\_17/Conv2D/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/batch\_normalization\_17/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_1x1\_I8 | 0.399 | 0.399 | +| resnet\_model/add\_4/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/add\_4 | NOT\_RUN | Eltwise | undef | 0 | 0 | +| resnet\_model/add\_5/fq\_input\_1 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | + + +> The `execStatus` column of the table includes possible values: +> - `EXECUTED` - the layer was executed by a standalone primitive, +> - `NOT_RUN` - the layer was not executed by a standalone primitive or was fused with another operation and executed in another layer's primitive. +> +> The `execType` column of the table includes inference primitives with specific suffixes.
The layers have the following marks: +> * Suffix `I8` for layers that had 8-bit data type input and were computed in 8-bit precision +> * Suffix `FP32` for layers computed in 32-bit precision + +As a result, all operations (except the non-quantized `SoftMax` at the end of the model) in the OpenVINO™ CPU plugin are inferred in low precision. Note that the result model contains `FakeQuantize` operations in FP32, but it is the plugin's responsibility to fuse these operations with the previous operations. The OpenVINO™ CPU plugin achieves maximally optimized inference for all operations by fusing an INT8 `Convolution` with FP32 output with a `FakeQuantize` operation with FP32 input and INT8 output. In this case, the OpenVINO™ CPU plugin uses INT8 and FP32 vectorized instructions but reports one INT8 kernel usage for inference, which is the most optimized for this case. + +## Mixed precision +If an LPT input model operation output has `fp16` precision, the dequantization computations still occur in `fp32` precision. This approach is used to avoid accuracy loss in `fp16` arithmetic computations. Note that the output of the last dequantization operation has `fp16` precision. + +## Customization +Low Precision Transformations are customizable. Built-in customization options: +* operation precision restrictions, +* operation per-tensor quantization restrictions, +* update precisions, +* dequantization precision. + + +### Operation precision restrictions +This option defines the precisions allowed for the operation input ports. The option value is passed as an input argument to the `LowPrecision` constructor. For example: + +@snippet snippets/lpt_mkldnn_plugin.cpp lpt_supported_precisions + +In the provided example, in the result model the `Convolution` operation inputs must have specific precisions: `u8` (unsigned int8) precision on input 0 (activations) and `i8` (signed int8) precision on input 1 (weights). + +### Operation per tensor quantization restrictions +This option defines whether the operation supports per-tensor quantization only. The option value is passed as an input argument to the `LowPrecision` constructor. For example: + +@snippet snippets/lpt_mkldnn_plugin.cpp per_tensor_quantization + +In the provided example, in the result model the `Convolution` operations must have per-tensor quantization on input 0 (activations). + +### Update precisions +This option defines whether each LPT transformation updates precisions. The option value is a boolean and is passed as the `updatePrecisions` member of `LayerTransformation::Params`, which is an input argument of the `LowPrecision` constructor. All transformations are affected. If `true`, low precision transformations update precisions to low precision; if `false`, they do not. Typically, this option is used for plugin debugging. + +### Typical customization use cases + +Plugin-specific customization can be implemented via nGraph transformation callbacks. For example, asymmetric quantization support can be easily customized via the `LayerTransformation::isAsymmetricQuantization` and `WeightableLayerTransformation::isAsymmetricOnWeights` methods in callbacks.
For example: + +@snippet snippets/lpt_mkldnn_plugin.cpp asymmetric_quantization diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt_attributes.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt_attributes.md new file mode 100644 index 00000000000000..ce567c746e717e --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt_attributes.md @@ -0,0 +1,56 @@ +# Attributes {#openvino_docs_IE_DG_lpt_attributes} + +@sphinxdirective + +.. toctree:: + :maxdepth: 1 + :caption: Attributes + :hidden: + + AvgPoolPrecisionPreserved + IntervalsAlignment + PerTensorQuantization + PrecisionPreserved + Precisions + QuantizationAlignment + +@endsphinxdirective + +## Introduction + +| Name | Target | Required | Mutable | +|-------------------------------------------------------------------------------------|------------------------|----------|---------| +| [AvgPoolPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_AvgPoolPrecisionPreserved) | Precision | No | Yes | +| [IntervalsAlignment](@ref openvino_docs_IE_DG_lpt_IntervalsAlignment) | Quantization interval | Yes | Yes | +| [PerTensorQuantization](@ref openvino_docs_IE_DG_lpt_PerTensorQuantization) | Precision | Yes | No | +| [PrecisionPreserved](@ref openvino_docs_IE_DG_lpt_PrecisionPreserved) | Precision | Yes | Yes | +| [Precisions](@ref openvino_docs_IE_DG_lpt_Precisions) | Precision | Yes | Yes | +| [QuantizationAlignment](@ref openvino_docs_IE_DG_lpt_QuantizationAlignment) | Quantization alignment | Yes | Yes | + +> `Target` attribute group defines attribute usage during model transformation for the best performance: +> - `Precision` - the attribute defines the most optimal output port precision. +> - `Quantization interval` - the attribute defines quantization interval. +> - `Quantization alignment` - the attribute defines quantization alignment: per-channel or per-tensor quantization. +> +> `Required` attribute group defines if attribute usage is required to get an optimal model during transformation: +> - `Yes` - the attribute is used by all OpenVINO plugins for low-precision optimization. +> - `No` - the attribute is used in a specific OpenVINO plugin. +> +> `Mutable` attribute group defines if transformation can update an existing attribute: +> - `Yes` - the attribute can be updated by the next transformations in the pipeline. But attribute update order is still important. +> - `No` - existing attribute can not be updated by the next transformation. Previous handled transformation has optimized a model according to the current value. + +`FakeQuantize` decomposition is a mandatory part of low precision transformations. Attributes used during decomposition are mandatory. Optional attributes are required only for certain operations. 
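To check which attributes the markup step attached to a particular node, you can look at the node's runtime info. The sketch below is only an illustration and relies on the generic `ngraph::Node::get_rt_info()` API; the exact rt_info key strings and attribute wrapper types depend on the OpenVINO release, so it simply prints the keys it finds.

```cpp
#include <iostream>
#include <memory>
#include <ngraph/function.hpp>

// Illustrative only: after the markup step, LPT attributes live in each node's runtime info.
void dump_lpt_attributes(const std::shared_ptr<ngraph::Function>& function) {
    for (const auto& node : function->get_ordered_ops()) {
        const auto& rt_info = node->get_rt_info();
        if (rt_info.empty())
            continue;
        std::cout << node->get_friendly_name() << ":";
        for (const auto& entry : rt_info)
            std::cout << " " << entry.first;  // key names correspond to the attributes above
        std::cout << std::endl;
    }
}
```

Running such a dump between the markup step and the main transformations is a convenient way to see how the attributes listed above spread through the model.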
+ +Attributes usage by transformations: + +| Attribute name | Created by transformations | Used by transformations | +|---------------------------|---------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| +| PrecisionPreserved | MarkupPrecisions, MarkupAvgPoolPrecisionPreserved | AlignQuantizationIntervals, AlignQuantizationParameters, FakeQuantizeDecompositionTransformation, MarkupAvgPoolPrecisionPreserved | +| AvgPoolPrecisionPreserved | MarkupAvgPoolPrecisionPreserved | | +| Precisions | MarkupCanBeQuantized, MarkupPrecisions | FakeQuantizeDecompositionTransformation | +| PerTensorQuantization | MarkupPerTensorQuantization | | +| IntervalsAlignment | AlignQuantizationIntervals | FakeQuantizeDecompositionTransformation | +| QuantizationAlignment | AlignQuantizationParameters | FakeQuantizeDecompositionTransformation | + +> **Note:** the same type of attribute instances can be created in different transformations. This approach is the result of the transformation single-responsibility principle. For example, `Precision` attribute instances are created in `MarkupCanBeQuantized` and `MarkupPrecisions` transformations, but the reasons for their creation are different. \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.png new file mode 100644 index 00000000000000..813625f420b01d --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3a79d152dae50fd3afaa78d8e18de7d279bb1c79b3e4d5c68fffed52a7c51b18 +size 383875 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.svg new file mode 100644 index 00000000000000..21359eda17aea3 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup1.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1, 3, 299, 299}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2{f32} {1, 6, 299, 299}AvgPoolname: maxPool{f32} {1, 6, 299, 299}Convolutionname: convolution2{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 
1}Convolutionname: convolution1in0: Precisions {precisions: {}}In1: Precisions {precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.png new file mode 100644 index 00000000000000..a6ac9efadabd36 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d54234622f538249dd5ccb5156cc10dd9b5bb40e800f6d1d906a0ff44ecabcf4 +size 388893 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.svg new file mode 100644 index 00000000000000..d8d323becca2db --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup2.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1, 3, 299, 299}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}AvgPoolname: maxPool{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: Precisions {precisions: {u8}}In1: Precisions {precisions: {i8}}{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1in0: Precisions {precisions: {}}In1: Precisions {precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.png new file mode 100644 index 00000000000000..cdf276757ed5f2 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3132bad01388adf7f788592538194bceb6b94f76f1c3788ffb73b76b19a74990 +size 393300 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.svg new file mode 100644 index 
00000000000000..80f3f0dea20625 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup3.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1, 3, 299, 299}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}AvgPoolname: maxPool{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: PerTensorQuantization, Precisions {precisions: {u8}}in1: Precisions {precisions: {i8}}{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1in0: PerTensorQuantization, Precisions {precisions: {}}in1: Precisions {precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.png new file mode 100644 index 00000000000000..f3164acd1008f6 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4f5a98e0ae8dc1f21dd0458ad9ed61de68b134e1128279c3e8b4e700ff3648f8 +size 398967 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.svg new file mode 100644 index 00000000000000..60ecb5f9673fef --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup4.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1, 3, 299, 299}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 
1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: AvgPoolPrecisionPreserved{value: true}, PrecisionPreserved{value: true}{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: PerTensorQuantization, Precisions{precisions: {u8}}In1: Precisions{precisions: {i8}}{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1in0: PerTensorQuantization, Precisions{precisions: {}}In1: Precisions{precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.png new file mode 100644 index 00000000000000..1523120717621c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2618a80fd1be4d25dfc1f7e57e046a7844c9933a6fed316a0660c3051325557e +size 474998 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.svg new file mode 100644 index 00000000000000..358a3ceb5c6f44 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup5.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1,3,299,299} Precisions {precisions: {u8}}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: PrecisionPreserved{value: true}, Precisions {precisions: {u8}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299} Precisions {precisions: {u8}}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299} Precisions {precisions: {u8}}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: PrecisionPreserved{value: true}, Precisions {precisions: {u8}{f32} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: AvgPoolPrecisionPreserved{value: true}, PrecisionPreserved{value: true}, Precisions {precisions: {u8}}{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: PerTensorQuantization, Precisions {precisions: {u8}}in1: Precisions {precisions: {i8}}{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1in0: PerTensorQuantization, Precisions 
{precisions: {}}in1: Precisions {precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.png new file mode 100644 index 00000000000000..00a33774ce6699 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b7750b3424540912ec590aa5b56cba9e4f2f9db6d45c23aed1d78d094321230 +size 488940 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.svg new file mode 100644 index 00000000000000..c8834585723660 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup6.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1,3,299,299} Precisions{precisions: {u8}}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}, PrecisionPreserved{value: true}, Precisions {precisions: {u8}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299} Precisions{precisions: {u8}}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299} Precisions{precisions: {u8}}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}, PrecisionPreserved{value: true},Precisions{precisions: {u8}{f32} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}},AvgPoolPrecisionPreserved{value: true}, PrecisionPreserved{value: true}, Precisions{precisions: {u8}{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: PerTensorQuantization, Precisions{precisions: {u8}}In1: Precisions {precisions: {i8}}{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1in0: PerTensorQuantization, Precisions{precisions: {}}In1: Precisions {precisions: {}}{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.png 
b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.png new file mode 100644 index 00000000000000..2724d138642391 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7836c25a0db5a5f08adf5539fb5ee29f52bc7923148dc42f4c78d3354b7b8464 +size 520539 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.svg new file mode 100644 index 00000000000000..625792de5c0f94 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup7.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true},Precisions{precisions: {u8}},QuantizationAlignment{value: false}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true},Precisions {precisions: {u8}},QuantizationAlignment{value: true}{f32} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true}Precisions {precisions: {u8}}QuantizationAlignment{value: true}{f32} {1, 6, 299, 299}Convolutionname: convolution2in0: {f32}[1,6,7,7]: PerTensorQuantization, Precisions {precisions: {u8}}in1: {f32}[9,6,1,1]: Precisions {precisions: {i8}}{f32} {1, 6, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {6, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {6, 1, 1, 1}Convert{f32} {6, 1, 1, 1}Convolutionname: convolution1{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.png new file mode 100644 index 00000000000000..3d0a7abe126511 --- /dev/null +++ 
b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:911d9730e6762a9919fe3a48f0c87a44a5aeac97468f2d28c5174c13c69ad74b +size 351583 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.svg new file mode 100644 index 00000000000000..3663d9f8898819 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step2_markup_original.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1levels: 256{f32} {1, 3, 299, 299}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2levels: 256{f32} {1, 3, 299, 299}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3levels: 256{f32} {1, 3, 299, 299}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2{f32} {1, 6, 299, 299}AvgPoolname: maxPool{f32} {1, 6, 299, 299}Convolutionname: convolution2{f32} {1, 9, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {9, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {9, 1, 1, 1}Convert{f32} {9, 1, 1, 1}Convolutionname: convolution1{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.png new file mode 100644 index 00000000000000..7c06e5b0f1f495 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:06caa4dc97b00f150395abc230bc90822f3bfa4e0bb3b65019f111a5a40e1d1c +size 520155 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.svg new file mode 100644 index 00000000000000..69717ff6fce47d --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_original.svg @@ -0,0 +1 @@ +FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]ResultConcatname: concat1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: 
{i8}}PrecisionPreserved{value: true},Precisions{precisions: {u8}},QuantizationAlignment{value: false}{f32} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]FakeQuantizename: fakeQuantize3rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{f32} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [-1.28]Constant{f32} {1, 1, 1, 1}value: [1.27]ResultConcatname: concat2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true},Precisions {precisions: {u8}},QuantizationAlignment{value: true}{f32} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true}Precisions {precisions: {u8}}QuantizationAlignment{value: true}{f32} {1, 6, 299, 299}Convolutionname: convolutionin0: {f32}[1,6,7,7]: PerTensorQuantization, Precisions {precisions: {u8}}in1: {f32}[9,6,1,1]: Precisions {precisions: {i8}}{f32} {1, 6, 299, 299}Constant{i8} {9, 6, 1, 1}Dequantization on weightsMultiply{f32} {9, 6, 1, 1}Convert{f32} {9, 6, 1, 1}Constant{f32} {6, 1, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {6, 1, 1, 1}Convert{f32} {6, 1, 1, 1}Convolutionname: convolution1{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.png new file mode 100644 index 00000000000000..cf65091c10b763 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f19d8f068afa4aa62fc04cfa0d2678e6bfe3f90c164a08f588bff9685854030 +size 661189 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.svg new file mode 100644 index 00000000000000..3b90c028118784 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/img/step3_transformed.svg @@ -0,0 +1 @@ +Dequantizations on branch #2INT8 ConvolutionFakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input1{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]ResultConcatname: concat1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true},Precisions{precisions: {u8}},QuantizationAlignment{value: false}{u8} {1, 6, 299, 299}FakeQuantizename: fakeQuantize2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, 
preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input2{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: [-0.64]Constant{f32} {1, 1, 1, 1}value: [0.635]Constant{f32} {1, 1, 1, 1}value: [64]Constant{f32} {1, 1, 1, 1}value: [192]FakeQuantizename: fakeQuantize3rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Parametername: input3{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value: -1.28]Constant{f32} {1, 1, 1, 1}value: [12.7]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]ResultConcatname: concat2rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true},Precisions {precisions: {u8}},QuantizationAlignment{value: true}{u8} {1, 6, 299, 299}AvgPoolname: maxPoolrt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}PrecisionPreserved{value: true}Precisions {precisions: {u8}}QuantizationAlignment{value: true}{u8} {1, 6, 299, 299}Convolutionname: convolutionin0: {f32}[1,6,7,7]: PerTensorQuantization, Precisions {precisions: {u8}}in1: {f32}[9,6,1,1]: Precisions {precisions: {i8}}{f32} {1, 6, 299, 299}Constant{i8} {9, 6, 1, 1}Convolutionname: convolution1{f32} {1, 9, 299, 299}Constant{f32} {9, 6, 1, 1}Dequantizations on branch #1Multiply{f32} {1, 6, 299, 299}Convert{f32} {1, 6, 299, 299}Constant{f32} {1, 6, 1, 1}Subtract{f32} {1, 6, 299, 299}Subtract{f32} {1, 6, 299, 299}Constant{u8} {1, 6, 1, 1}Constant{f32} {1, 6, 1, 1}Multiply{f32} {1, 6, 299, 299}Constant{f32} {1, 6, 1, 1}Subtract{f32} {9, 6, 1, 1}Constant{i8} {6, 1, 1, 1}Zero point on activationsZero point on weights \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step1_prerequisites.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step1_prerequisites.md new file mode 100644 index 00000000000000..71d082054cd779 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step1_prerequisites.md @@ -0,0 +1,6 @@ +# Step 1. Prerequisites Transformations {#openvino_docs_IE_DG_lpt_step1_prerequisites} + +Prerequisites transformations are optional. The transformations prepare a model before running other low precision transformations. The transformations do not operate with dequantization operations or update precisions. Prerequisites transformations include: +* [PullReshapeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullReshapeThroughDequantization) +* [PullTransposeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullTransposeThroughDequantization) +* [LinOpSequenceFusion](@ref openvino_docs_IE_DG_lpt_LinOpSequenceFusion) \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step2_markup.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step2_markup.md new file mode 100644 index 00000000000000..8d32ffef000faa --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step2_markup.md @@ -0,0 +1,140 @@ +# Step 2. Markup Transformations {#openvino_docs_IE_DG_lpt_step2_markup} + +This step defines the optimal `FakeQuantize` decomposition precisions for the best inference performance via operations markup with runtime attribute instances. 
Attributes are created for input and output ports and operations. Transformations do not change the operation output port precisions. A model markup low precision logic is decomposed and implemented into the following common markup transformations. The order of transformations is important: + +1. [MarkupCanBeQuantized](@ref openvino_docs_IE_DG_lpt_MarkupCanBeQuantized) +2. [MarkupPrecisions](@ref openvino_docs_IE_DG_lpt_MarkupPrecisions) +3. [MarkupPerTensorQuantization](@ref openvino_docs_IE_DG_lpt_MarkupPerTensorQuantization) +4. [MarkupAvgPoolPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_MarkupAvgPoolPrecisionPreserved) +5. [PropagatePrecisions](@ref openvino_docs_IE_DG_lpt_PropagatePrecisions) +6. [AlignQuantizationIntervals](@ref openvino_docs_IE_DG_lpt_AlignQuantizationIntervals) +7. [AlignQuantizationParameters](@ref openvino_docs_IE_DG_lpt_AlignQuantizationParameters) + +The table of transformations and used attributes: + +| Transformation name | Create attributes | Use attributes | +|---------------------------------|-------------------------------|-------------------------------------------| +| MarkupCanBeQuantized | Precisions | | +| MarkupPrecisions | Precisions,PrecisionPreserved | | +| MarkupPerTensorQuantization | PerTensorQuantization | | +| MarkupAvgPoolPrecisionPreserved | AvgPoolPrecisionPreserved | Precisions, PrecisionPreserved | +| PropagatePrecisions | Precisions | Precisions, PrecisionPreserved | +| AlignQuantizationIntervals | IntervalsAlignment | PrecisionPreserved | +| AlignQuantizationParameters | QuantizationAlignment | PrecisionPreserved, PerTensorQuantization | + +> **Note:** the same type of attribute instances can be created in different transformations. This approach is the result of the transformation single-responsibility principle. For example, `Precision` attribute instances are created in `MarkupCanBeQuantized` and `MarkupPrecisions` transformations, but the reasons for their creation are different + +Common markup transformations can be decomposed into simpler utility markup transformations. The order of Markup utility transformations is not important: +* [CreateAttribute](@ref openvino_docs_IE_DG_lpt_CreateAttribute) +* [CreatePrecisionsDependentAttribute](@ref openvino_docs_IE_DG_lpt_CreatePrecisionsDependentAttribute) +* [PropagateThroughPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_PropagateThroughPrecisionPreserved) +* [PropagateToInput](@ref openvino_docs_IE_DG_lpt_PropagateToInput) +* [UpdateSharedPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_UpdateSharedPrecisionPreserved) + +Let's explore all transformations and their relations in detail, using one and the same model: + +![](img/step2_markup_original.png) + +The original model key features: +* The first `concat1` concatenation operation has not quantized `convolution1` consumer. +* The second `concat2` concatenation operation has quantized `convolution2` consumer with requirements: + - support `unsigned int8` on activations, + - per-tensor quantization. +* Between the `concat2` concatenation operation and `Convolution` there is an `AvgPool` operation, which mathematically should return an `f32` tensor. But the `MarkupAvgPoolPrecisionPreserved` transformation is active. This allows the low precision transformation, that goes after the `AvgPool`, to propagate low precision tensor to the next consumer. + +Transformations are run with the following parameters: + +@snippet snippets/lpt_mkldnn_plugin.cpp lpt_markup_pipeline + +## 1. 
MarkupCanBeQuantized +The transformation marks operations that cannot be quantized. No attributes are required before the transformation. + +Changes in the example model after `MarkupCanBeQuantized` transformation: +* Not quantized `convolution1` operation is marked by the `Precisions` attribute with empty values. This attribute allows the next transformation to ignore not quantized operation. + +Result model: + +![MarkupCanBeQuantized](img/step2_markup1.png) + +Model display features (here and below): +* The attributes added by the current transformation are marked in bold. +* If attributes do not fit into one line, then one line consists of only one attribute. + +## 2. MarkupPrecisions +The transformation is required and includes two tasks: +1. Mark operation input ports (create `Precision` attribute instance) by provided restrictions: input port index and required precisions. Restrictions are provided as input argument in `ngraph::pass::low_precision::LowPrecision` constructor. +2. Mark precision preserved operations. + +No attributes are required before the transformation. Changes in the example model after `MarkupPrecisions` transformation: +* Both concatenation operations are marked as precision preserved operations. It allows to propagate precision via these operations. +* Quantized `convolution2` operation is marked by the `Precisions` attribute with `u8` precision on activations and `i8` precisions on weights according to the provided restrictions. This attribute instance allows to specify which precisions are required for quantized `Convolution` operation. + +Result model: + +![MarkupPrecisions result](img/step2_markup2.png) + +## 3. MarkupPerTensorQuantization +The transformation is required and marks operations (create `PerTensorQuantization` attribute instance) by provided restrictions: an operation that requires per-tensor quantization. No attributes are required before the transformation. + +Changes in the example model after `MarkupPerTensorQuantization` transformation: +* both `Convolution` operations are marked by `PerTensorQuantization` + +Result model: + +![MarkupPerTensorQuantization result](img/step2_markup3.png) + +## 4. MarkupAvgPoolPrecisionPreserved +The transformation is optional. `MarkupAvgPoolPrecisionPreserved` marks `AvgPool` operations as precision preserved or not precision preserved. `AvgPool` operation is precision preserved if next not precision preserved operation can be inferred in low precision. In other words, `AvgPool` operations become precision preserved operations to speed up model inference. The transformation uses `PrecisionPreserved` attributes created before. The transformation is combined and uses: +* CreatePrecisionsDependentAttribute +* PropagateThroughPrecisionPreserved +* UpdateSharedPrecisionPreserved + +Changes in the example model after `MarkupAvgPoolPrecisionPreserved` transformation: +* `AvgPool` operations are marked by `PrecisionPreserved` and `AvgPoolPrecisionPreserved` (not used below). + +Result model: + +![MarkupAvgPoolPrecisionPreserved](img/step2_markup4.png) + +## 5. PropagatePrecisions +The transformation is required. `PropagatePrecision` is a key transformation in the markup pipeline, which marks `FakeQuantize` output port precisions. The transformation uses `PrecisionPreserved` attribute instances created before. 
The transformation is combined and uses: + +* CreateAttribute +* PropagateThroughPrecisionPreserved +* PropagateToInput + +Changes in the example model after `PropagatePrecisions` transformation: +* All precision preserved operations are marked by the `Precisions` attribute instance, which defines the required precision for the operation. +* `FakeQuantize` operation output ports are marked by `Precisions` attribute instances, which define target precision for decomposition. In the sample model, `FakeQuantize` operations have signed intervals, but the `Precisions` attributes are initialized by `u8` (`unsigned int8`) values as the result applied during transformations restrictions for `Convolution` operations. + +Result model: + +![PropagatePrecisions](img/step2_markup5.png) + +> **NOTE**: `AlignQuantizationIntervals` and `AlignQuantizationParameters` transformations are required if the model has quantized concatenation operations. + +## 6. AlignQuantizationIntervals +The transformation is required for models with the quantized operation. The transformation marks `FakeQuantize` operation and precision preserved consumers to combine quantization information from different `FakeQuantize` operations for future quantization intervals alignment. The transformation is combined and uses: +* CreateAttribute +* PropagateThroughPrecisionPreserved + +Changes in the example model after `AlignQuantizationIntervals` transformation: +* All `FakeQuantize` operations and their precision preserved consumers are marked by the `IntervalsAlignment` attribute instance. + +Result model: + +![AlignQuantizationIntervals](img/step2_markup6.png) + +## 7. AlignQuantizationParameters +The transformation is required for models with quantized concatenation operation. The transformation marks `FakeQuantize` precision preserved consumers to align quantization intervals. The transformation is combined and uses: +* CreateAttribute +* PropagateThroughPrecisionPreserved +* UpdateSharedPrecisionPreserved + + +Changes in the example model after `AlignQuantizationParameters` transformation: +* All `FakeQuantize` precision preserved consumers are marked by `QuantizationAlignment` attribute instance. `convolution1` input ports are marked by `Precisions` attribute instances with empty precisions collection. As a result, the `convolution1` operation was detected as not quantized, and the `QuantizationAlignment` attribute default value `false` does not change. `convolution2` input ports are marked by `Precisions` attribute instances with not empty precisions collection. `convolution2` operation was detected as quantized with the `PerTensorQuantization` attribute, and the `QuantizationAlignment` attribute default value changed to `true`. + +Final model: + +![AlignQuantizationParameters](img/step2_markup7.png) diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step3_main.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step3_main.md new file mode 100644 index 00000000000000..81a07a82125511 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step3_main.md @@ -0,0 +1,49 @@ +# Step 3. Main Transformations {#openvino_docs_IE_DG_lpt_step3_main} + +Main transformations are the majority of low precision transformations. Transformations operate with dequantization operations. 
Main transformations include:
+* [AddTransformation](@ref openvino_docs_IE_DG_lpt_AddTransformation)
+* [AvgPoolTransformation](@ref openvino_docs_IE_DG_lpt_AvgPoolTransformation)
+* [ClampTransformation](@ref openvino_docs_IE_DG_lpt_ClampTransformation)
+* [ConcatTransformation](@ref openvino_docs_IE_DG_lpt_ConcatTransformation)
+* [ConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionTransformation)
+* [ConvolutionBackpropDataTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionBackpropDataTransformation)
+* [DepthToSpaceTransformation](@ref openvino_docs_IE_DG_lpt_DepthToSpaceTransformation)
+* [FakeQuantizeDecompositionTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeDecompositionTransformation)
+* [FakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeTransformation)
+* [InterpolateTransformation](@ref openvino_docs_IE_DG_lpt_InterpolateTransformation)
+* [GroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_GroupConvolutionTransformation)
+* [MatMulTransformation](@ref openvino_docs_IE_DG_lpt_MatMulTransformation)
+* [MaxPoolTransformation](@ref openvino_docs_IE_DG_lpt_MaxPoolTransformation)
+* [MultiplyTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyTransformation)
+* [MVNTransformation](@ref openvino_docs_IE_DG_lpt_MVNTransformation)
+* [NormalizeL2Transformation](@ref openvino_docs_IE_DG_lpt_NormalizeL2Transformation)
+* [PReluTransformation](@ref openvino_docs_IE_DG_lpt_PReluTransformation)
+* [ReduceMaxTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMaxTransformation)
+* [ReduceMeanTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMeanTransformation)
+* [ReduceMinTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMinTransformation)
+* [ReduceSumTransformation](@ref openvino_docs_IE_DG_lpt_ReduceSumTransformation)
+* [ReluTransformation](@ref openvino_docs_IE_DG_lpt_ReluTransformation)
+* [ReshapeTransformation](@ref openvino_docs_IE_DG_lpt_ReshapeTransformation)
+* [SqueezeTransformation](@ref openvino_docs_IE_DG_lpt_SqueezeTransformation)
+* [ShuffleChannelsTransformation](@ref openvino_docs_IE_DG_lpt_ShuffleChannelsTransformation)
+* [SplitTransformation](@ref openvino_docs_IE_DG_lpt_SplitTransformation)
+* [StridedSliceTransformation](@ref openvino_docs_IE_DG_lpt_StridedSliceTransformation)
+* [TransposeTransformation](@ref openvino_docs_IE_DG_lpt_TransposeTransformation)
+* [UnsqueezeTransformation](@ref openvino_docs_IE_DG_lpt_UnsqueezeTransformation)
+* [VariadicSplitTransformation](@ref openvino_docs_IE_DG_lpt_VariadicSplitTransformation)
+
+Let's explore some main transformations on the example model. Original model:
+
+![Original model](img/step3_original.png)
+
+Result model after main transformations:
+
+![Transformed model](img/step3_transformed.png)
+
+Changes in the example model after the main transformations:
+* All `FakeQuantize` operations (`fakeQuantize1`, `fakeQuantize2` and `fakeQuantize3`) were decomposed:
+ - original `FakeQuantize` operations were replaced with new operations with other output intervals and output port precision,
+ - dequantization operations were added after the new `FakeQuantize` operations.
+* Dequantization operations were moved through the precision preserved (`concat1` and `concat2`) and quantized (`convolution2`) operations.
+
+> **Note:** the left branch (branch #1) does not require per-tensor quantization. As a result, the `fakeQuantize1` output interval is [0, 255]. But quantized `convolution2` requires per-tensor quantization on the right branch (branch #2). 
Then all connected `FakeQuantize` interval operations (`fakeQuantize1` and `fakeQuantize2`) are aligned to have per-tensor quantization after the concatenation (`concat2`) operation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step4_cleanup.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step4_cleanup.md new file mode 100644 index 00000000000000..0b4913273c6207 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step4_cleanup.md @@ -0,0 +1,8 @@ +# Step 4. Cleanup Transformations {#openvino_docs_IE_DG_lpt_step4_cleanup} + +* [FoldConvertTransformation](@ref openvino_docs_IE_DG_lpt_FoldConvertTransformation) +* [FoldFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FoldFakeQuantizeTransformation) +* [FuseConvertTransformation](@ref openvino_docs_IE_DG_lpt_FuseConvertTransformation) +* [FuseMultiplyToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseMultiplyToFakeQuantizeTransformation) +* [FuseSubtractToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseSubtractToFakeQuantizeTransformation) +* [MultiplyToGroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyToGroupConvolutionTransformation) \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.png new file mode 100644 index 00000000000000..7cacc57924a894 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:288dec05908449cc3fa5e07700fac5cbdff17bb4b4035a4ee83c44cbc6c22c70 +size 59664 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.svg new file mode 100644 index 00000000000000..056a1424ba78ad --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.common.svg @@ -0,0 +1 @@ +FakeQuantizeFakeQuantizelevels: 256{f32} {1, 3, 299, 299}Parameter{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value:[-12.8]Constant{f32} {1, 1, 1, 1}value:[12.7]Constant{f32} {1, 1, 1, 1}value:[-12.8]Constant{f32} {1, 1, 1, 1}value:[12.7]Result \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.png new file mode 100644 index 00000000000000..34967b7e05d46b --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e345c0b2b5fe365ed298d40d3add4b06a8106096186f68dccb5131c01194e72 +size 102546 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.svg new file mode 100644 index 00000000000000..2eae59b9572b55 --- /dev/null +++ 
b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/quantization/img/fq.transformed.svg @@ -0,0 +1 @@ +DequantizationQuantizationFakeQuantizelevels: 256{u8} {1, 3, 299, 299}Parameter{f32} {1, 3, 299, 299}Multiply{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}value:[-12.8]Constant{f32} {1, 1, 1, 1}value:[12.7]Constant{f32} {1, 1, 1, 1}value:[0]Constant{f32} {1, 1, 1, 1}value:[255]Constant{f32} {}ResultConvert{f32} {1, 3, 299, 299}Subtract{f32} {1, 3, 299, 299}Constant{f32} {} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/convert_subtract_constant.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/convert_subtract_constant.md new file mode 100644 index 00000000000000..49011c482f5c7c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/convert_subtract_constant.md @@ -0,0 +1,3 @@ +# ConvertSubtractConstant transformation {#openvino_docs_IE_DG_lpt_ConvertSubtractConstant} + +ngraph::pass::low_precision::ConvertSubtractConstant class represents the `ConvertSubtractConstant` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/lin_op_sequence_fusion.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/lin_op_sequence_fusion.md new file mode 100644 index 00000000000000..14e23a6175842c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/lin_op_sequence_fusion.md @@ -0,0 +1,5 @@ +# LinOpSequenceFusion transformation {#openvino_docs_IE_DG_lpt_LinOpSequenceFusion} + +ngraph::pass::LinOpSequenceFusion class represents the `LinOpSequenceFusion` transformation. + +`LinOpSequenceFusion` is common nGraph transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_reshape_through_dequantization.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_reshape_through_dequantization.md new file mode 100644 index 00000000000000..214e8ac99932fe --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_reshape_through_dequantization.md @@ -0,0 +1,3 @@ +# PullReshapeThroughDequantization transformation {#openvino_docs_IE_DG_lpt_PullReshapeThroughDequantization} + +ngraph::pass::low_precision::PullReshapeThroughDequantization class represents the `PullReshapeThroughDequantization` transformation. 
\ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_transpose_through_dequantization.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_transpose_through_dequantization.md new file mode 100644 index 00000000000000..1acd058af162fc --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step1_prerequisites/pull_transpose_through_dequantization.md @@ -0,0 +1,3 @@ +# PullTransposeThroughDequantization transformation {#openvino_docs_IE_DG_lpt_PullTransposeThroughDequantization} + +ngraph::pass::low_precision::PullTransposeThroughDequantization class represents the `PullTransposeThroughDequantization` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_intervals.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_intervals.md new file mode 100644 index 00000000000000..b41afd0b8f352f --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_intervals.md @@ -0,0 +1,3 @@ +# AlignQuantizationIntervals transformation {#openvino_docs_IE_DG_lpt_AlignQuantizationIntervals} + +ngraph::pass::low_precision::AlignQuantizationIntervals class represents the `AlignQuantizationIntervals` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_parameters.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_parameters.md new file mode 100644 index 00000000000000..7477d96dbbf4c5 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/align_quantization_parameters.md @@ -0,0 +1,3 @@ +# AlignQuantizationParameters transformation {#openvino_docs_IE_DG_lpt_AlignQuantizationParameters} + +ngraph::pass::low_precision::AlignQuantizationParameters class represents the `AlignQuantizationParameters` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_attribute.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_attribute.md new file mode 100644 index 00000000000000..118ce14305ea08 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_attribute.md @@ -0,0 +1,3 @@ +# CreateAttribute transformation {#openvino_docs_IE_DG_lpt_CreateAttribute} + +ngraph::pass::low_precision::CreateAttribute class represents the `CreateAttribute` transformation. 
\ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_precisions_dependent_attribute.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_precisions_dependent_attribute.md new file mode 100644 index 00000000000000..c747462e4c9e9a --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/create_precisions_dependent_attribute.md @@ -0,0 +1,3 @@ +# CreatePrecisionsDependentAttribute transformation {#openvino_docs_IE_DG_lpt_CreatePrecisionsDependentAttribute} + +ngraph::pass::low_precision::CreatePrecisionsDependentAttribute class represents the `CreatePrecisionsDependentAttribute` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_avg_pool_precision_preserved.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_avg_pool_precision_preserved.md new file mode 100644 index 00000000000000..4d9a97ffc47cd4 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_avg_pool_precision_preserved.md @@ -0,0 +1,3 @@ +# MarkupAvgPoolPrecisionPreserved transformation {#openvino_docs_IE_DG_lpt_MarkupAvgPoolPrecisionPreserved} + +ngraph::pass::low_precision::MarkupAvgPoolPrecisionPreserved class represents the `MarkupAvgPoolPrecisionPreserved` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_can_be_quantized.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_can_be_quantized.md new file mode 100644 index 00000000000000..1bd149e332a163 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_can_be_quantized.md @@ -0,0 +1,3 @@ +# MarkupCanBeQuantized transformation {#openvino_docs_IE_DG_lpt_MarkupCanBeQuantized} + +ngraph::pass::low_precision::MarkupCanBeQuantized class represents the `MarkupCanBeQuantized` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_per_tensor_quantization.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_per_tensor_quantization.md new file mode 100644 index 00000000000000..d915ef73183111 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_per_tensor_quantization.md @@ -0,0 +1,3 @@ +# MarkupPerTensorQuantization transformation {#openvino_docs_IE_DG_lpt_MarkupPerTensorQuantization} + +ngraph::pass::low_precision::MarkupPerTensorQuantization class represents the `MarkupPerTensorQuantization` transformation. 
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_precisions.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_precisions.md new file mode 100644 index 00000000000000..673a8932529384 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/markup_precisions.md @@ -0,0 +1,3 @@ +# MarkupPrecisions transformation {#openvino_docs_IE_DG_lpt_MarkupPrecisions} + +ngraph::pass::low_precision::MarkupPrecisions class represents the `MarkupPrecisions` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_precisions.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_precisions.md new file mode 100644 index 00000000000000..50dcc23ce96e70 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_precisions.md @@ -0,0 +1,3 @@ +# PropagatePrecisions transformation {#openvino_docs_IE_DG_lpt_PropagatePrecisions} + +ngraph::pass::low_precision::PropagatePrecisions class represents the `PropagatePrecisions` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_shared_value.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_shared_value.md new file mode 100644 index 00000000000000..e7f93dd64f0643 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_shared_value.md @@ -0,0 +1,3 @@ +# PropagateSharedValue transformation {#openvino_docs_IE_DG_lpt_PropagateSharedValue} + +ngraph::pass::low_precision::PropagateSharedValue class represents the `PropagateSharedValue` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_through_precision_preserved.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_through_precision_preserved.md new file mode 100644 index 00000000000000..e183b5265d75b2 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_through_precision_preserved.md @@ -0,0 +1,3 @@ +# PropagateThroughPrecisionPreserved transformation {#openvino_docs_IE_DG_lpt_PropagateThroughPrecisionPreserved} + +ngraph::pass::low_precision::PropagateThroughPrecisionPreserved class represents the `PropagateThroughPrecisionPreserved` transformation. 
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_to_input.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_to_input.md new file mode 100644 index 00000000000000..08136272cdbe5e --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/propagate_to_input.md @@ -0,0 +1,3 @@ +# PropagateToInput transformation {#openvino_docs_IE_DG_lpt_PropagateToInput} + +ngraph::pass::low_precision::PropagateToInput class represents the `PropagateToInput` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/update_shared_precision_preserved.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/update_shared_precision_preserved.md new file mode 100644 index 00000000000000..aa18aea07cdfd3 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step2_markup/update_shared_precision_preserved.md @@ -0,0 +1,3 @@ +# UpdateSharedPrecisionPreserved transformation {#openvino_docs_IE_DG_lpt_UpdateSharedPrecisionPreserved} + +ngraph::pass::low_precision::UpdateSharedPrecisionPreserved class represents the `UpdateSharedPrecisionPreserved` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/clamp.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/clamp.md new file mode 100644 index 00000000000000..5e00b6a3ca0b62 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/clamp.md @@ -0,0 +1,3 @@ +# ClampTransformation transformation {#openvino_docs_IE_DG_lpt_ClampTransformation} + +ngraph::pass::low_precision::ClampTransformation class represents the `Clamp` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/prelu.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/prelu.md new file mode 100644 index 00000000000000..4ffcade1647238 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/prelu.md @@ -0,0 +1,3 @@ +# PReluTransformation transformation {#openvino_docs_IE_DG_lpt_PReluTransformation} + +ngraph::pass::low_precision::PReluTransformation class represents the `PRelu` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/relu.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/relu.md new file mode 100644 index 00000000000000..8831de7aee6f02 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/activation/relu.md @@ -0,0 +1,3 @@ +# ReluTransformation transformation {#openvino_docs_IE_DG_lpt_ReluTransformation} + +ngraph::pass::low_precision::ReluTransformation class represents the `Relu` operation transformation. 
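The pages above document the individual markup and main passes, but a plugin does not register them one by one: as noted in the Markup step, precision and per-tensor restrictions are passed to the `ngraph::pass::low_precision::LowPrecision` pass, which is the usual entry point for the whole LPT pipeline. The sketch below is a minimal illustration only; the restriction helper types and headers are assumptions modeled on the `lpt_mkldnn_plugin.cpp` snippet referenced earlier and may differ between OpenVINO releases.

```cpp
// Hedged sketch: registering the LPT pipeline from a plugin's TransformNetwork step.
// The restriction helper names below are assumptions based on the snippet referenced
// in the Markup step; verify them against your OpenVINO version.
#include <memory>
#include <vector>

#include <low_precision/low_precision.hpp>
#include <ngraph/opsets/opset1.hpp>
#include <ngraph/pass/manager.hpp>

void runLowPrecisionPipeline(const std::shared_ptr<ngraph::Function>& function) {
    using namespace ngraph::pass::low_precision;

    // Per-port precision restrictions: Convolution expects u8 activations and i8 weights.
    auto supportedPrecisions = std::vector<OperationPrecisionRestriction>({
        OperationPrecisionRestriction::create<ngraph::opset1::Convolution>({
            {0, {ngraph::element::u8}},
            {1, {ngraph::element::i8}},
        }),
    });

    // Operations that require per-tensor quantization on the given input port.
    auto perTensorQuantization = std::vector<OperationPerTensorQuantizationRestriction>({
        OperationPerTensorQuantizationRestriction::create<ngraph::opset1::Convolution>({0}),
    });

    // LowPrecision runs the prerequisites, markup, main and cleanup steps internally.
    ngraph::pass::Manager manager;
    manager.register_pass<LowPrecision>(supportedPrecisions, perTensorQuantization);
    manager.run_passes(function);
}
```

The `{0, {u8}}, {1, {i8}}` pairs correspond to the per-port `Precisions` attributes that `MarkupPrecisions` creates for the quantized `Convolution`, and the per-tensor restriction on input port 0 is what `MarkupPerTensorQuantization` uses in the example model above.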
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/add.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/add.md new file mode 100644 index 00000000000000..337c49a9749aed --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/add.md @@ -0,0 +1,57 @@ +# AddTransformation transformation {#openvino_docs_IE_DG_lpt_AddTransformation} + +ngraph::pass::low_precision::AddTransformation class represents the `Add` operation transformation. + +The transformation propagates dequantization subtraction from one input branch to another and propagates dequantization multiplication from the same branch through `Add` operation. In transformation result, one `Add` operation input branch is in low precision without dequantization operations (empty branch), another input branch is in original precision with updated dequantization operations (full branch). + +Criteria for selecting an empty branch in order of priority: + +*Step 1.* If one branch is quantized only, then the quantized branch is an empty branch. + +*Step 2.* If only one branch has `FakeQuantize` before dequantization operations, then another branch is an empty branch. + +*Step 3.* If some `FakeQuantize` has more than one consumer and another has only one, then the branch with `FakeQuantize` with several consumers is an empty branch. + +*Step 4.* Constant branch is in original precision, data branch is an empty branch. In this case, dequantization operations are propagated to a constant branch and will be fused in one constant. + +*Step 5.* If both branches have operations from the following list before `FakeQuantize`: `Convolution`, `GroupConvolution`, and `MatMul`, or do not have any operations from the list, then the branch with larger shape volume is empty. + +*Step 6.* If the operation before `FakeQuantize` has several consumers in any branch, then the branch is empty. + +If dequantization operations on the full branch have a `FakeQuantize` operation parent, then they will be fused with `FakeQuantize` during another low precision transformation. If a `FakeQuantize` operation has a parent operation from the list: `Convolution`, `GroupConvolution`, and `MatMul`, then during inference the `FakeQuantize` can be inferred in one plugin kernel with the parent operation. + +Depending on the plugin instruction set, low precision inference for the `Add` operation can be implemented in two logical steps in one plugin kernel: + + * Inference step #1: Operations in the full branch, for example, `Convolution` and `FakeQuantize` with fused dequantization operations, and `Add` can be inferred in the original precision. + + * Inference step #2: Inference step #1 result can be added with the empty branch tensor in low precision. + +This approach allows to infer the `Add` operation in the optimal way. 
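+
+A small worked instance (the per-channel values here are assumed purely for illustration): with `scale1 = 0.02`, `shift1 = 64`, `scale2 = 0.01` and `shift2 = 128`, the two inference steps above give
+
+\f[
+y_{ch,i} = 0.02 \cdot (x1_{ch,i} - 64) + 0.01 \cdot (x2_{ch,i} - 128) = 0.01 \cdot (2 \cdot (x1_{ch,i} - 128) + x2_{ch,i})
+\f]
+
+so the empty branch tensor `x2` is added in low precision, and a single dequantization multiplication by `0.01` remains on the result. This matches `scale1' = 2` and `shift1' = 128` in the general formulas below.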
+ +## Subgraph before transformation +The subgraph with quantized `Add` operation before transformation: + +\f[ +y_{ch,i}=(scale1_{ch} * (x1_{ch,i} - shift1_{ch})) + (scale2_{ch} * (x2_{ch,i} - shift2_{ch})) +\f] + +![Add before](img/add.common.png) + +## Subgraph after transformation +The subgraph with the `Add` operation after the transformation: + +\f[ +y_{ch,i}=scale2_{ch} * (scale1_{ch}' * (x1_{ch,i} - shift1_{ch}') + x2_{ch,i}) +\f] + +where: + +\f[ +scale1_{ch}' = scale1_{ch} / scale2_{ch} +\f] + +\f[ +shift1_{ch}' = shift1_{ch} + scale2_{ch} * shift2_{ch} / scale1_{ch} +\f] + +![Add before](img/add.transformed.png) \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.png new file mode 100644 index 00000000000000..7d05063836fc04 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d8d3621c4be5d3382cb164a19676253412f85b5f47fac27b024c726f1571647e +size 380663 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.svg new file mode 100644 index 00000000000000..ee254a66659b18 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.common.svg @@ -0,0 +1 @@ +QuantizeQuantizeDequantization on activationsMultiply{f32} {1, 3, 299, 299}Convert{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Dequantization on activationsMultiply{f32} {1, 3, 299, 299}Convert{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Add{f32} {1, 3, 299, 299}INT8 Convolution with zero pointSubtract{f32} {1, 3, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{i8} {6, 3, 1, 1}Constant{u8} {}Subtract{f32} {1, 3, 299, 299}Constant{i8} {6, 1, 1, 1}FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]Add{f32} {1, 3, 299, 299}FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]Branch#2Branch#1 \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.png new file mode 100644 index 00000000000000..16a5cc7f127803 --- /dev/null +++ 
b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ff2d26dc0b86f339458a2fafbbd6a88daf3d3dc6fcefb636243f42a6e91bc328 +size 492066 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.svg new file mode 100644 index 00000000000000..6c7fc6b7b5f41b --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/img/add.transformed.svg @@ -0,0 +1 @@ +Dequantization on activationsQuantizeQuantizeDequantization on activationsMultiply{f32} {1, 3, 299, 299}Convert{f32} {1, 3, 299, 299}Constant{f32} {1, 3, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{f32} {1, 3, 1, 1}Add{f32} {1, 3, 299, 299}INT8 Convolution with zero pointSubtract{f32} {1, 3, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{i8} {6, 3, 1, 1}Constant{u8} {}Subtract{f32} {1, 3, 299, 299}Constant{i8} {6, 1, 1, 1}FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]Add{f32} {1, 3, 299, 299}FakeQuantizename: fakeQuantize1rt info: IntervalsAlignment{combined: { -1.28, 1.27 }, preferablePrecisions: {i8}}{u8} {1, 3, 299, 299}Precisions {precisions: {u8}}Constant{f32} {1, 1, 1, 1}value: [-0.42667]Constant{f32} {1, 1, 1, 1}value: [0.42333]Constant{f32} {1, 1, 1, 1}value: [0.0]Constant{f32} {1, 1, 1, 1}value: [255.0]Branch#2Branch#1Multiply{f32} {1, 3, 299, 299}Constant{f32} {1, 3, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/multiply.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/multiply.md new file mode 100644 index 00000000000000..4389309394493a --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/multiply.md @@ -0,0 +1,3 @@ +# MultiplyTransformation transformation {#openvino_docs_IE_DG_lpt_MultiplyTransformation} + +ngraph::pass::low_precision::MultiplyTransformation class represents the `Multiply` operation transformation. \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/subtract.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/subtract.md new file mode 100644 index 00000000000000..8ba827aaea93cc --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/arithmetic/subtract.md @@ -0,0 +1,3 @@ +# SubtractTransformation transformation {#openvino_docs_IE_DG_lpt_SubtractTransformation} + +ngraph::pass::low_precision::SubtractTransformation class represents the `Subtract` operation transformation. 
\ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution.md new file mode 100644 index 00000000000000..c29aa9c5b29d8f --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution.md @@ -0,0 +1,34 @@ +# ConvolutionTransformation transformation {#openvino_docs_IE_DG_lpt_ConvolutionTransformation} + +ngraph::pass::low_precision::ConvolutionTransformation class represents the `Convolution` operation transformation. + +The transformation propagates dequantization operations on activations and weights through the `Convolution` operation. The transformation supports two weights quantization approaches: +* quantized weights in low precision with dequantization operations, +* weights in original precision with `FakeQuantize` operation. + +The resulting dequantization `Multiply` constant value *result* is calculated as the product of the dequantization `Multiply` constant value on activations *a* and the dequantization `Multiply` constant value on weights *b*: + +\f[ +result_{i} = a_{i} \cdot b_{i} +\f] + +## Limitations + +* Dequantization on activations must be per-tensor, which means that the dequantization `Multiply` constant value on activations must be a scalar. + +## Subgraph before transformation + +### Quantized weights in low precision with dequantization operations +The subgraph with quantized `Convolution` before the transformation, with weights stored as a low precision constant followed by dequantization operations: + +![Convolution before](img/fq_and_convolution.common.png) + +### Weights in original precision with FakeQuantize operation +The subgraph with quantized `Convolution` before the transformation, with weights in the original precision and a `FakeQuantize` operation: + +![Convolution before](img/fq_fq_and_convolution.common.png) + +## Subgraph after transformation +The subgraph with the `Convolution` operation after the transformation: + +![Convolution after](img/fq_and_convolution.transformed.png) \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution_backprop_data.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution_backprop_data.md new file mode 100644 index 00000000000000..aa9af9f28b8f5d --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/convolution_backprop_data.md @@ -0,0 +1,3 @@ +# ConvolutionBackpropDataTransformation transformation {#openvino_docs_IE_DG_lpt_ConvolutionBackpropDataTransformation} + +ngraph::pass::low_precision::ConvolutionBackpropDataTransformation class represents the `ConvolutionBackpropData` operation transformation.
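The `result_{i} = a_{i} \cdot b_{i}` folding in the ConvolutionTransformation page above, and the per-tensor limitation on activations, can also be illustrated numerically. The following standalone sketch is not part of the patch and uses made-up values: it accumulates one output value of a convolution on dequantized data and on raw quantized data, and shows that a single scalar activation scale multiplied by the per-channel weight scale reproduces the result, which is exactly why the `Multiply` constants can be moved after the `Convolution`.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iostream>

// With a scalar (per-tensor) activation scale `a` and a per-output-channel weight
// scale `b`, both scales factor out of the convolution accumulation.
int main() {
    const double a = 0.02;                       // per-tensor dequantization scale on activations
    const double zero_point = 7.0;               // per-tensor zero point on activations
    const double b = 0.5;                        // dequantization scale of one weight output channel
    const std::array<double, 4> x = {12, 200, 45, 3};   // quantized activations (one receptive field)
    const std::array<double, 4> w = {-1, 3, 2, -2};     // quantized weights of that output channel

    double conv_fp32 = 0.0;    // convolution on dequantized values
    double conv_int = 0.0;     // convolution on quantized values (zero point subtracted)
    for (std::size_t k = 0; k < x.size(); ++k) {
        conv_fp32 += (a * (x[k] - zero_point)) * (b * w[k]);
        conv_int  += (x[k] - zero_point) * w[k];
    }
    const double result_scale = a * b;           // result_i = a_i * b_i
    std::cout << conv_fp32 << " == " << result_scale * conv_int << "\n";
    assert(std::abs(conv_fp32 - result_scale * conv_int) < 1e-9);
    return 0;
}
```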
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/group_convolution.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/group_convolution.md new file mode 100644 index 00000000000000..c5571fbada39c1 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/group_convolution.md @@ -0,0 +1,3 @@ +# GroupConvolutionTransformation transformation {#openvino_docs_IE_DG_lpt_GroupConvolutionTransformation} + +ngraph::pass::low_precision::GroupConvolutionTransformation class represents the `GroupConvolution` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.png new file mode 100644 index 00000000000000..7b686f72935d8e --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9e5bfd5ca52ea6660e0ff67afefc98d64941eab6e8b464116242a6e044f318f5 +size 207602 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.svg new file mode 100644 index 00000000000000..f45ca77e426854 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.common.svg @@ -0,0 +1 @@ +FP32 Convolution with quantized weightsQuantized weightsDequantization on activationsConvolution{f32} {1, 6, 299, 299}Constant{i8} {6, 3, 1, 1}Dequantization on weightsMultiply{f32} {6, 3, 1, 1}Convert{f32} {6, 3, 1, 1}Constant{f32} {6, 1, 1, 1}Subtract{f32} {6, 3, 1, 1}Constant{i8} {6, 1, 1, 1}Convert{f32} {6, 1, 1, 1}Multiply{f32} {1, 3, 299, 299}Convert{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.png new file mode 100644 index 00000000000000..63fa693e6b0a76 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:756c225ee8e1da046e0210bf0696185b3939378f10b4ed6d757e43070d379436 +size 135804 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.svg 
b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.svg new file mode 100644 index 00000000000000..57bab915e3ac3b --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_and_convolution.transformed.svg @@ -0,0 +1 @@ +DequantizationINT8 Convolution with zero pointSubtract{f32} {1, 3, 299, 299}Multiply{f32} {1, 6, 299, 299}Convolution{f32} {1, 6, 299, 299}Constant{i8} {6, 3, 1, 1}Constant{u8} {}Constant{f32} {1, 6, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{i8} {6, 1, 1, 1}Zero point on activationsZero point on weights \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.png b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.png new file mode 100644 index 00000000000000..b887ab2b7642b1 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:08d4116490ab329636fced24c292636fbe00856976b19e5219e433bc2c6e4e16 +size 190590 diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.svg b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.svg new file mode 100644 index 00000000000000..c475faa62ac6b7 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/convolution/img/fq_fq_and_convolution.common.svg @@ -0,0 +1 @@ +FP32 Convolution with quantized weightsNot quantized weights in original precisionConvolution{f32} {1, 6, 299, 299}FakeQuantizelevels: 255{f32} {6, 3, 299, 299}Constant{f32} {6, 3, 1, 1}Constant{f32} {1, 1, 1, 1}Value: [-12.8]Constant{f32} {1, 1, 1, 1}Value: [12.7]Constant{f32} {1, 1, 1, 1}Value: [-12.8]Constant{f32} {1, 1, 1, 1}Value: [12.7]Dequantization on activationsMultiply{f32} {1, 3, 299, 299}Convert{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1}Subtract{f32} {1, 3, 299, 299}Constant{f32} {1, 1, 1, 1} \ No newline at end of file diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/image/interpolate.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/image/interpolate.md new file mode 100644 index 00000000000000..c6d3a3fbc257ab --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/image/interpolate.md @@ -0,0 +1,3 @@ +# InterpolateTransformation transformation {#openvino_docs_IE_DG_lpt_InterpolateTransformation} + +ngraph::pass::low_precision::InterpolateTransformation class represents the `Interpolate` operation transformation. 
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/matrix/mat_mul.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/matrix/mat_mul.md new file mode 100644 index 00000000000000..3a54ca5e5747e9 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/matrix/mat_mul.md @@ -0,0 +1,3 @@ +# MatMulTransformation transformation {#openvino_docs_IE_DG_lpt_MatMulTransformation} + +ngraph::pass::low_precision::MatMulTransformation class represents the `MatMul` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/concat.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/concat.md new file mode 100644 index 00000000000000..698d3e3cc985ed --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/concat.md @@ -0,0 +1,3 @@ +# ConcatTransformation transformation {#openvino_docs_IE_DG_lpt_ConcatTransformation} + +ngraph::pass::low_precision::ConcatTransformation class represents the `Concat` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/depth_to_space.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/depth_to_space.md new file mode 100644 index 00000000000000..c3ae40d70ba01c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/depth_to_space.md @@ -0,0 +1,3 @@ +# DepthToSpaceTransformation transformation {#openvino_docs_IE_DG_lpt_DepthToSpaceTransformation} + +ngraph::pass::low_precision::DepthToSpaceTransformation class represents the `DepthToSpace` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/pad.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/pad.md new file mode 100644 index 00000000000000..feb8561f3c00ec --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/pad.md @@ -0,0 +1,3 @@ +# PadTransformation transformation {#openvino_docs_IE_DG_lpt_PadTransformation} + +ngraph::pass::low_precision::PadTransformation class represents the `Pad` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/shuffle_channels.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/shuffle_channels.md new file mode 100644 index 00000000000000..e41e1c05aa32b5 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/shuffle_channels.md @@ -0,0 +1,3 @@ +# ShuffleChannelsTransformation transformation {#openvino_docs_IE_DG_lpt_ShuffleChannelsTransformation} + +ngraph::pass::low_precision::ShuffleChannelsTransformation class represents the `ShuffleChannels` operation transformation. 
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/split.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/split.md new file mode 100644 index 00000000000000..166ad30e3dc80f --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/split.md @@ -0,0 +1,3 @@ +# SplitTransformation transformation {#openvino_docs_IE_DG_lpt_SplitTransformation} + +ngraph::pass::low_precision::SplitTransformation class represents the `Split` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/strided_slice.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/strided_slice.md new file mode 100644 index 00000000000000..4b385dc6e73a49 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/strided_slice.md @@ -0,0 +1,3 @@ +# StridedSliceTransformation transformation {#openvino_docs_IE_DG_lpt_StridedSliceTransformation} + +ngraph::pass::low_precision::StridedSliceTransformation class represents the `StridedSlice` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/transpose.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/transpose.md new file mode 100644 index 00000000000000..bcf2ac02c502c0 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/transpose.md @@ -0,0 +1,3 @@ +# TransposeTransformation transformation {#openvino_docs_IE_DG_lpt_TransposeTransformation} + +ngraph::pass::low_precision::TransposeTransformation class represents the `Transpose` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/variadic_split.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/variadic_split.md new file mode 100644 index 00000000000000..10bc02ead1c255 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/movement/variadic_split.md @@ -0,0 +1,3 @@ +# VariadicSplitTransformation transformation {#openvino_docs_IE_DG_lpt_VariadicSplitTransformation} + +ngraph::pass::low_precision::VariadicSplitTransformation class represents the `VariadicSplit` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/mvn.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/mvn.md new file mode 100644 index 00000000000000..3b712696b5447c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/mvn.md @@ -0,0 +1,3 @@ +# MVNTransformation transformation {#openvino_docs_IE_DG_lpt_MVNTransformation} + +ngraph::pass::low_precision::MVNTransformation class represents the `MVN` operation transformation. 
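Most of the movement transformations listed above (DepthToSpace, ShuffleChannels, Split, StridedSlice, Transpose, VariadicSplit) rely on the fact that dequantization is elementwise, so a per-tensor dequantization commutes with any operation that only selects or reorders elements; Concat and Pad need extra care (interval alignment, padding values) that the corresponding passes handle themselves. A minimal standalone check with made-up values, using a simple reversal as a stand-in for such a data-movement operation:

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <vector>

// Per-tensor dequantization y = scale * (x - shift) is applied elementwise, so it
// commutes with operations that only reorder elements.
int main() {
    const double scale = 0.1, shift = 128.0;
    const std::vector<double> x = {0, 17, 255, 64, 200, 5};

    auto dequantize = [&](std::vector<double> v) {
        for (auto& e : v) e = scale * (e - shift);
        return v;
    };
    auto reorder = [](std::vector<double> v) {
        std::reverse(v.begin(), v.end());   // any permutation behaves the same way
        return v;
    };

    const auto a = reorder(dequantize(x));  // dequantize first, then move data
    const auto b = dequantize(reorder(x));  // move data first, then dequantize
    assert(a == b);
    std::cout << "reorder(dequantize(x)) == dequantize(reorder(x))\n";
    return 0;
}
```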
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/normalize_l2.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/normalize_l2.md new file mode 100644 index 00000000000000..6f86660f1a50ae --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/normalization/normalize_l2.md @@ -0,0 +1,3 @@ +# NormalizeL2Transformation transformation {#openvino_docs_IE_DG_lpt_NormalizeL2Transformation} + +ngraph::pass::low_precision::NormalizeL2Transformation class represents the `NormalizeL2` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/avg_pool.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/avg_pool.md new file mode 100644 index 00000000000000..d53a8e28a783f6 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/avg_pool.md @@ -0,0 +1,3 @@ +# AvgPoolTransformation transformation {#openvino_docs_IE_DG_lpt_AvgPoolTransformation} + +ngraph::pass::low_precision::AvgPoolTransformation class represents the `AvgPool` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/max_pool.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/max_pool.md new file mode 100644 index 00000000000000..ce7f2a28c7cb51 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/pooling/max_pool.md @@ -0,0 +1,3 @@ +# MaxPoolTransformation transformation {#openvino_docs_IE_DG_lpt_MaxPoolTransformation} + +ngraph::pass::low_precision::MaxPoolTransformation class represents the `MaxPool` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fake_quantize.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fake_quantize.md new file mode 100644 index 00000000000000..8441554f637c82 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fake_quantize.md @@ -0,0 +1,3 @@ +# FakeQuantizeTransformation transformation {#openvino_docs_IE_DG_lpt_FakeQuantizeTransformation} + +ngraph::pass::low_precision::FakeQuantizeTransformation class represents the `FakeQuantize` operation transformation. 
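The FakeQuantize-related transformations above all reason about the reference semantics of the `FakeQuantize` operation: clamp to the input range, snap to one of `levels` evenly spaced values, then map to the output range. The scalar sketch below follows that commonly documented formula and is for illustration only; the real operation works elementwise with broadcastable interval inputs, and the interval constants here are simply taken from the example diagrams earlier on this page.

```cpp
#include <cmath>
#include <iostream>

// Scalar sketch of FakeQuantize semantics.
double fake_quantize(double x,
                     double input_low, double input_high,
                     double output_low, double output_high,
                     int levels) {
    if (x <= input_low) return output_low;
    if (x > input_high) return output_high;
    const double q = std::round((x - input_low) / (input_high - input_low) * (levels - 1));
    return q / (levels - 1) * (output_high - output_low) + output_low;
}

int main() {
    // A u8-like quantization of the interval [-0.42667, 0.42333] to [0, 255].
    std::cout << fake_quantize(0.1, -0.42667, 0.42333, 0.0, 255.0, 256) << "\n";
    return 0;
}
```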
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fold_fake_quantize.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fold_fake_quantize.md new file mode 100644 index 00000000000000..34ec1af1b0abea --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/quantization/fold_fake_quantize.md @@ -0,0 +1,3 @@ +# FoldFakeQuantizeTransformation transformation {#openvino_docs_IE_DG_lpt_FoldFakeQuantizeTransformation} + +ngraph::pass::low_precision::FoldFakeQuantizeTransformation class represents the `FoldFakeQuantize` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_max.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_max.md new file mode 100644 index 00000000000000..27153c02125288 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_max.md @@ -0,0 +1,3 @@ +# ReduceMaxTransformation transformation {#openvino_docs_IE_DG_lpt_ReduceMaxTransformation} + +ngraph::pass::low_precision::ReduceMaxTransformation class represents the `ReduceMax` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_mean.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_mean.md new file mode 100644 index 00000000000000..ca05bd56a8d32a --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_mean.md @@ -0,0 +1,3 @@ +# ReduceMeanTransformation transformation {#openvino_docs_IE_DG_lpt_ReduceMeanTransformation} + +ngraph::pass::low_precision::ReduceMeanTransformation class represents the `ReduceMean` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_min.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_min.md new file mode 100644 index 00000000000000..0d5d0f74fd7803 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_min.md @@ -0,0 +1,3 @@ +# ReduceMinTransformation transformation {#openvino_docs_IE_DG_lpt_ReduceMinTransformation} + +ngraph::pass::low_precision::ReduceMinTransformation class represents the `ReduceMin` operation transformation. 
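The `ReduceMax` and `ReduceMin` transformations above (and, by the same argument, `MaxPool`) can move the dequantization after the reduction because `y = scale * (x - shift)` with a positive scale is monotonic, so it preserves which element is the maximum or minimum. A tiny standalone check of the max case, with made-up values, not part of the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <iostream>
#include <vector>

// With a positive per-tensor scale, dequantization is strictly increasing, so
// max(scale*(x - shift)) == scale*(max(x) - shift). The same holds for min.
int main() {
    const double scale = 0.25, shift = 10.0;    // made-up per-tensor dequantization
    const std::vector<double> x = {3, 250, 17, 99, 180};

    double max_dequantized = scale * (x.front() - shift);
    for (double v : x) max_dequantized = std::max(max_dequantized, scale * (v - shift));

    const double max_quantized = *std::max_element(x.begin(), x.end());
    assert(std::abs(max_dequantized - scale * (max_quantized - shift)) < 1e-12);
    std::cout << max_dequantized << " == " << scale * (max_quantized - shift) << "\n";
    return 0;
}
```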
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_sum.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_sum.md new file mode 100644 index 00000000000000..b67ebf5d3a0062 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/reduction/reduce_sum.md @@ -0,0 +1,3 @@ +# ReduceSumTransformation transformation {#openvino_docs_IE_DG_lpt_ReduceSumTransformation} + +ngraph::pass::low_precision::ReduceSumTransformation class represents the `ReduceSum` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/reshape.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/reshape.md new file mode 100644 index 00000000000000..b4c69a720bc458 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/reshape.md @@ -0,0 +1,3 @@ +# ReshapeTransformation transformation {#openvino_docs_IE_DG_lpt_ReshapeTransformation} + +ngraph::pass::low_precision::ReshapeTransformation class represents the `Reshape` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/squeeze.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/squeeze.md new file mode 100644 index 00000000000000..a409c8ca61c923 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/squeeze.md @@ -0,0 +1,3 @@ +# SqueezeTransformation transformation {#openvino_docs_IE_DG_lpt_SqueezeTransformation} + +ngraph::pass::low_precision::SqueezeTransformation class represents the `Squeeze` operation transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/unsqueeze.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/unsqueeze.md new file mode 100644 index 00000000000000..a9ffac0fa4aa01 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step3_main/shape/unsqueeze.md @@ -0,0 +1,3 @@ +# UnsqueezeTransformation transformation {#openvino_docs_IE_DG_lpt_UnsqueezeTransformation} + +ngraph::pass::low_precision::UnsqueezeTransformation class represents the `Unsqueeze` operation transformation. 
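`ReduceSum` is the one reduction where the dequantization shift cannot simply be reused as is: summing N dequantized values multiplies the shift contribution by N, since `sum_i scale*(x_i - shift) = scale*(sum_i x_i - N*shift)`. The standalone arithmetic check below uses made-up values and only demonstrates the algebra; how the actual `ReduceSumTransformation` chooses to keep or fold the `Subtract` is decided by the pass itself.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>
#include <vector>

// Arithmetic behind moving a per-tensor dequantization through ReduceSum:
//   sum_i scale*(x_i - shift) == scale * (sum_i x_i - N * shift).
int main() {
    const double scale = 0.05, shift = 4.0;     // made-up per-tensor dequantization
    const std::vector<double> x = {10, 20, 30, 40};

    double sum_dequantized = 0.0, sum_quantized = 0.0;
    for (double v : x) {
        sum_dequantized += scale * (v - shift);
        sum_quantized   += v;
    }
    const double n = static_cast<double>(x.size());
    assert(std::abs(sum_dequantized - scale * (sum_quantized - n * shift)) < 1e-9);
    std::cout << sum_dequantized << " == " << scale * (sum_quantized - n * shift) << "\n";
    return 0;
}
```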
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fake_quantize_decomposition.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fake_quantize_decomposition.md new file mode 100644 index 00000000000000..83c4eb3d9e674c --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fake_quantize_decomposition.md @@ -0,0 +1,3 @@ +# FakeQuantizeDecompositionTransformation transformation {#openvino_docs_IE_DG_lpt_FakeQuantizeDecompositionTransformation} + +ngraph::pass::low_precision::FakeQuantizeDecompositionTransformation class represents the `FakeQuantizeDecompositionTransformation` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fold_convert.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fold_convert.md new file mode 100644 index 00000000000000..c84e19da98e19d --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fold_convert.md @@ -0,0 +1,3 @@ +# FoldConvertTransformation transformation {#openvino_docs_IE_DG_lpt_FoldConvertTransformation} + +ngraph::pass::low_precision::FoldConvertTransformation class represents the `FoldConvertTransformation` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_convert.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_convert.md new file mode 100644 index 00000000000000..3b720729c7fad2 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_convert.md @@ -0,0 +1,3 @@ +# FuseConvertTransformation transformation {#openvino_docs_IE_DG_lpt_FuseConvertTransformation} + +ngraph::pass::low_precision::FuseConvertTransformation class represents the `FuseConvertTransformation` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_multiply_to_fake_quantize.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_multiply_to_fake_quantize.md new file mode 100644 index 00000000000000..10cab1a1788f00 --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_multiply_to_fake_quantize.md @@ -0,0 +1,3 @@ +# FuseMultiplyToFakeQuantizeTransformation transformation {#openvino_docs_IE_DG_lpt_FuseMultiplyToFakeQuantizeTransformation} + +ngraph::pass::low_precision::FuseMultiplyToFakeQuantizeTransformation class represents the `FuseMultiplyToFakeQuantizeTransformation` transformation. 
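For readers new to the `FakeQuantizeDecompositionTransformation` page added above: the decomposition splits a floating-point `FakeQuantize` into an integer `FakeQuantize` plus dequantization (`Convert`, `Subtract`, `Multiply`). The back-of-the-envelope sketch below shows the u8 arithmetic only; the interval values are made up, the zero point is left unrounded for simplicity, and the real pass derives the target precision and intervals from the attributes collected during the markup steps.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>

// Sketch of the arithmetic behind decomposing FakeQuantize with output interval
// [low, high] and 256 levels into a u8 quantization plus Subtract/Multiply
// dequantization: x is approximately scale * (q - zero_point), q in [0, 255].
int main() {
    const double low = -0.42667, high = 0.42333;      // made-up output interval
    const double levels = 256.0;

    const double scale = (high - low) / (levels - 1.0);
    const double zero_point = -low / scale;            // so that q = 0 maps back to `low`

    // Round-trip one value through quantize -> dequantize.
    const double x = 0.1;
    const double q = std::round(x / scale + zero_point);   // quantize onto the u8 grid
    const double x_restored = scale * (q - zero_point);    // dequantization

    std::cout << "scale=" << scale << " zero_point=" << zero_point
              << " x=" << x << " restored=" << x_restored << "\n";
    assert(std::abs(x - x_restored) <= scale / 2 + 1e-12); // error bounded by half a step
    return 0;
}
```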
diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_subtract_to_fake_quantize.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_subtract_to_fake_quantize.md new file mode 100644 index 00000000000000..7bd326435d68bd --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/fuse_subtract_to_fake_quantize.md @@ -0,0 +1,3 @@ +# FuseSubtractToFakeQuantizeTransformation transformation {#openvino_docs_IE_DG_lpt_FuseSubtractToFakeQuantizeTransformation} + +ngraph::pass::low_precision::FuseSubtractToFakeQuantizeTransformation class represents the `FuseSubtractToFakeQuantizeTransformation` transformation. diff --git a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/multiply_to_group_convolution.md b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/multiply_to_group_convolution.md new file mode 100644 index 00000000000000..27742998abdf6b --- /dev/null +++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/transformations/step4_cleanup/multiply_to_group_convolution.md @@ -0,0 +1,3 @@ +# MultiplyToGroupConvolutionTransformation transformation {#openvino_docs_IE_DG_lpt_MultiplyToGroupConvolutionTransformation} + +ngraph::pass::low_precision::MultiplyToGroupConvolutionTransformation class represents the `MultiplyToGroupConvolutionTransformation` transformation. diff --git a/docs/documentation.md b/docs/documentation.md index bd4444f14696ce..e421351fca7170 100644 --- a/docs/documentation.md +++ b/docs/documentation.md @@ -75,6 +75,7 @@ Inference Engine Plugin Developer Guide groupie_dev_api + Plugin Transformation Pipeline .. 
toctree:: :maxdepth: 1 diff --git a/docs/doxygen/ie_docs.xml b/docs/doxygen/ie_docs.xml new file mode 100644 index 00000000000000..3a0f6d854eb4c8 --- /dev/null +++ b/docs/doxygen/ie_docs.xml @@ -0,0 +1,383 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/docs/snippets/lpt_mkldnn_plugin.cpp b/docs/snippets/lpt_mkldnn_plugin.cpp new file mode 100644 index 00000000000000..1808b011c37b6e --- /dev/null +++ b/docs/snippets/lpt_mkldnn_plugin.cpp @@ -0,0 +1,221 @@ +#include + +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +namespace ngraph { +namespace pass { +namespace device { + +class ConvertOpSet1ToDeviceSpecific: public ngraph::pass::FunctionPass { +public: + bool run_on_function(std::shared_ptr f) override { + return true; + } +}; + +} // namespace device +} // pass +} // ngraph + +int main() { +std::shared_ptr nGraphFunc; +ngraph::pass::Manager manager; +auto pass_config = manager.get_pass_config(); +//! [lpt_common] +// check if the function is quantized to ignore LPT transformations for not quantized function to speed up model loading +const bool useLpt = ngraph::pass::low_precision::LowPrecision::isFunctionQuantized(nGraphFunc); +if (useLpt) { + // disable constant folding on constant subgraph to use the subgraph for LPT + manager.register_pass(std::vector{ + ngraph::element::i8, ngraph::element::u8, ngraph::element::i4, ngraph::element::u4 + }); +} + +// nGraph common transformations happen here + +if (useLpt) { + // convert subtract constant to INT8 to prevent unnecessary FP16 to FP32 conversion + manager.register_pass(std::vector{ + ngraph::element::i8, ngraph::element::u8, ngraph::element::i4, ngraph::element::u4 }); +} + +// nGraph common transformations happen here + +if (useLpt) { + // convert not supported cases FakeQuantize -> Convert -> Convert -> Subtract -> Multiply to a single FakeQuantize + pass_config->set_callback([](const std::shared_ptr &node) -> bool { + return ngraph::pass::low_precision::NetworkHelper::areQuantizeAndDequantizeSupportedForMultiply(node); + }); + + // convert not supported cases FakeQuantize -> Convert -> Convert -> Subtract -> Multiply to a single FakeQuantize + pass_config->set_callback([](const std::shared_ptr &node) -> bool { + return ngraph::pass::low_precision::NetworkHelper::areQuantizeAndDequantizeSupportedForSubtract(node); + }); +} + +manager.run_passes(nGraphFunc); +//! [lpt_common] + +//! 
[lpt_execution] +using namespace ngraph::pass::low_precision; +if (useLpt) { + // Low precision transformations plugin specific configuration: restrictions definition + auto supportedPrecisions = std::vector({ + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8}}, + {1, {ngraph::element::i8}}, + }), + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8, ngraph::element::i8}}, + {1, {ngraph::element::i8}} + }), + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8}}, + {1, {ngraph::element::i8}} + }), + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8}}, + {1, {ngraph::element::i8}}, + }), + }); + + // Low precision transformations plugin specific configuration: per-tensor quantization operations definition + auto perTensorQuantization = std::vector({ + OperationPerTensorQuantizationRestriction::create({0}), + OperationPerTensorQuantizationRestriction::create({0}) + }); + + // Low precision transformations instantiation and registration in pass manager + ngraph::pass::Manager lptManager; + lptManager.register_pass(supportedPrecisions, perTensorQuantization); + + // Low precision transformations plugin specific configuration: transformation callbacks definition + lptManager.get_pass_config()->set_callback([](const std::shared_ptr& node) -> bool { + if (const auto multiply = std::dynamic_pointer_cast(node)) { + return !MultiplyToGroupConvolutionTransformation::canBeTransformedToGroupConvolution(multiply); + } + return false; + }); + lptManager.get_pass_config()->set_callback([](const std::shared_ptr& node) -> bool { + return LayerTransformation::isAsymmetricQuantization(node) || WeightableLayerTransformation::isAsymmetricOnWeights(node); + }); + lptManager.get_pass_config()->set_callback([](const std::shared_ptr& node) -> bool { + return MultiplyToGroupConvolutionTransformation::isDynamicOrScalar(node); + }); + + // Low precision transformations execution + lptManager.run_passes(nGraphFunc); +} +//! [lpt_execution] + +//! [lpt_device] +ngraph::pass::Manager deviceSpecificManager; +deviceSpecificManager.register_pass(); +deviceSpecificManager.run_passes(nGraphFunc); +//! [lpt_device] + +return 0; +} + +int lpt_supported_precisions() { +std::shared_ptr nGraphFunc; +ngraph::pass::Manager manager; + +using namespace ngraph::pass::low_precision; +//! [lpt_supported_precisions] +auto supportedPrecisions = std::vector({ + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8}}, + {1, {ngraph::element::i8}}, + }), +}); + +ngraph::pass::Manager lptManager; +lptManager.register_pass(supportedPrecisions); +lptManager.run_passes(nGraphFunc); +//! [lpt_supported_precisions] + +ngraph::pass::Manager deviceSpecificManager; +deviceSpecificManager.register_pass(); +deviceSpecificManager.run_passes(nGraphFunc); + +return 0; +} + +int per_tensor_quantization() { +std::shared_ptr nGraphFunc; +//! [per_tensor_quantization] +using namespace ngraph::pass::low_precision; + +const std::vector emptyRestrictions; + +auto perTensorQuantization = std::vector({ + OperationPerTensorQuantizationRestriction::create({0}) +}); + +ngraph::pass::Manager lptManager; +lptManager.register_pass(emptyRestrictions, perTensorQuantization); +lptManager.run_passes(nGraphFunc); +//! [per_tensor_quantization] + +return 0; +} + +int asymmetric_quantization() { +std::shared_ptr nGraphFunc; +ngraph::pass::Manager manager; +auto pass_config = manager.get_pass_config(); + + +//! 
[asymmetric_quantization] +using namespace ngraph::pass::low_precision; +ngraph::pass::Manager lptManager; +lptManager.register_pass(); +lptManager.get_pass_config()->set_callback([](const std::shared_ptr& node) -> bool { + return LayerTransformation::isAsymmetricQuantization(node) || WeightableLayerTransformation::isAsymmetricOnWeights(node); +}); +lptManager.run_passes(nGraphFunc); +//! [asymmetric_quantization] + +return 0; +} + +int lpt_markup_pipeline() { +std::shared_ptr nGraphFunc; +ngraph::pass::Manager manager; + +using namespace ngraph::pass::low_precision; +//! [lpt_markup_pipeline] +auto supportedPrecisions = std::vector({ + OperationPrecisionRestriction::create({ + {0, {ngraph::element::u8}}, + {1, {ngraph::element::i8}}, + }), +}); + +auto perTensorQuantization = std::vector({ + OperationPerTensorQuantizationRestriction::create({0}) +}); + +ngraph::pass::Manager lptManager; +lptManager.register_pass(supportedPrecisions, perTensorQuantization); +lptManager.run_passes(nGraphFunc); +//! [lpt_markup_pipeline] + +ngraph::pass::Manager deviceSpecificManager; +deviceSpecificManager.register_pass(); +deviceSpecificManager.run_passes(nGraphFunc); + +return 0; +} diff --git a/src/common/low_precision_transformations/include/low_precision/add.hpp b/src/common/low_precision_transformations/include/low_precision/add.hpp index 92caba9f382a5f..f5bfc4d06fbe1d 100644 --- a/src/common/low_precision_transformations/include/low_precision/add.hpp +++ b/src/common/low_precision_transformations/include/low_precision/add.hpp @@ -11,6 +11,15 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief AddTransformation propagates dequantization subtraction from one input branch to another and + * propagates dequantization multiplication from the same branch through Add operation. + * + * For more details about the transformation, refer to + * [AddTransformation](@ref openvino_docs_IE_DG_lpt_AddTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API AddTransformation : public EltwiseBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp b/src/common/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp index 87befcfd24fe82..63500cf39b69e3 100644 --- a/src/common/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp +++ b/src/common/low_precision_transformations/include/low_precision/align_quantization_intervals.hpp @@ -18,6 +18,15 @@ class LP_TRANSFORMATIONS_API AlignQuantizationIntervals; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief AlignQuantizationIntervals transformation marks precision preserved operations subgraph by `IntervalsAlignmentAttribute` + * after FakeQuantize operations. + * + * For more details about the transformation, refer to + * [AlignQuantizationIntervals](@ref openvino_docs_IE_DG_lpt_AlignQuantizationIntervals) page + * in the Inference Engine Developer Guide. 
+ */ class ngraph::pass::low_precision::AlignQuantizationIntervals : public ngraph::pass::FunctionPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp b/src/common/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp index 1b354c5fd5c3f1..f45d447cdfe04d 100644 --- a/src/common/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp +++ b/src/common/low_precision_transformations/include/low_precision/align_quantization_parameters.hpp @@ -19,6 +19,15 @@ class LP_TRANSFORMATIONS_API AlignQuantizationParameters; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief AlignQuantizationParameters transformation marks precision preserved operations subgraph by `QuantizationAlignmentAttribute` + * attribute after FakeQuantize operations. + * + * For more details about the transformation, refer to + * [AlignQuantizationParameters](@ref openvino_docs_IE_DG_lpt_AlignQuantizationParameters) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::AlignQuantizationParameters : public ngraph::pass::FunctionPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/avg_pool.hpp b/src/common/low_precision_transformations/include/low_precision/avg_pool.hpp index 12d5eaf7e2abe2..f6fa113fc8a432 100644 --- a/src/common/low_precision_transformations/include/low_precision/avg_pool.hpp +++ b/src/common/low_precision_transformations/include/low_precision/avg_pool.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief AvgPoolTransformation propagates dequantization operations through AvgPool operation. + * + * For more details about the transformation, refer to + * [AvgPoolTransformation](@ref openvino_docs_IE_DG_lpt_AvgPoolTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API AvgPoolTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/clamp.hpp b/src/common/low_precision_transformations/include/low_precision/clamp.hpp index a3cf76a1284470..0af98ae690a561 100644 --- a/src/common/low_precision_transformations/include/low_precision/clamp.hpp +++ b/src/common/low_precision_transformations/include/low_precision/clamp.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ClampTransformation propagates dequantization operations through Clamp operation. + * + * For more details about the transformation, refer to + * [ClampTransformation](@ref openvino_docs_IE_DG_lpt_ClampTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API ClampTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/concat.hpp b/src/common/low_precision_transformations/include/low_precision/concat.hpp index c1f752972ad3cc..448b600f99445e 100644 --- a/src/common/low_precision_transformations/include/low_precision/concat.hpp +++ b/src/common/low_precision_transformations/include/low_precision/concat.hpp @@ -19,6 +19,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ConcatTransformation propagates dequantization operations through Concat operation. + * + * For more details about the transformation, refer to + * [ConcatTransformation](@ref openvino_docs_IE_DG_lpt_ConcatTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ConcatTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/convert_subtract_constant.hpp b/src/common/low_precision_transformations/include/low_precision/convert_subtract_constant.hpp index f9584eb6842e60..d03b4895538309 100644 --- a/src/common/low_precision_transformations/include/low_precision/convert_subtract_constant.hpp +++ b/src/common/low_precision_transformations/include/low_precision/convert_subtract_constant.hpp @@ -20,6 +20,15 @@ class LP_TRANSFORMATIONS_API ConvertSubtractConstant; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief ConvertSubtractConstant marks Convert operations on constant subgraph by DISABLED_CONSTANT_FOLDING attribute + * to prevent constant folding. + * + * For more details about the transformation, refer to + * [ConvertSubtractConstant](@ref openvino_docs_IE_DG_lpt_ConvertSubtractConstant) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::ConvertSubtractConstant : public ngraph::pass::MatcherPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/convolution.hpp b/src/common/low_precision_transformations/include/low_precision/convolution.hpp index b49fcc89c4aee4..c124f1a7bf9aa8 100644 --- a/src/common/low_precision_transformations/include/low_precision/convolution.hpp +++ b/src/common/low_precision_transformations/include/low_precision/convolution.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ConvolutionTransformation propagates dequantization operations through Convolution operation. + * + * For more details about the transformation, refer to + * [ConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API ConvolutionTransformation : public WeightableLayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp b/src/common/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp index a1176e71efffc7..c64cc7198c251d 100644 --- a/src/common/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp +++ b/src/common/low_precision_transformations/include/low_precision/convolution_backprop_data.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ConvolutionBackpropDataTransformation propagates dequantization operations through ConvolutionBackpropData operation. + * + * For more details about the transformation, refer to + * [ConvolutionBackpropDataTransformation](@ref openvino_docs_IE_DG_lpt_ConvolutionBackpropDataTransformation) page in + * the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ConvolutionBackpropDataTransformation : public WeightableLayerTransformation { public: ConvolutionBackpropDataTransformation(const Params& params = Params()); diff --git a/src/common/low_precision_transformations/include/low_precision/create_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/create_attribute.hpp index c7b7c4688262d5..8388003778b561 100644 --- a/src/common/low_precision_transformations/include/low_precision/create_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/create_attribute.hpp @@ -31,6 +31,13 @@ enum class AttributeSource { OutputPort }; +/** + * @ingroup ie_transformation_common_api + * @brief CreateAttribute transformation marks OperationType operations by AttributeType attribute. + * + * For more details about the transformation, refer to + * [CreateAttribute](@ref openvino_docs_IE_DG_lpt_CreateAttribute) page in the Inference Engine Developer Guide. + */ template class ngraph::pass::low_precision::CreateAttribute : public ngraph::pass::low_precision::BaseMatcherPass { public: diff --git a/src/common/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp index 2e05cc85761fa4..e157940b12d1ba 100644 --- a/src/common/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/create_precisions_dependent_attribute.hpp @@ -29,6 +29,15 @@ class CreatePrecisionsDependentAttribute; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief CreatePrecisionsDependentAttribute transformation marks OperationType operations by + * PrecisionPreservedAttribute and AttributeType attributes with the same shared part. + * + * For more details about the transformation, refer to + * [CreatePrecisionsDependentAttribute](@ref openvino_docs_IE_DG_lpt_CreatePrecisionsDependentAttribute) page + * in the Inference Engine Developer Guide. 
+ */ template class ngraph::pass::low_precision::CreatePrecisionsDependentAttribute : public ngraph::pass::MatcherPass { public: diff --git a/src/common/low_precision_transformations/include/low_precision/depth_to_space.hpp b/src/common/low_precision_transformations/include/low_precision/depth_to_space.hpp index 5a199454eb6f0e..20e21110f5629c 100644 --- a/src/common/low_precision_transformations/include/low_precision/depth_to_space.hpp +++ b/src/common/low_precision_transformations/include/low_precision/depth_to_space.hpp @@ -10,6 +10,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief DepthToSpaceTransformation propagates dequantization operations through DepthToSpace operation. + * + * For more details about the transformation, refer to + * [DepthToSpaceTransformation](@ref openvino_docs_IE_DG_lpt_DepthToSpaceTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API DepthToSpaceTransformation : public TransparentBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/eltwise_base_transformation.hpp b/src/common/low_precision_transformations/include/low_precision/eltwise_base_transformation.hpp index c648d6efadc4b0..312dd5af31a16c 100644 --- a/src/common/low_precision_transformations/include/low_precision/eltwise_base_transformation.hpp +++ b/src/common/low_precision_transformations/include/low_precision/eltwise_base_transformation.hpp @@ -12,6 +12,10 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief EltwiseBaseTransformation is base class for element-wise LPT transformations. + */ class LP_TRANSFORMATIONS_API EltwiseBaseTransformation : public LayerTransformation { public: EltwiseBaseTransformation(const Params& params) : LayerTransformation(params) {} diff --git a/src/common/low_precision_transformations/include/low_precision/fake_quantize.hpp b/src/common/low_precision_transformations/include/low_precision/fake_quantize.hpp index 6a3f84b6b4cfe7..cb56422246769f 100644 --- a/src/common/low_precision_transformations/include/low_precision/fake_quantize.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fake_quantize.hpp @@ -13,6 +13,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FakeQuantizeTransformation fuses dequantization operations into FakeQuantize operation. + * + * For more details about the transformation, refer to + * [FakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API FakeQuantizeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp b/src/common/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp index 45948ca32b72ad..7123fbe0157b16 100644 --- a/src/common/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fake_quantize_decomposition.hpp @@ -13,6 +13,15 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FakeQuantizeDecompositionTransformation decomposes FakeQuantize operations into quantize + * (FakeQuantize with changed output intervals and a low precision output type) and dequantize operations. + * + * For more details about the transformation, refer to + * [FakeQuantizeDecompositionTransformation](@ref openvino_docs_IE_DG_lpt_FakeQuantizeDecompositionTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API FakeQuantizeDecompositionTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fold_convert.hpp b/src/common/low_precision_transformations/include/low_precision/fold_convert.hpp index 4390b7290e2f60..0c5fd8cf0025fa 100644 --- a/src/common/low_precision_transformations/include/low_precision/fold_convert.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fold_convert.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FoldConvertTransformation evaluates the Convert operation on the Subtract constant subgraph. + * + * For more details about the transformation, refer to + * [FoldConvertTransformation](@ref openvino_docs_IE_DG_lpt_FoldConvertTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API FoldConvertTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp b/src/common/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp index 7f2862fc942288..474fd4dfe8e926 100644 --- a/src/common/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fold_fake_quantize.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FoldFakeQuantizeTransformation evaluates FakeQuantize operations. + * + * For more details about the transformation, refer to + * [FoldFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FoldFakeQuantizeTransformation) page + * in the Inference Engine Developer Guide.
+ */ class LP_TRANSFORMATIONS_API FoldFakeQuantizeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fuse_convert.hpp b/src/common/low_precision_transformations/include/low_precision/fuse_convert.hpp index 4ccc59808ad129..24ee1ee89949ab 100644 --- a/src/common/low_precision_transformations/include/low_precision/fuse_convert.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fuse_convert.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FuseConvertTransformation fuses Convert operation with Multiply, Subtract or Add operations. + * + * For more details about the transformation, refer to + * [FuseConvertTransformation](@ref openvino_docs_IE_DG_lpt_FuseConvertTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API FuseConvertTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp b/src/common/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp index d43aa87441eb29..335eb292be9d5e 100644 --- a/src/common/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fuse_multiply_to_fake_quantize.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FuseMultiplyToFakeQuantizeTransformation fuses Multiply operation to FakeQuantize. + * + * For more details about the transformation, refer to + * [FuseMultiplyToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseMultiplyToFakeQuantizeTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API FuseMultiplyToFakeQuantizeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp b/src/common/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp index 80d6f22f785eff..6b06e1505fcfba 100644 --- a/src/common/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp +++ b/src/common/low_precision_transformations/include/low_precision/fuse_subtract_to_fake_quantize.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief FuseSubtractToFakeQuantizeTransformation fuses Subtract operation to FakeQuantize. + * + * For more details about the transformation, refer to + * [FuseSubtractToFakeQuantizeTransformation](@ref openvino_docs_IE_DG_lpt_FuseSubtractToFakeQuantizeTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API FuseSubtractToFakeQuantizeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/group_convolution.hpp b/src/common/low_precision_transformations/include/low_precision/group_convolution.hpp index b54921faf69a01..2e249fd79475ce 100644 --- a/src/common/low_precision_transformations/include/low_precision/group_convolution.hpp +++ b/src/common/low_precision_transformations/include/low_precision/group_convolution.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief GroupConvolutionTransformation propagates dequantization operations through GroupConvolution operation. + * + * For more details about the transformation, refer to + * [GroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_GroupConvolutionTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API GroupConvolutionTransformation : public ConvolutionTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/interpolate.hpp b/src/common/low_precision_transformations/include/low_precision/interpolate.hpp index 9d454e59542dd8..cfb7d1c3a80c48 100644 --- a/src/common/low_precision_transformations/include/low_precision/interpolate.hpp +++ b/src/common/low_precision_transformations/include/low_precision/interpolate.hpp @@ -10,6 +10,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief InterpolateTransformation propagates dequantization operations through Interpolate operation. + * + * For more details about the transformation, refer to + * [InterpolateTransformation](@ref openvino_docs_IE_DG_lpt_InterpolateTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API InterpolateTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/layer_transformation.hpp b/src/common/low_precision_transformations/include/low_precision/layer_transformation.hpp index 7befc214a7d893..f7162d8c6fd78a 100644 --- a/src/common/low_precision_transformations/include/low_precision/layer_transformation.hpp +++ b/src/common/low_precision_transformations/include/low_precision/layer_transformation.hpp @@ -225,7 +225,10 @@ inline std::ostream &operator << (std::ostream &os, const DataPrecision& value) return os; } -// Base class for all LP transformations, holds some common data structures +/** + * @ingroup ie_transformation_common_api + * @brief Base class for low precision transformation. 
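+ *
+ * A minimal sketch of how concrete transformations typically build on this base class
+ * (MyOpTransformation is a made-up name and the exact overridden signatures are assumptions,
+ * shown only to illustrate the common pattern):
+ * @code
+ * class MyOpTransformation : public LayerTransformation {
+ * public:
+ *     MyOpTransformation(const Params& params);  // registers a matcher for the target operation
+ *     bool transform(TransformationContext& context, ngraph::pattern::Matcher& m) override;
+ *     bool canBeTransformed(const TransformationContext& context,
+ *                           std::shared_ptr<ngraph::Node> op) const override;
+ *     bool isPrecisionPreserved(std::shared_ptr<ngraph::Node> op) const noexcept override;
+ * };
+ * @endcode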
+ */ class LP_TRANSFORMATIONS_API LayerTransformation : public ngraph::pass::MatcherPass { static std::vector defaultPrecisions; static std::mutex defaultPrecisionsMutex; diff --git a/src/common/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp b/src/common/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp index eaa9e7878c907d..d2d3f6d75c6f2c 100644 --- a/src/common/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp +++ b/src/common/low_precision_transformations/include/low_precision/markup_avg_pool_precision_preserved.hpp @@ -18,6 +18,14 @@ class LP_TRANSFORMATIONS_API MarkupAvgPoolPrecisionPreserved; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief MarkupAvgPoolPrecisionPreserved transformation marks AvgPool operations as precision preserved or not. + * + * For more details about the transformation, refer to + * [MarkupAvgPoolPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_MarkupAvgPoolPrecisionPreserved) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::MarkupAvgPoolPrecisionPreserved : public ngraph::pass::FunctionPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/markup_can_be_quantized.hpp b/src/common/low_precision_transformations/include/low_precision/markup_can_be_quantized.hpp index 7e11d856e97464..81885274cb1e1c 100644 --- a/src/common/low_precision_transformations/include/low_precision/markup_can_be_quantized.hpp +++ b/src/common/low_precision_transformations/include/low_precision/markup_can_be_quantized.hpp @@ -18,6 +18,16 @@ class LP_TRANSFORMATIONS_API MarkupCanBeQuantized; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief MarkupCanBeQuantized transformation marks Convolution, ConvolutionBackpropData, GroupConvolution and Concat + * operations as able to be quantized or not. If an operation cannot be quantized, then a PrecisionsAttribute attribute instance + * is created with empty precisions. + * + * For more details about the transformation, refer to + * [MarkupCanBeQuantized](@ref openvino_docs_IE_DG_lpt_MarkupCanBeQuantized) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::MarkupCanBeQuantized : public ngraph::pass::FunctionPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp b/src/common/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp index 5cdbe43d018121..fda9a25030d4f1 100644 --- a/src/common/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp +++ b/src/common/low_precision_transformations/include/low_precision/markup_per_tensor_quantization.hpp @@ -22,6 +22,15 @@ class LP_TRANSFORMATIONS_API MarkupPerTensorQuantization; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief MarkupPerTensorQuantization transformation marks operations that require per-tensor quantization according to the + * provided restrictions. + * + * For more details about the transformation, refer to + * [MarkupPerTensorQuantization](@ref openvino_docs_IE_DG_lpt_MarkupPerTensorQuantization) page + * in the Inference Engine Developer Guide.
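+ *
+ * Hedged configuration sketch (the restriction helper and the constructor argument shown here are
+ * assumptions about how the pass is parameterized, not taken from this change):
+ * @code
+ * const std::vector<OperationPerTensorQuantizationRestriction> restrictions = {
+ *     OperationPerTensorQuantizationRestriction::create<ngraph::opset1::ConvolutionBackpropData>({0})
+ * };
+ * manager.register_pass<ngraph::pass::low_precision::MarkupPerTensorQuantization>(restrictions);
+ * @endcode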
+ */ class ngraph::pass::low_precision::MarkupPerTensorQuantization : public ngraph::pass::FunctionPass { public: class PerTensorQuantization { diff --git a/src/common/low_precision_transformations/include/low_precision/markup_precisions.hpp b/src/common/low_precision_transformations/include/low_precision/markup_precisions.hpp index 4e5b484f7c4692..87c9a0d0563603 100644 --- a/src/common/low_precision_transformations/include/low_precision/markup_precisions.hpp +++ b/src/common/low_precision_transformations/include/low_precision/markup_precisions.hpp @@ -23,6 +23,17 @@ class LP_TRANSFORMATIONS_API MarkupPrecisions; } // namespace ngraph // Transformation is used to add customization options runtime +/** + * @ingroup ie_transformation_common_api + * @brief MarkupPrecisions transformation marks: + * 1) not supported operations by PrecisionsAttribute attribute with empty precisions, + * 2) operations with required precisions by PrecisionsAttribute attribute according to the provided restrictions, + * 3) precision preserved operations by PrecisionPreservedAttribute attribute. + * + * For more details about the transformation, refer to + * [MarkupPrecisions](@ref openvino_docs_IE_DG_lpt_MarkupPrecisions) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::MarkupPrecisions : public ngraph::pass::FunctionPass { public: class Restriction { diff --git a/src/common/low_precision_transformations/include/low_precision/mat_mul.hpp b/src/common/low_precision_transformations/include/low_precision/mat_mul.hpp index 067f82ea59b28b..a97e896bd30673 100644 --- a/src/common/low_precision_transformations/include/low_precision/mat_mul.hpp +++ b/src/common/low_precision_transformations/include/low_precision/mat_mul.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief MatMulTransformation propagates dequantization operations through MatMul operation. + * + * For more details about the transformation, refer to + * [MatMulTransformation](@ref openvino_docs_IE_DG_lpt_MatMulTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API MatMulTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/max_pool.hpp b/src/common/low_precision_transformations/include/low_precision/max_pool.hpp index ca2b8a08272817..dcea90fca82596 100644 --- a/src/common/low_precision_transformations/include/low_precision/max_pool.hpp +++ b/src/common/low_precision_transformations/include/low_precision/max_pool.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief MaxPoolTransformation propagates dequantization operations through MaxPool operation. + * + * For more details about the transformation, refer to + * [MaxPoolTransformation](@ref openvino_docs_IE_DG_lpt_MaxPoolTransformation) page + * in the Inference Engine Developer Guide. 
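+ *
+ * Illustrative before/after sketch of the dequantization propagation (shapes and constant
+ * values are omitted; shown only to clarify what "propagates through" means here):
+ * @code
+ * // before: u8 input -> Convert -> Subtract -> Multiply -> MaxPool -> ...
+ * // after:  u8 input -> MaxPool -> Convert -> Subtract -> Multiply -> ...
+ * // MaxPool now executes in low precision; the dequantization chain follows it unchanged.
+ * @endcode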
+ */ class LP_TRANSFORMATIONS_API MaxPoolTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/multiply.hpp b/src/common/low_precision_transformations/include/low_precision/multiply.hpp index fee17230569c7b..aeec4e8b9d57c9 100644 --- a/src/common/low_precision_transformations/include/low_precision/multiply.hpp +++ b/src/common/low_precision_transformations/include/low_precision/multiply.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief MultiplyTransformation propagates dequantization operations through Multiply operation. + * + * For more details about the transformation, refer to + * [MultiplyTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API MultiplyTransformation : public EltwiseBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp b/src/common/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp index 5e6bd900d8ea9e..eb0122d390d34e 100644 --- a/src/common/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp +++ b/src/common/low_precision_transformations/include/low_precision/multiply_to_group_convolution.hpp @@ -13,6 +13,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief MultiplyToGroupConvolutionTransformation replaces quantized Multiply operations with GroupConvolution operations to speed up inference. + * + * For more details about the transformation, refer to + * [MultiplyToGroupConvolutionTransformation](@ref openvino_docs_IE_DG_lpt_MultiplyToGroupConvolutionTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API MultiplyToGroupConvolutionTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/mvn.hpp b/src/common/low_precision_transformations/include/low_precision/mvn.hpp index 42ddd6f0b620a1..a853ccf89118d6 100644 --- a/src/common/low_precision_transformations/include/low_precision/mvn.hpp +++ b/src/common/low_precision_transformations/include/low_precision/mvn.hpp @@ -10,6 +10,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief MVNTransformation propagates dequantization operations through MVN operation. + * + * For more details about the transformation, refer to + * [MVNTransformation](@ref openvino_docs_IE_DG_lpt_MVNTransformation) page + * in the Inference Engine Developer Guide.
+ */ class LP_TRANSFORMATIONS_API MVNTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/normalize_l2.hpp b/src/common/low_precision_transformations/include/low_precision/normalize_l2.hpp index 88a113cb38a49d..28250fadd21f00 100644 --- a/src/common/low_precision_transformations/include/low_precision/normalize_l2.hpp +++ b/src/common/low_precision_transformations/include/low_precision/normalize_l2.hpp @@ -10,6 +10,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief NormalizeL2Transformation propagates dequantization operations through NormalizeL2 operation. + * + * For more details about the transformation, refer to + * [NormalizeL2Transformation](@ref openvino_docs_IE_DG_lpt_NormalizeL2Transformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API NormalizeL2Transformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/pad.hpp b/src/common/low_precision_transformations/include/low_precision/pad.hpp index 66691f3871ac0a..ce01ca32b5d431 100644 --- a/src/common/low_precision_transformations/include/low_precision/pad.hpp +++ b/src/common/low_precision_transformations/include/low_precision/pad.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief PadTransformation propagates dequantization operations through Pad operation. + * + * For more details about the transformation, refer to + * [PadTransformation](@ref openvino_docs_IE_DG_lpt_PadTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API PadTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/prelu.hpp b/src/common/low_precision_transformations/include/low_precision/prelu.hpp index e58d4b25615752..e93d70a9078f40 100644 --- a/src/common/low_precision_transformations/include/low_precision/prelu.hpp +++ b/src/common/low_precision_transformations/include/low_precision/prelu.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief PReluTransformation propagates dequantization operations through PRelu operation. + * + * For more details about the transformation, refer to + * [PReluTransformation](@ref openvino_docs_IE_DG_lpt_PReluTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API PReluTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/propagate_precisions.hpp b/src/common/low_precision_transformations/include/low_precision/propagate_precisions.hpp index 5ed4f929026ce7..57e8eb07da3195 100644 --- a/src/common/low_precision_transformations/include/low_precision/propagate_precisions.hpp +++ b/src/common/low_precision_transformations/include/low_precision/propagate_precisions.hpp @@ -22,6 +22,14 @@ class LP_TRANSFORMATIONS_API PropagatePrecisions; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PropagatePrecisions transformation propagates PrecisionsAttribute attribute instances through precision preserved operations.
+ * + * For more details about the transformation, refer to + * [PropagatePrecisions](@ref openvino_docs_IE_DG_lpt_PropagatePrecisions) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::PropagatePrecisions : public ngraph::pass::FunctionPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/propagate_shared_value.hpp b/src/common/low_precision_transformations/include/low_precision/propagate_shared_value.hpp index 3f05c0b3bf2929..2049d062a535bb 100644 --- a/src/common/low_precision_transformations/include/low_precision/propagate_shared_value.hpp +++ b/src/common/low_precision_transformations/include/low_precision/propagate_shared_value.hpp @@ -27,6 +27,15 @@ class LP_TRANSFORMATIONS_API PropagateSharedValue; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PropagateSharedValue transformation propagates shared value AttributeType attribute instances + * through precision preserved operations. + * + * For more details about the transformation, refer to + * [PropagateSharedValue](@ref openvino_docs_IE_DG_lpt_PropagateSharedValue) page + * in the Inference Engine Developer Guide. + */ template class ngraph::pass::low_precision::PropagateSharedValue : public ngraph::pass::FunctionPass { public: diff --git a/src/common/low_precision_transformations/include/low_precision/propagate_through_precision_preserved.hpp b/src/common/low_precision_transformations/include/low_precision/propagate_through_precision_preserved.hpp index 844e23bfb95ecc..cf2512e0a523ae 100644 --- a/src/common/low_precision_transformations/include/low_precision/propagate_through_precision_preserved.hpp +++ b/src/common/low_precision_transformations/include/low_precision/propagate_through_precision_preserved.hpp @@ -27,6 +27,15 @@ class PropagateThroughPrecisionPreserved; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PropagateThroughPrecisionPreserved transformation propagates AttributeType attribute instances + * through precision preserved operations. + * + * For more details about the transformation, refer to + * [PropagateThroughPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_PropagateThroughPrecisionPreserved) page + * in the Inference Engine Developer Guide. + */ template class ngraph::pass::low_precision::PropagateThroughPrecisionPreserved : public ngraph::pass::MatcherPass { public: diff --git a/src/common/low_precision_transformations/include/low_precision/propagate_to_input.hpp b/src/common/low_precision_transformations/include/low_precision/propagate_to_input.hpp index 42f840c1573b8d..7bc661f292ae13 100644 --- a/src/common/low_precision_transformations/include/low_precision/propagate_to_input.hpp +++ b/src/common/low_precision_transformations/include/low_precision/propagate_to_input.hpp @@ -26,6 +26,15 @@ class PropagateToInput; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PropagateToInput transformation propagates AttributeType shared value attribute instances + * from parent output ports to consumers input ports. + * + * For more details about the transformation, refer to + * [PropagateToInput](@ref openvino_docs_IE_DG_lpt_PropagateToInput) page + * in the Inference Engine Developer Guide. 
+ */ template class ngraph::pass::low_precision::PropagateToInput : public ngraph::pass::MatcherPass { public: diff --git a/src/common/low_precision_transformations/include/low_precision/pull_reshape_through_dequantization.hpp b/src/common/low_precision_transformations/include/low_precision/pull_reshape_through_dequantization.hpp index e8bc2add659a39..4a872e257b9802 100644 --- a/src/common/low_precision_transformations/include/low_precision/pull_reshape_through_dequantization.hpp +++ b/src/common/low_precision_transformations/include/low_precision/pull_reshape_through_dequantization.hpp @@ -19,6 +19,15 @@ class LP_TRANSFORMATIONS_API PullReshapeThroughDequantization; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PullReshapeThroughDequantization propagates dequantization operations through Reshape operations. + * The transformation is used on constant subgraph weights to prepare a model for the next low precision transformations. + * + * For more details about the transformation, refer to + * [PullReshapeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullReshapeThroughDequantization) page + * in the Inference Engine Developer Guide. + */ class ngraph::pass::low_precision::PullReshapeThroughDequantization : public ngraph::pass::MatcherPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/pull_transpose_through_dequantization.hpp b/src/common/low_precision_transformations/include/low_precision/pull_transpose_through_dequantization.hpp index f9d957389e6e5a..973ec50e3c0802 100644 --- a/src/common/low_precision_transformations/include/low_precision/pull_transpose_through_dequantization.hpp +++ b/src/common/low_precision_transformations/include/low_precision/pull_transpose_through_dequantization.hpp @@ -19,6 +19,15 @@ class LP_TRANSFORMATIONS_API PullTransposeThroughDequantization; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief PullTransposeThroughDequantization propagates dequantization operations through Transpose operations. + * The transformation is used on constant subgraph weights to prepare a model for the next low precision transformations. + * + * For more details about the transformation, refer to + * [PullTransposeThroughDequantization](@ref openvino_docs_IE_DG_lpt_PullTransposeThroughDequantization) page + * in the Inference Engine Developer Guide. 
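+ *
+ * Illustrative before/after sketch for the weights subgraph (operation names only; the constants
+ * are assumed to be foldable):
+ * @code
+ * // before: Constant -> Convert -> Subtract -> Multiply -> Transpose -> Convolution (weights input)
+ * // after:  transposed Constant -> Convert -> Subtract -> Multiply -> Convolution (weights input)
+ * // The Transpose is pulled up to the constants and folded, keeping dequantization next to the weights.
+ * @endcode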
+ */ class ngraph::pass::low_precision::PullTransposeThroughDequantization : public ngraph::pass::MatcherPass { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/reduce_base_transformation.hpp b/src/common/low_precision_transformations/include/low_precision/reduce_base_transformation.hpp index 0b9782e4eb207a..26c5eb340db074 100644 --- a/src/common/low_precision_transformations/include/low_precision/reduce_base_transformation.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reduce_base_transformation.hpp @@ -13,11 +13,11 @@ namespace pass { namespace low_precision { /** -* @brief ReduceBaseTransformation: base class for Reduce*Transformation -* detects dequantization operations in front of the Reduce* layer and -* propagates them through the Reduce* if possible -* -*/ + * @ingroup ie_transformation_common_api + * @brief ReduceBaseTransformation: base class for Reduce*Transformation, + * detects dequantization operations in front of the Reduce* operation and + * propagates them through the Reduce* if possible. + */ class LP_TRANSFORMATIONS_API ReduceBaseTransformation : public LayerTransformation { public: diff --git a/src/common/low_precision_transformations/include/low_precision/reduce_max.hpp b/src/common/low_precision_transformations/include/low_precision/reduce_max.hpp index b9c2b98253ef82..a94d6937313a7b 100644 --- a/src/common/low_precision_transformations/include/low_precision/reduce_max.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reduce_max.hpp @@ -14,6 +14,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReduceMaxTransformation propagates dequantization operations through ReduceMax operation. + * + * For more details about the transformation, refer to + * [ReduceMaxTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMaxTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ReduceMaxTransformation : public ReduceBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/reduce_mean.hpp b/src/common/low_precision_transformations/include/low_precision/reduce_mean.hpp index 31f542a37548b2..fd2e8cb1e69856 100644 --- a/src/common/low_precision_transformations/include/low_precision/reduce_mean.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reduce_mean.hpp @@ -14,6 +14,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReduceMeanTransformation propagates dequantization operations through ReduceMean operation. + * + * For more details about the transformation, refer to + * [ReduceMeanTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMeanTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API ReduceMeanTransformation : public ReduceBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/reduce_min.hpp b/src/common/low_precision_transformations/include/low_precision/reduce_min.hpp index e4ccdeab97e74a..fa203fec71c700 100644 --- a/src/common/low_precision_transformations/include/low_precision/reduce_min.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reduce_min.hpp @@ -14,6 +14,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReduceMinTransformation propagates dequantization operations through ReduceMin operation. + * + * For more details about the transformation, refer to + * [ReduceMinTransformation](@ref openvino_docs_IE_DG_lpt_ReduceMinTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ReduceMinTransformation : public ReduceBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/reduce_sum.hpp b/src/common/low_precision_transformations/include/low_precision/reduce_sum.hpp index 82e8dd2888321f..ac37fa47ca6523 100644 --- a/src/common/low_precision_transformations/include/low_precision/reduce_sum.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reduce_sum.hpp @@ -14,6 +14,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReduceSumTransformation propagates dequantization operations through ReduceSum operation. + * + * For more details about the transformation, refer to + * [ReduceSumTransformation](@ref openvino_docs_IE_DG_lpt_ReduceSumTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ReduceSumTransformation : public ReduceBaseTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/relu.hpp b/src/common/low_precision_transformations/include/low_precision/relu.hpp index 1f7489a73d8337..fdca5d5cafa818 100644 --- a/src/common/low_precision_transformations/include/low_precision/relu.hpp +++ b/src/common/low_precision_transformations/include/low_precision/relu.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReluTransformation propagates dequantization operations through Relu operation. + * + * For more details about the transformation, refer to + * [ReluTransformation](@ref openvino_docs_IE_DG_lpt_ReluTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ReluTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/reshape.hpp b/src/common/low_precision_transformations/include/low_precision/reshape.hpp index cb1b3a28456f03..53b904d87095e0 100644 --- a/src/common/low_precision_transformations/include/low_precision/reshape.hpp +++ b/src/common/low_precision_transformations/include/low_precision/reshape.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ReshapeTransformation propagates dequantization operations through Reshape operation. 
+ * + * For more details about the transformation, refer to + * [ReshapeTransformation](@ref openvino_docs_IE_DG_lpt_ReshapeTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API ReshapeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/avg_pool_precision_preserved_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/avg_pool_precision_preserved_attribute.hpp index 9da87cba5cef15..2e2a83cd2610ea 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/avg_pool_precision_preserved_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/avg_pool_precision_preserved_attribute.hpp @@ -13,6 +13,15 @@ #include "low_precision/rt_info/precision_preserved_attribute.hpp" namespace ngraph { + +/** + * @ingroup ie_transformation_common_api + * @brief AvgPoolPrecisionPreservedAttribute is utility attribute which is used only during `AvgPool` operation precision + * preserved property definition. + * + * For more details about the attribute, refer to + * [AvgPoolPrecisionPreservedAttribute](@ref openvino_docs_IE_DG_lpt_AvgPoolPrecisionPreserved) page in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API AvgPoolPrecisionPreservedAttribute : public PrecisionPreservedAttribute { public: OPENVINO_RTTI("LowPrecision::AvgPoolPrecisionPreserved", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/intervals_alignment_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/intervals_alignment_attribute.hpp index af8664110d30f3..f24593dfbb91fb 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/intervals_alignment_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/intervals_alignment_attribute.hpp @@ -15,6 +15,10 @@ #include "low_precision/lpt_visibility.hpp" namespace ngraph { +/** + * @ingroup ie_transformation_common_api + * @brief IntervalsAlignmentSharedValue is used by IntervalsAlignmentAttribute as attribute shared value. + */ class LP_TRANSFORMATIONS_API IntervalsAlignmentSharedValue { public: class Interval { @@ -45,6 +49,14 @@ class LP_TRANSFORMATIONS_API IntervalsAlignmentSharedValue { #endif }; +/** + * @ingroup ie_transformation_common_api + * @brief IntervalsAlignmentAttribute defines subgraph with the same quantization intervals alignment. + * FakeQuantize operations are included. The attribute is used by quantization operations. + * + * For more details about the attribute, refer to + * [IntervalsAlignmentAttribute](@ref openvino_docs_IE_DG_lpt_IntervalsAlignment) page in the Inference Engine Developer Guide. 
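+ *
+ * Illustrative example of the alignment this attribute describes (interval values are invented):
+ * @code
+ * // FakeQuantize A, interval [0, 2.55] --\
+ * //                                       Concat: both branches share one IntervalsAlignment
+ * // FakeQuantize B, interval [0, 5.1]  --/ instance holding the combined interval [0, 5.1]
+ * @endcode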
+ */ class LP_TRANSFORMATIONS_API IntervalsAlignmentAttribute : public SharedAttribute { public: OPENVINO_RTTI("LowPrecision::IntervalsAlignment", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/per_tensor_quantization_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/per_tensor_quantization_attribute.hpp index 9a991dbf684c87..bef9a3157fd3c8 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/per_tensor_quantization_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/per_tensor_quantization_attribute.hpp @@ -14,6 +14,13 @@ #include "attribute_parameters.hpp" namespace ngraph { +/** + * @ingroup ie_transformation_common_api + * @brief PerTensorQuantizationAttribute defines if operation input port requires per-tensor quantization. + * + * For more details about the attribute, refer to + * [PerTensorQuantizationAttribute](@ref openvino_docs_IE_DG_lpt_PerTensorQuantization) page in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API PerTensorQuantizationAttribute : public ov::RuntimeAttribute { public: OPENVINO_RTTI("LowPrecision::PerTensorQuantization", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/precision_preserved_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/precision_preserved_attribute.hpp index 3a82a18d979908..3f2bbec9977f9f 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/precision_preserved_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/precision_preserved_attribute.hpp @@ -13,6 +13,14 @@ #include "low_precision/rt_info/shared_value_attribute.hpp" namespace ngraph { +/** + * @ingroup ie_transformation_common_api + * @brief PrecisionPreservedAttribute defines the precision preserved operation. If the attribute is absent, then an operation is + * not precision preserved. + * + * For more details about the attribute, refer to + * [PrecisionPreservedAttribute](@ref openvino_docs_IE_DG_lpt_PrecisionPreserved) page in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API PrecisionPreservedAttribute : public SharedAttribute { public: OPENVINO_RTTI("LowPrecision::PrecisionPreserved", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/precisions_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/precisions_attribute.hpp index 4945c4e06405ed..264494427d6401 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/precisions_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/precisions_attribute.hpp @@ -18,7 +18,13 @@ #include "low_precision/rt_info/shared_value_attribute.hpp" namespace ngraph { - +/** + * @ingroup ie_transformation_common_api + * @brief PrecisionsAttribute defines precision which is required for input/output port or an operation. + * + * For more details about the attribute, refer to + * [PrecisionsAttribute](@ref openvino_docs_IE_DG_lpt_Precisions) page in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API PrecisionsAttribute : public SharedAttribute> { public: OPENVINO_RTTI("LowPrecision::Precisions", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/quantization_alignment_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/quantization_alignment_attribute.hpp index d8a09ccdc54389..96f5b9ec02bd24 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/quantization_alignment_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/quantization_alignment_attribute.hpp @@ -18,6 +18,14 @@ #include "attribute_parameters.hpp" namespace ngraph { +/** + * @ingroup ie_transformation_common_api + * @brief QuantizationAlignmentAttribute defines subgraph with the same quantization alignment. + * FakeQuantize operations are not included. The attribute is used by quantization operations. + * + * For more details about the attribute, refer to + * [QuantizationAlignmentAttribute](@ref openvino_docs_IE_DG_lpt_QuantizationAlignment) page in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API QuantizationAlignmentAttribute : public SharedAttribute { public: OPENVINO_RTTI("LowPrecision::QuantizationAlignment", "", ov::RuntimeAttribute, 0); diff --git a/src/common/low_precision_transformations/include/low_precision/rt_info/shared_value_attribute.hpp b/src/common/low_precision_transformations/include/low_precision/rt_info/shared_value_attribute.hpp index 5b922c07b5cd44..5f829d132b7f3e 100644 --- a/src/common/low_precision_transformations/include/low_precision/rt_info/shared_value_attribute.hpp +++ b/src/common/low_precision_transformations/include/low_precision/rt_info/shared_value_attribute.hpp @@ -18,6 +18,12 @@ template class LP_TRANSFORMATIONS_API SharedAttribute : public ov::RuntimeAttribute { public: virtual ~SharedAttribute() = default; + + /** + * @ingroup ie_transformation_common_api + * @brief SharedValueAttribute type for shared value attributes. + * The attribute is used for attribute SharedValue value backward propagation. + */ class LP_TRANSFORMATIONS_API SharedValueAttribute : public std::enable_shared_from_this { public: struct LP_TRANSFORMATIONS_API SharedValue : public std::enable_shared_from_this { diff --git a/src/common/low_precision_transformations/include/low_precision/shuffle_channels.hpp b/src/common/low_precision_transformations/include/low_precision/shuffle_channels.hpp index ab28d754598e67..f5dd05fc8bce8a 100644 --- a/src/common/low_precision_transformations/include/low_precision/shuffle_channels.hpp +++ b/src/common/low_precision_transformations/include/low_precision/shuffle_channels.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief ShuffleChannelsTransformation propagates dequantization operations through ShuffleChannels operation. + * + * For more details about the transformation, refer to + * [ShuffleChannelsTransformation](@ref openvino_docs_IE_DG_lpt_ShuffleChannelsTransformation) page + * in the Inference Engine Developer Guide. 
+ */ class LP_TRANSFORMATIONS_API ShuffleChannelsTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/split.hpp b/src/common/low_precision_transformations/include/low_precision/split.hpp index d4f2c72b8beb7b..e85b5ed2dde8ab 100644 --- a/src/common/low_precision_transformations/include/low_precision/split.hpp +++ b/src/common/low_precision_transformations/include/low_precision/split.hpp @@ -13,6 +13,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief SplitTransformation propagates dequantization operations through Split operation. + * + * For more details about the transformation, refer to + * [SplitTransformation](@ref openvino_docs_IE_DG_lpt_SplitTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API SplitTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/squeeze.hpp b/src/common/low_precision_transformations/include/low_precision/squeeze.hpp index fab050564c8bc0..2bac4300c14ab8 100644 --- a/src/common/low_precision_transformations/include/low_precision/squeeze.hpp +++ b/src/common/low_precision_transformations/include/low_precision/squeeze.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief SqueezeTransformation propagates dequantization operations through Squeeze operation. + * + * For more details about the transformation, refer to + * [SqueezeTransformation](@ref openvino_docs_IE_DG_lpt_SqueezeTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API SqueezeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/strided_slice.hpp b/src/common/low_precision_transformations/include/low_precision/strided_slice.hpp index 5a0520f54ae9b1..cf7bc52f4086ef 100644 --- a/src/common/low_precision_transformations/include/low_precision/strided_slice.hpp +++ b/src/common/low_precision_transformations/include/low_precision/strided_slice.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief StridedSliceTransformation propagates dequantization operations through StridedSlice operation. + * + * For more details about the transformation, refer to + * [StridedSliceTransformation](@ref openvino_docs_IE_DG_lpt_StridedSliceTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API StridedSliceTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/subtract.hpp b/src/common/low_precision_transformations/include/low_precision/subtract.hpp index 56c66d9945040b..4d15b62c6c27d0 100644 --- a/src/common/low_precision_transformations/include/low_precision/subtract.hpp +++ b/src/common/low_precision_transformations/include/low_precision/subtract.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief SubtractTransformation propagates dequantization operations through Subtract operation. 
+ * + * For more details about the transformation, refer to + * [SubtractTransformation](@ref openvino_docs_IE_DG_lpt_SubtractTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API SubtractTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/transformation_context.hpp b/src/common/low_precision_transformations/include/low_precision/transformation_context.hpp index 1aad5e55bd648e..9a2473a71f67b3 100644 --- a/src/common/low_precision_transformations/include/low_precision/transformation_context.hpp +++ b/src/common/low_precision_transformations/include/low_precision/transformation_context.hpp @@ -13,6 +13,10 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief TransformationContext instance is used to pass model transformation context data between transformations. + */ class LP_TRANSFORMATIONS_API TransformationContext { public: TransformationContext(); diff --git a/src/common/low_precision_transformations/include/low_precision/transparent_base_transformation.hpp b/src/common/low_precision_transformations/include/low_precision/transparent_base_transformation.hpp index d1f87f92f862b3..b9a3454b4b7476 100644 --- a/src/common/low_precision_transformations/include/low_precision/transparent_base_transformation.hpp +++ b/src/common/low_precision_transformations/include/low_precision/transparent_base_transformation.hpp @@ -12,6 +12,10 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief TransparentBaseTransformation is base type for precision preserved operation transformation. + */ class LP_TRANSFORMATIONS_API TransparentBaseTransformation : public LayerTransformation { public: TransparentBaseTransformation(const Params& params) : LayerTransformation(params) {} diff --git a/src/common/low_precision_transformations/include/low_precision/transpose.hpp b/src/common/low_precision_transformations/include/low_precision/transpose.hpp index d22fcc8ed8cf36..f9eadc075782f8 100644 --- a/src/common/low_precision_transformations/include/low_precision/transpose.hpp +++ b/src/common/low_precision_transformations/include/low_precision/transpose.hpp @@ -12,6 +12,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief TransposeTransformation propagates dequantization operations through Transpose operation. + * + * For more details about the transformation, refer to + * [TransposeTransformation](@ref openvino_docs_IE_DG_lpt_TransposeTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API TransposeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/unsqueeze.hpp b/src/common/low_precision_transformations/include/low_precision/unsqueeze.hpp index 580c09ad80bcce..92e4e2671e05e8 100644 --- a/src/common/low_precision_transformations/include/low_precision/unsqueeze.hpp +++ b/src/common/low_precision_transformations/include/low_precision/unsqueeze.hpp @@ -11,6 +11,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief UnsqueezeTransformation propagates dequantization operations through Unsqueeze operation. 
+ * + * For more details about the transformation, refer to + * [UnsqueezeTransformation](@ref openvino_docs_IE_DG_lpt_UnsqueezeTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API UnsqueezeTransformation : public LayerTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/update_shared_precision_preserved.hpp b/src/common/low_precision_transformations/include/low_precision/update_shared_precision_preserved.hpp index 42745f8f793299..e7899731d9a9da 100644 --- a/src/common/low_precision_transformations/include/low_precision/update_shared_precision_preserved.hpp +++ b/src/common/low_precision_transformations/include/low_precision/update_shared_precision_preserved.hpp @@ -25,6 +25,15 @@ class UpdateSharedPrecisionPreserved; } // namespace pass } // namespace ngraph +/** + * @ingroup ie_transformation_common_api + * @brief UpdateSharedPrecisionPreserved transformation updates shared AttributeType attribute instance value to true + * for precision preserved operations if ExpectedAttributeType exist. + * + * For more details about the transformation, refer to + * [UpdateSharedPrecisionPreserved](@ref openvino_docs_IE_DG_lpt_UpdateSharedPrecisionPreserved) page + * in the Inference Engine Developer Guide. + */ template class ngraph::pass::low_precision::UpdateSharedPrecisionPreserved : public ngraph::pass::MatcherPass { public: @@ -76,7 +85,7 @@ class ngraph::pass::low_precision::UpdateSharedPrecisionPreserved : public ngrap return true; }; - auto matcher = std::make_shared(pattern::any_input(), "PropagateThroughPrecisionPreserved"); + auto matcher = std::make_shared(pattern::any_input(), "UpdateSharedPrecisionPreserved"); this->register_matcher(matcher, callback); } diff --git a/src/common/low_precision_transformations/include/low_precision/variadic_split.hpp b/src/common/low_precision_transformations/include/low_precision/variadic_split.hpp index 014b3775fe75b8..2b45d001023c87 100644 --- a/src/common/low_precision_transformations/include/low_precision/variadic_split.hpp +++ b/src/common/low_precision_transformations/include/low_precision/variadic_split.hpp @@ -13,6 +13,14 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief VariadicSplitTransformation propagates dequantization operations through VariadicSplit operation. + * + * For more details about the transformation, refer to + * [VariadicSplitTransformation](@ref openvino_docs_IE_DG_lpt_VariadicSplitTransformation) page + * in the Inference Engine Developer Guide. + */ class LP_TRANSFORMATIONS_API VariadicSplitTransformation : public SplitTransformation { public: NGRAPH_RTTI_DECLARATION; diff --git a/src/common/low_precision_transformations/include/low_precision/weightable_layer_transformation.hpp b/src/common/low_precision_transformations/include/low_precision/weightable_layer_transformation.hpp index e045190aae4658..bee3137907b555 100644 --- a/src/common/low_precision_transformations/include/low_precision/weightable_layer_transformation.hpp +++ b/src/common/low_precision_transformations/include/low_precision/weightable_layer_transformation.hpp @@ -13,6 +13,10 @@ namespace ngraph { namespace pass { namespace low_precision { +/** + * @ingroup ie_transformation_common_api + * @brief WeightableLayerTransformation is base type for weightable operation transformation. 
+ */ class LP_TRANSFORMATIONS_API WeightableLayerTransformation : public LayerTransformation{ public: WeightableLayerTransformation(const Params& params); diff --git a/src/common/transformations/include/transformations/common_optimizations/lin_op_sequence_fusion.hpp b/src/common/transformations/include/transformations/common_optimizations/lin_op_sequence_fusion.hpp index 9f7f1587d1125f..eaf9d1c846efd1 100644 --- a/src/common/transformations/include/transformations/common_optimizations/lin_op_sequence_fusion.hpp +++ b/src/common/transformations/include/transformations/common_optimizations/lin_op_sequence_fusion.hpp @@ -39,6 +39,10 @@ class ngraph::pass::MultiplyMultiplyFusion: public ngraph::pass::MatcherPass { MultiplyMultiplyFusion(); }; +/** + * @ingroup ie_transformation_common_api + * @brief LinOpSequenceFusion transformation fuses linear operation sequence. + */ class ngraph::pass::LinOpSequenceFusion: public ngraph::pass::GraphRewrite { public: NGRAPH_RTTI_DECLARATION;