OpenVINO™ Model Server 2022.3
The 2022.3 version is a major release. It includes several new features, enhancements and bug fixes.
New Features
Import TensorFlow Models – preview feature
OpenVINO Model Server can now load TensorFlow models directly from the model repository. Converting them to OpenVINO Intermediate Representation (IR) format with the Model Optimizer is not required. This is a preview feature with several limitations: the model must be in the frozen graph format with a .pb extension. Loaded models take advantage of all OpenVINO optimizations. Learn more about it and check this demo.
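As a sketch, a frozen graph placed in the standard model repository layout might look like this (the model name, version directory, and file name below are examples):

models/
└── my_tf_model/          (example model name)
    └── 1/                (model version directory)
        └── model.pb      (frozen TensorFlow graph, .pb extension)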
C API interface to the model server internal functions – preview feature
It is now possible to leverage the model management functionality of OpenVINO Model Server for local inference execution within an application. Simply link the OVMS shared library dynamically to take advantage of its new C API and use internal model server functions in C/C++ applications. To learn more, see the documentation and check this demo.
Extended KServe gRPC API
The KServe gRPC API implemented in OpenVINO Model Server has been extended to support both input and output in the form of tensor data as well as raw binary data. The output format is consistent with the input format. This extension makes it possible to use the Triton Client library with OpenVINO Model Server to send inference requests. The input data can be prepared as vectors or encoded as JPEG/PNG and sent as bytes. Learn more about the current API and check the Python and C++ samples.
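As an illustration, a minimal Python request using the Triton client library could look like the sketch below. The endpoint address, model name, and tensor names are placeholders for a concrete deployment:

import numpy as np
import tritonclient.grpc as grpcclient

# Connect to the model server gRPC endpoint (address is an example).
client = grpcclient.InferenceServerClient("localhost:9000")

# Prepare the input tensor from a numpy array.
data = np.zeros((1, 3, 224, 224), dtype=np.float32)
inputs = [grpcclient.InferInput("input_tensor", list(data.shape), "FP32")]
inputs[0].set_data_from_numpy(data)

# Request a specific output and run inference over the KServe gRPC API.
outputs = [grpcclient.InferRequestedOutput("output_tensor")]
result = client.infer(model_name="my_model", inputs=inputs, outputs=outputs)
print(result.as_numpy("output_tensor").shape)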
Extended KServe REST API
The KServe REST API now has additional functionality that improves compatibility with the Triton Inference Server extension. It is now possible to send raw data in an HTTP request outside of the JSON content. Concatenated bytes are interpreted by the model server based on the header content. This makes it quick and easy to serialize data from numpy arrays/vectors and to send JPEG/PNG-encoded images.
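A sketch of the same flow over REST with the Triton HTTP client; binary_data=True places the tensor contents as raw bytes outside the JSON body (address and names are again placeholders):

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient("localhost:8000")
data = np.zeros((1, 3, 224, 224), dtype=np.float32)
inputs = [httpclient.InferInput("input_tensor", list(data.shape), "FP32")]
# With binary_data=True the raw bytes are appended after the JSON part of the
# request; the server locates them via the Inference-Header-Content-Length header.
inputs[0].set_data_from_numpy(data, binary_data=True)
result = client.infer("my_model", inputs)
print(result.as_numpy("output_tensor"))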
Added Support for Intel® Data Center GPU Flex and Intel® Arc GPU
OpenVINO Model Server now officially supports Intel® Data Center GPU Flex and Intel® Arc GPU cards. Learn more about using discrete GPU devices.
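For instance, serving a model on a discrete GPU may look similar to the command below (the model name and path are placeholders; the required device flags depend on the host setup):

docker run --rm -it --device=/dev/dri -p 9000:9000 openvino/model_server:2022.3-gpu \
    --model_name my_model --model_path /models/my_model --port 9000 --target_device GPU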
C++ Sample Inference Client Applications using KServe API
New client code samples demonstrate KServe API usage. These samples illustrate typical data formats and scenarios. Check out the samples.
Extended Python Client Samples using KServe API
Python client code samples have been extended to cover new API features for both the gRPC and REST interfaces.
Added integration with OpenVINO plugin for NVIDIA GPU
OpenVINO Model Server can now also be used with NVIDIA GPU cards. Follow these steps to build the Model Server from source, including the NVIDIA plugin from the openvino_contrib repo. Learn more about using the NVIDIA plugin.
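The build is expected to look roughly as follows (flags taken from the openvino_contrib build instructions; verify them against the current documentation):

git clone https://github.com/openvinotoolkit/model_server.git
cd model_server
make docker_build NVIDIA=1 OV_USE_BINARY=0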
Breaking changes
- The CLI parameter controlling the custom node resources cleaner interval has been renamed to custom_node_resources_cleaner_interval_seconds to reflect the interval time unit (see the example after this list). The default value should be optimal for most use cases.
- Support for the HDDL/NCS plugins is temporarily unavailable. It will return in the next release.
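For example, the renamed parameter is passed at startup like any other server flag (the config path and interval value below are arbitrary):

ovms --config_path /models/config.json --custom_node_resources_cleaner_interval_seconds 1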
Deprecated functionality
- Plugin config parameters from OpenVINO API 1.0 – models served by OpenVINO Model Server can be tuned using plugin config parameters. So far, the parameter names have been defined by OpenVINO API 1.0. It is recommended to start using the parameter names defined in OpenVINO API 2.0; in this release, the old parameters are automatically translated to their new substitutes (see the example below). Check the performance tuning guide and more info about the plugin parameters.
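As an illustration, an OpenVINO API 1.0 style plugin_config entry in the model configuration and its API 2.0 counterpart (the model name, path, and values are examples):

API 1.0 style:
{
    "model_config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "plugin_config": { "CPU_THROUGHPUT_STREAMS": "4" }
    }
}

API 2.0 style:
{
    "model_config": {
        "name": "my_model",
        "base_path": "/models/my_model",
        "plugin_config": { "NUM_STREAMS": "4" }
    }
}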
Bug fixes
- Improved performance for DAG pipelines executed on GPU accelerators
- The default values of performance tuning parameters were not calculated correctly inside Docker containers with constrained CPU capacity. The optimal number of streams for THROUGHPUT mode is now set based on the CPU resources bound to the container.
- Fixed unit tests that sporadically raised false positive errors.
Other changes
- Published a binary package of OpenVINO Model Server which can be used for deployments on baremetal hosts without Docker containers. See the instructions for baremetal deployment and the example after this list.
- Updated software dependencies and container base images
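A typical bare-metal start with the binary package might look like this (the archive name, model name, and paths may differ; check the release assets and the deployment instructions):

tar -xzvf ovms.tar.gz
./ovms/bin/ovms --model_name my_model --model_path /models/my_model --port 9000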
You can pull the public OpenVINO Model Server Docker images, based on Ubuntu, with the following commands:
docker pull openvino/model_server:2022.3
docker pull openvino/model_server:2022.3-gpu
or use the provided binary packages.