Add support for inference on ONNX and TensorFlow models #370
Labels:
- `area - model inference`: ML Model support
- `area - runtime`: The Rust Rune runtime
- `category - enhancement`: New feature or request
- `priority - on-demand`: This won't be touched until there is an external need for it (i.e. required by a customer)
I think almost everyone on the HOTG team has expressed a desire to use more ML frameworks at some point, in particular ONNX and TensorFlow. However, I was reluctant to use bindings that go through their official C++ implementations after seeing how much trouble we had integrating TensorFlow Lite.
When I was playing around with hotg-ai/wasi-nn-experiment I came across a pure Rust implementation of TensorFlow and ONNX inference called `tract`. This was able to cross-compile to `aarch64-linux-android` and `wasm32-unknown-unknown` without any extra work.

By using `tract` instead of the reference implementations we'll be giving up some performance, reliability, and features (e.g. missing model ops) in exchange for long-term maintainability and reduced build complexity. @f0rodo may want to comment on this trade-off, but from an engineering perspective I think it's worth it.
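To give a sense of what this buys us, loading and running an ONNX model with `tract` is only a handful of lines. The sketch below is adapted from tract's published examples and assumes a `tract-onnx` dependency; the model path and input shape are placeholders, and the exact run API differs slightly between tract versions.

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // Load the ONNX model, declare the input shape, and optimise it into a
    // runnable plan. The path and shape here are placeholders.
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        .with_input_fact(0, InferenceFact::dt_shape(f32::datum_type(), tvec!(1, 3, 224, 224)))?
        .into_optimized()?
        .into_runnable()?;

    // Run inference on a dummy all-zeros input tensor.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();
    let outputs = model.run(tvec!(input.into()))?;
    println!("output: {:?}", outputs[0]);

    Ok(())
}
```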
The things we'll need to support new model types:

- An `args` field on models inside the Runefile (done)
- A `format` argument which is either `"tensorflow-lite"`, `"tensorflow"`, or `"onnx"` to specify what type of model this is (default is `"tensorflow-lite"` if not provided) (example)
- Converting the `format` into a `mimetype` that gets embedded in the Rune and passed to the runtime when loading a model (conversion, injecting into the generated Rune); a rough sketch of this mapping is below
- `ModelFactory` implementations for handling TensorFlow and ONNX models (see the sketch after this list)
- Registering them with `BaseImage::with_defaults()` (maybe hide them behind a feature flag like we did with `"tensorflow-lite"` so users can cut down on dependencies, it's up to you)
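For the `format` to `mimetype` conversion, something along these lines in the codegen step should be enough. This is only a sketch: the mimetype strings and the error type are placeholders, not necessarily what the codebase will end up using.

```rust
/// Map the Runefile's `format` argument onto a mimetype string that gets
/// embedded in the Rune and handed to the runtime when the model is loaded.
/// The mimetype values here are placeholders.
fn format_to_mimetype(format: &str) -> Result<&'static str, String> {
    match format {
        // The default when no `format` is provided.
        "tensorflow-lite" => Ok("application/tflite-model"),
        "tensorflow" => Ok("application/tensorflow-model"),
        "onnx" => Ok("application/onnx-model"),
        other => Err(format!("Unknown model format: {}", other)),
    }
}
```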
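On the runtime side, an ONNX backend built on `tract` could look roughly like this. The `Model`/`ModelFactory` traits and the `"onnx"` cargo feature shown here are stand-ins I've made up for illustration, not the runtime's actual API; the point is just that tract can deserialize a model straight from the bytes baked into the Rune and sit behind the same interface as the TensorFlow Lite backend.

```rust
use anyhow::Error;

// Hypothetical stand-ins for the traits the runtime uses to load and run
// models; the real definitions live in the runtime crate and will differ.
pub trait Model: Send {
    /// Run inference, reading raw input buffers and writing raw outputs.
    fn infer(&mut self, inputs: &[&[u8]], outputs: &mut [&mut [u8]]) -> Result<(), Error>;
}

pub trait ModelFactory: Send + Sync {
    fn load_model(&self, model_bytes: &[u8]) -> Result<Box<dyn Model>, Error>;
}

/// An ONNX backend built on tract, compiled in only when the (hypothetical)
/// `onnx` cargo feature is enabled.
#[cfg(feature = "onnx")]
#[derive(Default)]
pub struct OnnxModelFactory;

#[cfg(feature = "onnx")]
pub struct OnnxModel {
    // The plan type alias comes from tract's prelude and may vary by version.
    plan: tract_onnx::prelude::TypedSimplePlan<tract_onnx::prelude::TypedModel>,
}

#[cfg(feature = "onnx")]
impl ModelFactory for OnnxModelFactory {
    fn load_model(&self, model_bytes: &[u8]) -> Result<Box<dyn Model>, Error> {
        use tract_onnx::prelude::*;

        // tract reads the model from the bytes embedded in the Rune and
        // optimises it into a runnable plan. Input shapes may need to be
        // declared with `with_input_fact()` first, as in the earlier example.
        let plan = tract_onnx::onnx()
            .model_for_read(&mut std::io::Cursor::new(model_bytes))?
            .into_optimized()?
            .into_runnable()?;

        Ok(Box::new(OnnxModel { plan }))
    }
}

#[cfg(feature = "onnx")]
impl Model for OnnxModel {
    fn infer(&mut self, _inputs: &[&[u8]], _outputs: &mut [&mut [u8]]) -> Result<(), Error> {
        // Convert the raw buffers into tract tensors, call `self.plan.run()`,
        // and copy the results back out. Elided in this sketch.
        todo!()
    }
}
```

`BaseImage::with_defaults()` would then register an `OnnxModelFactory` (and a TensorFlow equivalent) alongside the existing TensorFlow Lite factory, each behind its own cargo feature so users who only need TensorFlow Lite don't pull in the extra dependencies.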