Glow Roadmap

This page tracks the ongoing development of Glow. It documents the goals for upcoming development iterations, the status of some high-level tasks, and relevant information that can help people join the ongoing efforts.

Top-Level Tasks

Load additional quantized neural networks

Quantization is the process of converting a neural network that computes in 32-bit floating point into one that uses 8-bit integer arithmetic. Glow can quantize existing floating-point networks using Profile Guided Quantization and then run the quantized model for inference. Glow has also begun to support loading quantized Caffe2/ONNX models directly. The goal of this top-level task is to extend the loader support to additional quantized Caffe2 operators (https://github.com/pytorch/pytorch/tree/master/caffe2/quantization/server) and ONNX operators.
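
For context, Glow's quantized representation is an affine int8 encoding in which a real value is approximated as scale * (quantized - offset), with the scale and offset derived from the min/max ranges captured during profiling. The sketch below is a toy illustration of that arithmetic, not Glow's implementation; chooseParams, quantize, and dequantize are hypothetical helper names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>

struct QuantParams {
  float scale;
  int32_t offset;
};

// Derive scale/offset so that [min, max] maps onto the int8 range [-128, 127].
// In the profile-guided flow, min/max come from the captured profile.
QuantParams chooseParams(float min, float max) {
  float scale = (max - min) / 255.0f;
  int32_t offset = static_cast<int32_t>(std::round(-128.0f - min / scale));
  return {scale, offset};
}

int8_t quantize(float x, QuantParams p) {
  int32_t q = static_cast<int32_t>(std::round(x / p.scale)) + p.offset;
  return static_cast<int8_t>(std::min(127, std::max(-128, q))); // clip to int8
}

float dequantize(int8_t q, QuantParams p) {
  return p.scale * (q - p.offset);
}

int main() {
  QuantParams p = chooseParams(/*min=*/-1.0f, /*max=*/1.0f);
  float x = 0.5f;
  int8_t q = quantize(x, p);
  std::cout << x << " -> " << int(q) << " -> " << dequantize(q, p) << "\n";
}
```

The round trip loses a small amount of precision (0.5 comes back as roughly 0.502 here), which is exactly the error that good profile ranges are meant to minimize.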

Contact person: @beicy Issue: Support directly loading a quantized model

Asynchronous Model Execution

Glow is designed as a compiler and execution engine for neural network hardware accelerators. The current implementation of the execution engine is very basic: it exposes a simple, single-device, synchronous run method. The goal of this top-level task is to rewrite the execution engine around an asynchronous execution mechanism that can be extended to run code on multiple accelerator units concurrently. The new execution engine will need to manage the state of multiple cards, queue incoming requests, and track the complex state of buffers on the host and on the devices.
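
To make "asynchronous" concrete, the sketch below shows one common shape for such an interface: a per-device work queue served by a worker thread, where runAsync returns a std::future instead of blocking. The names (Device, runAsync) are hypothetical and this is not Glow's actual API; it is a minimal illustration of the queueing model described above.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

class Device {
public:
  Device() : worker_([this] { loop(); }) {}
  ~Device() {
    {
      std::lock_guard<std::mutex> g(m_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Enqueue a "network run" and return immediately with a future result.
  std::future<float> runAsync(std::function<float()> run) {
    std::packaged_task<float()> task(std::move(run));
    std::future<float> fut = task.get_future();
    {
      std::lock_guard<std::mutex> g(m_);
      q_.push(std::move(task));
    }
    cv_.notify_one();
    return fut;
  }

private:
  // Worker thread: drain the queue, executing one request at a time on
  // this device. A real engine would also manage buffer residency here.
  void loop() {
    for (;;) {
      std::packaged_task<float()> task;
      {
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [this] { return done_ || !q_.empty(); });
        if (done_ && q_.empty())
          return;
        task = std::move(q_.front());
        q_.pop();
      }
      task();
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::packaged_task<float()>> q_;
  bool done_ = false;
  std::thread worker_;
};

int main() {
  Device card0;
  auto f = card0.runAsync([] { return 42.0f; }); // stand-in for an inference
  std::cout << "result: " << f.get() << "\n";
}
```

With one such queue per card, a host-side scheduler can dispatch requests across devices and only block when a caller actually needs a result.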

Contact person: @qcolombet Issue: Glow Runtime

ONNXIFI integration

Glow integrates into PyTorch through the ONNXIFI interface, which lets PyTorch offload parts of its compute graph to Glow. This top-level task tracks the work to fully implement the ONNXIFI specification and to qualify the compiler against the ONNXIFI test suite.
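
ONNXIFI is a C API with a discover/init/run lifecycle. The sketch below outlines that lifecycle as a rough guide only: the exact prototypes and status-code handling live in onnx/onnxifi.h, and the argument lists here are abbreviated and should be checked against that header.

```cpp
#include <onnx/onnxifi.h>
#include <vector>

int main() {
  // 1. Discover the available backends (Glow registers itself as one).
  size_t numBackends = 0;
  onnxGetBackendIDs(nullptr, &numBackends); // query the count first
  std::vector<onnxBackendID> ids(numBackends);
  onnxGetBackendIDs(ids.data(), &numBackends);

  // 2. Initialize a backend instance.
  onnxBackend backend;
  onnxInitBackend(ids[0], /*auxPropertiesList=*/nullptr, &backend);

  // 3. Hand a serialized ONNX model (plus static weights) to the backend,
  //    which compiles it. A real caller supplies the protobuf bytes here.
  const void *model = nullptr; // placeholder for serialized model bytes
  size_t modelSize = 0;
  onnxGraph graph;
  onnxInitGraph(backend, /*auxPropertiesList=*/nullptr, modelSize, model,
                /*weightsCount=*/0, /*weightDescriptors=*/nullptr, &graph);

  // 4. Bind inputs/outputs with onnxSetGraphIO(), then kick off execution
  //    with onnxRunGraph(); memory fences signal completion asynchronously.

  // 5. Tear everything down.
  onnxReleaseGraph(graph);
  onnxReleaseBackend(backend);
  onnxReleaseBackendID(ids[0]);
  return 0;
}
```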

Contact person: @rdzhabarov Issue: TBD