
Glow Roadmap

Nadav Rotem edited this page Nov 14, 2018 · 12 revisions

This page tracks the ongoing development of Glow. It documents the goals for upcoming development iterations, the status of some high-level tasks, and relevant information that can help people join the ongoing efforts.

Top-Level Tasks

Load additional quantized neural networks

Quantization is the process of converting a neural network that computes with 32-bit floating-point operations into one that computes with 8-bit integer arithmetic. Glow can load quantized models in the ONNX and Caffe2 formats. It can also quantize existing floating-point networks using profile-guided quantization. The goal of this top-level task is to extend the loader to support additional quantized ONNX operators.
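As a rough illustration of the idea, the sketch below maps floating-point values to signed 8-bit integers using a scale and an offset derived from an observed value range, which is the general shape of a profile-guided approach. The function names, the affine scheme, and the sample values are assumptions made for this example; they are not taken from Glow's implementation, which may differ in detail.

```python
# Minimal sketch of affine 8-bit quantization. All names here are
# hypothetical and chosen for illustration; this is not Glow's code.

def choose_params(lo, hi, qmin=-128, qmax=127):
    """Pick a scale and offset that map the range [lo, hi] onto [qmin, qmax]."""
    # Widen the range to include zero so that 0.0 is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    offset = int(round(qmin - lo / scale))
    return scale, offset


def quantize(x, scale, offset, qmin=-128, qmax=127):
    """Convert a float to an 8-bit integer, clamping to the valid range."""
    q = int(round(x / scale)) + offset
    return max(qmin, min(qmax, q))


def dequantize(q, scale, offset):
    """Recover an approximate float from its 8-bit representation."""
    return scale * (q - offset)


# "Profiling" step: observe the range of values seen at run time.
# These sample values are made up for the example.
profile = [-1.0, 0.5, 2.0]
scale, offset = choose_params(min(profile), max(profile))

quantized = [quantize(x, scale, offset) for x in profile]
recovered = [dequantize(q, scale, offset) for q in quantized]
```

For values inside the profiled range, the round trip through `quantize` and `dequantize` loses at most about half a quantization step (`scale / 2`), which is the precision traded for the smaller integer representation.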

Contact person: @beicy Issue: TBD

Asynchronous Model Execution

Glow is designed as a compiler and execution engine for neural network hardware accelerators. The current execution engine is very basic: it exposes a single synchronous run method that targets one device. The goal of this top-level task is to rewrite the execution engine around an asynchronous execution mechanism that can be extended to run code on multiple acceleration units concurrently. The new engine will need to track the state of multiple cards, queue requests, and manage the complex state of buffers on both the host and the device.
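One possible shape for such a mechanism is sketched below: each device gets its own request queue and worker thread, and the engine hands back a future for every request instead of blocking the caller. The class names, the round-robin dispatch policy, and the `fake_inference` function are all hypothetical and stand in for real device work; nothing here reflects Glow's actual design, and buffer management is left out entirely.

```python
# Minimal sketch of asynchronous multi-device execution. All names are
# hypothetical; this is an illustration, not Glow's execution engine.
import queue
import threading
from concurrent.futures import Future


class DeviceWorker:
    """Stand-in for one accelerator card with its own request queue."""

    def __init__(self, device_id):
        self.device_id = device_id
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def submit(self, fn, *args):
        """Queue a request and return a future for its result."""
        fut = Future()
        self._requests.put((fut, fn, args))
        return fut

    def _loop(self):
        # Process queued requests in order until a None sentinel arrives.
        while True:
            item = self._requests.get()
            if item is None:
                break
            fut, fn, args = item
            try:
                fut.set_result(fn(self.device_id, *args))
            except Exception as exc:
                fut.set_exception(exc)

    def shutdown(self):
        self._requests.put(None)
        self._thread.join()


class AsyncEngine:
    """Dispatches requests to devices in round-robin order (an assumed policy)."""

    def __init__(self, num_devices):
        self._workers = [DeviceWorker(i) for i in range(num_devices)]
        self._next = 0
        self._lock = threading.Lock()

    def run_async(self, fn, *args):
        with self._lock:
            worker = self._workers[self._next]
            self._next = (self._next + 1) % len(self._workers)
        return worker.submit(fn, *args)

    def shutdown(self):
        for worker in self._workers:
            worker.shutdown()


def fake_inference(device_id, x):
    # Placeholder for running a compiled network on a device.
    return device_id, x * 2


engine = AsyncEngine(num_devices=2)
futures = [engine.run_async(fake_inference, i) for i in range(4)]
results = [f.result() for f in futures]
engine.shutdown()
```

The caller submits all four requests before waiting on any of them, so the two workers can make progress concurrently. A per-device queue also keeps requests for the same card ordered, which is one simple way to reason about device state.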

Contact person: @qcolombet Issue: TBD

ONNX-IFI integration

Glow integrates with PyTorch through the ONNX-IFI interface, which offloads the compute graph from PyTorch to Glow. This top-level task tracks the work to fully implement the ONNX-IFI specification and to qualify the compiler using the ONNX-IFI test suite.

Contact person: @rdzhabarov Issue: TBD