Skip to content

HybridBackend v0.6.0

Compare
Choose a tag to compare
@2sin18 2sin18 released this 16 Apr 04:23
· 54 commits to main since this release

Objectives:

  1. Communication-efficient training and evaluation at scale
  2. Easy to use with existing AI workflows

Features:

  1. Data-Parallel Training and Evaluation
    - Bucketized Gradients Aggregation using AllReduce
    - Global Metric Operations
    - Out-Of-Range Coordination

  2. Hybrid-Parallel Embedding Learning
    - Bucketized Embedding Exchanging using AllToAllv
    - Fusion and Quantization of AllToAllv
    - Fusion of Partitioning and Stitching

  3. Usability
    - Support of MonitoredSession and Estimator
    - Declarative API for Model Definition

  4. Compatibility
    - Support of NVIDIA TensorFlow and DeepRec

  5. Interoperability
    - Inference Pipeline Needs No Change
    - Support of SavedModel
    - Support of Variable, XDL HashTable and PAI Embedding Variable

Bug Fixes:

[#46] Fixes rebatching in ParquetDataset.