I am Yi Zhu, currently working as a Senior Research Engineer in the System Research Group at MSRA. My primary research focus is on distributed training and inference for deep learning.
- Core Contributor to nnScaler
- Played a key role in the early development of the autodist module, enabling efficient and feasible execution plans by considering memory consumption of deep learning models.
- Contributed to YOCO and Diff Transformer, particularly in the distributed training phase for models with long sequences.
- Participated in rStar-Math, providing suggestions for distributed training and inference.
- Early Contributor to the Open-Source Distributed Training Framework OneFlow
If you are interested in cutting-edge research problems in Machine Learning Systems and would like to join the MSRA System Research Group for internships, full-time positions, or academic collaboration, feel free to contact me at [email protected]