Minor fixes in README.md (facebookincubator#903)
Summary:
Pull Request resolved: facebookincubator#903

ATT

Reviewed By: kadeng

Differential Revision: D48425107

fbshipit-source-id: 5be66fb23b925c98e734f46a282599459edaafa0
aakhundov authored and facebook-github-bot committed Aug 18, 2023
1 parent 29d0f88 commit 34340fb
Showing 1 changed file with 11 additions and 2 deletions.
README.md (13 changes: 11 additions & 2 deletions)
@@ -9,6 +9,7 @@
AITemplate (AIT) is a Python framework that transforms deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for lightning-fast inference serving. AITemplate highlights include:

- High performance: close to roofline fp16 TensorCore (NVIDIA GPU) / MatrixCore (AMD GPU) performance on major models, including ResNet, MaskRCNN, BERT, VisionTransformer, Stable Diffusion, etc.

- Unified, open, and flexible: seamless fp16 deep neural network models on NVIDIA or AMD GPUs. Fully open source, with Lego-style, easily extendable high-performance primitives for new model support. Supports a significantly more comprehensive range of fusions than existing solutions on both GPU platforms.
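
To make the description above concrete, here is a minimal, illustrative sketch of how a small model is typically defined and compiled with AITemplate's Python frontend, following the pattern used in the examples/ directory. The module, tensor, and artifact names (`SimpleNet`, `"x"`, `"y"`, `"./tmp"`, `"simple_net"`) are placeholders chosen for this sketch, not identifiers from the README.

```python
# Minimal sketch of the typical AITemplate flow, modeled on the examples/ directory.
# All names below (SimpleNet, "x", "y", "./tmp", "simple_net") are illustrative placeholders.
from aitemplate.compiler import compile_model
from aitemplate.frontend import nn, Tensor
from aitemplate.testing import detect_target


class SimpleNet(nn.Module):
    """A toy two-layer MLP written with AITemplate's frontend modules."""

    def __init__(self, hidden: int):
        super().__init__()
        self.dense1 = nn.Linear(hidden, hidden)
        self.dense2 = nn.Linear(hidden, hidden)

    def forward(self, x):
        return self.dense2(self.dense1(x))


batch, hidden = 16, 512
model = SimpleNet(hidden)
model.name_parameter_tensor()  # give parameters stable names so real weights can be bound later

# Build the symbolic graph: inputs and outputs are identified by name and by flags.
x = Tensor(shape=[batch, hidden], name="x", dtype="float16", is_input=True)
y = model(x)
y._attrs["name"] = "y"
y._attrs["is_output"] = True

# Generate and compile the CUDA/HIP C++ code into a loadable module for the detected GPU backend.
module = compile_model(y, detect_target(), "./tmp", "simple_net")
```

At runtime, weights and input/output buffers are bound to the compiled module by name (for example via `module.run_with_tensors`); the end-to-end scripts under `examples/` show this in full.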


@@ -54,6 +55,7 @@ More info can be found from https://github.com/facebookincubator/AITemplate/tree
## Installation

**Hardware requirements:**

- **NVIDIA**: AIT is only tested on SM80+ GPUs (Ampere, etc.). Not all kernels work with older SM75/SM70 (T4/V100) GPUs.
- **AMD**: AIT is only tested on CDNA2 (MI-210/250) GPUs. There may be compiler issues for older CDNA1 (MI-100) GPUs.

@@ -67,6 +69,7 @@ git clone --recursive https://github.com/facebookincubator/AITemplate
### Docker Image

We highly recommend using AITemplate with Docker to avoid accidentally using the wrong version of NVCC or HIPCC.

- CUDA: `./docker/build.sh cuda`
- ROCM: `DOCKER_BUILDKIT=1 ./docker/build.sh rocm`

@@ -75,6 +78,7 @@ This will build a docker image with tag `ait:latest`.
### From Source

The following command will create a Python wheel for AITemplate. Please ensure you have the correct CUDA/ROCm compiler installed.

- CUDA: CUDA 11.6
- ROCm: We tested on ROCm 5.2.3 with a customized HIPCC build, using the commands in docker/Dockerfile.rocm#L87-L96
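
After installing the wheel, one quick way to sanity-check the setup is to ask AITemplate which GPU backend it detects. The sketch below is illustrative, not an official install step; it assumes the `detect_target` helper from `aitemplate.testing` used throughout the bundled examples.

```python
# Hedged sketch: check that the installed AITemplate package detects a GPU backend.
# Assumes the detect_target() helper from aitemplate.testing, as used in the examples.
from aitemplate.testing import detect_target

target = detect_target()   # may raise if no supported CUDA/ROCm toolchain is found
print(target.name())       # expected to report "cuda" or "rocm"
```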

@@ -109,37 +113,42 @@ AITemplate provides the following model templates & reference performance data o
- [04_Vision Transformer](examples/04_vit/) with PyTorch Image Models (TIMM)
- [05_Stable Diffusion](examples/05_stable_diffusion/) with Hugging Face Diffusers


## Release

All current development updates can be seen in the AITemplate repository. Releases are not on a set schedule and will only be tagged for significant feature releases.

Mid-term plan:

- Better dynamic shape support: Focus on dynamic sequence lengths in Transformers. Add symbolic shape support.
- More automatic graph passes: Relieve the need to manually rewrite models to obtain the best performance.
- Quantization: fp8/int8/int4.
- Sparsity pruning for Gemm.
- PT2 integration: Aten2AIT is under active development.

Long-term plan:

- Automatic model conversion from ONNX, Open-XLA, and other formats.
- Composable Kernel CPU extension on AVX2/AVX-512 for AMD EPYC CPUs.


## Contributing

Check our [contributing guide](CONTRIBUTING.md) to learn how to contribute to the project.


## The Team

AITemplate is currently maintained by Meta engineers: [Ying Zhang](https://github.com/ipiszy), [Yang Chen](https://github.com/chenyang78), [Terry Chen](https://github.com/terrychenism), [Mu-Chu Lee](https://github.com/muchulee8), [Max Podkorytov](https://github.com/tenpercent), [Adnan Akhundov](https://github.com/aakhundov).

AITemplate is co-created by Meta engineers: [Bing Xu](https://github.com/antinucleon), [Ying Zhang](https://github.com/ipiszy), [Hao Lu](https://github.com/hlu1), [Yang Chen](https://github.com/chenyang78), and [Terry Chen](https://github.com/terrychenism), with major contributions coming from more talented engineers. A non-exhaustive list to mention is Mike Iovine, Mu-Chu Lee, Scott Wolchok, Oleg Khabinov, Shirong Wu, Huaming Li, Hui Guo, Zhijing Li, Max Podkorytov. We also want to thank Andrew Tulloch, Yinghai Lu, Lu Fang for the valuable discussions.
AITemplate is co-created by Meta engineers: [Bing Xu](https://github.com/antinucleon), [Ying Zhang](https://github.com/ipiszy), [Hao Lu](https://github.com/hlu1), [Yang Chen](https://github.com/chenyang78), and [Terry Chen](https://github.com/terrychenism), with major contributions coming from other talented engineers. A non-exhaustive list to mention is Mike Iovine, Mu-Chu Lee, Scott Wolchok, Oleg Khabinov, Shirong Wu, Huamin Li, Hui Guo, Zhijing Li, Max Podkorytov. We also want to thank Andrew Tulloch, Yinghai Lu, Lu Fang for the valuable discussions.

FX2AIT and Aten2AIT are co-created and maintained by Meta engineers: [Wei Wei](https://github.com/frank-wei), [Shirong Wu](https://github.com/wushirong) and [Zhijing Li](https://github.com/tissue3).


## Acknowledgements

AITemplate team works deeply with NVIDIA [CUTLASS](https://github.com/NVIDIA/cutlass) Team (led by Andrew Kerr, Haicheng Wu) and AMD [Composable Kernel](https://github.com/ROCmSoftwarePlatform/composable_kernel) Team (led by Chao Liu, Jing Zhang). We co-designed many advanced GPU optimizations specialized for each platform, and nothing is possible without our close collaboration.
AITemplate team works closely with NVIDIA [CUTLASS](https://github.com/NVIDIA/cutlass) Team (led by Andrew Kerr, Haicheng Wu) and AMD [Composable Kernel](https://github.com/ROCmSoftwarePlatform/composable_kernel) Team (led by Chao Liu, Jing Zhang). We co-designed many advanced GPU optimizations specialized for each platform, and nothing is possible without our close collaboration.


## License