[FP8] Relax the FP8 cuda arch limitation for NVIDIA Ada GPUs (SM89) #548

leeeizhang · 2024-08-21T08:35:09Z

NVIDIA Ada Lovelace GPUs (e.g., RTX 4090, L20, L40), which have compute capability SM89, also support FP8 MMA. It is therefore recommended to relax the CUDA architecture limitation so that FP8 training is enabled on a broader range of devices.

The CUDA 12.0 announcement also states that it supports the Ada Lovelace architecture:

CUDA 12.0 exposes programmable functionality for many features of the NVIDIA Hopper and NVIDIA Ada Lovelace architectures: ...32x Ultra xMMA (including FP8 and FP16)
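A minimal sketch of what the relaxed check could look like, assuming the existing gate compares the device's compute capability against a minimum version. The helper name `supports_fp8` and the constant `MIN_FP8_CAPABILITY` are hypothetical, for illustration only, and are not taken from the repository or from #549:

```python
from typing import Tuple

# Assumed minimum compute capability for FP8 MMA: Ada Lovelace (SM89).
# Hopper (SM90) and anything newer also passes this check.
MIN_FP8_CAPABILITY: Tuple[int, int] = (8, 9)


def supports_fp8(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is at least SM89."""
    # Tuples compare lexicographically, so (9, 0) >= (8, 9) and (8, 6) < (8, 9).
    return tuple(capability) >= MIN_FP8_CAPABILITY
```

With PyTorch, the `(major, minor)` tuple can be obtained from `torch.cuda.get_device_capability()`, so the gate would be `supports_fp8(torch.cuda.get_device_capability())`. Under this sketch an RTX 4090 (8, 9) and an H100 (9, 0) pass, while an A100 (8, 0) does not.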

leeeizhang mentioned this issue Aug 21, 2024

[MRG] relax the FP8 CUDA arch limitation to SM89 #549

Merged

awgu closed this as completed in #549 Aug 21, 2024

awgu closed this as completed in 90c889e Aug 21, 2024
