Releases: ROCm/rccl
RCCL 2.11.4 for ROCm 5.1.3
RCCL code for ROCm 5.1.3 did not change. The library was rebuilt for the updated ROCm 5.1.3 stack.
RCCL 2.11.4 for ROCm 5.1.1
RCCL code for ROCm 5.1.1 did not change. The library was rebuilt for the updated ROCm 5.1.1 stack.
RCCL 2.11.4 for ROCm 5.1.0
Added
- Compatibility with NCCL 2.11.4
Known issues
- Managed memory is not currently supported for clique-based kernels
RCCL-2.10.3 for ROCm 5.0.2
RCCL code for ROCm 5.0.2 is unchanged from RCCL for ROCm 5.0.1. The library was rebuilt for the updated ROCm 5.0.2 stack.
RCCL-2.10.3 for ROCm 5.0.1
RCCL code for ROCm 5.0.1 is unchanged from RCCL for ROCm 5.0.0. The library was rebuilt for the updated ROCm 5.0.1 stack.
RCCL-2.10.3 for ROCm 5.0.0
Added
- Compatibility with NCCL 2.10.3
Known issues
- Managed memory is not currently supported for clique-based kernels
RCCL-2.9.9 for ROCm 4.5.2
RCCL code for ROCm 4.5.2 is unchanged from RCCL for ROCm 4.5.0. The library was rebuilt for the updated ROCm 4.5.2 stack.
RCCL-2.9.9 for ROCm 4.5.0
Changed
- Packaging is now split into a runtime package, rccl, and a development package, rccl-devel. The development package depends on the runtime package. To aid the transition, the runtime package suggests the development package on all supported OSes except CentOS 7. This use of the suggests feature is introduced as deprecated and will be removed in a future ROCm release.
Added
- Compatibility with NCCL 2.9.9
Known issues
- Managed memory is not currently supported for clique-based kernels
RCCL-2.8.4 for ROCm 4.3.1
Added
- NPS=4 model
- Sorting of IB devices by device name
RCCL-2.8.4 for ROCm 4.3.0
Added
- Ability to select the number of channels used for clique-based AllReduce (RCCL_CLIQUE_ALLREDUCE_NCHANNELS). Adjust this value to tune performance when computation kernels are running in parallel.
Optimizations
- Additional tuning of clique-based AllReduce kernel performance (still requires opting in with RCCL_ENABLE_CLIQUE=1)
- Modified default values for the number of channels and byte limits for clique-based AllReduce, based on device architecture
Changed
- Replaced RCCL_FORCE_ENABLE_CLIQUE with RCCL_CLIQUE_IGNORE_TOPO
- Clique-based kernels can now be enabled on topologies where all active GPUs are XGMI-connected
- Topologies not normally supported by clique-based kernels require RCCL_CLIQUE_IGNORE_TOPO=1
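The clique-related environment variables named in this release can be combined when launching an application. The following is a minimal sketch, not part of RCCL: the helper name `clique_env`, its parameters, and the channel count of 4 are hypothetical and chosen only for illustration. It assumes the variables are read from the process environment at startup and that RCCL_ENABLE_CLIQUE=1 is still needed alongside the other two.

```python
import os


def clique_env(base=None, nchannels=None, ignore_topo=False):
    """Return a copy of `base` (default: os.environ) with clique variables set.

    Hypothetical helper for illustration; only the variable names come from
    the release notes.
    """
    env = dict(os.environ if base is None else base)

    # Clique-based kernels are opt-in.
    env["RCCL_ENABLE_CLIQUE"] = "1"

    # Optional: number of channels for clique-based AllReduce.
    if nchannels is not None:
        if nchannels < 1:
            raise ValueError("nchannels must be a positive integer")
        env["RCCL_CLIQUE_ALLREDUCE_NCHANNELS"] = str(nchannels)

    # Optional: allow topologies not normally supported by clique kernels.
    if ignore_topo:
        env["RCCL_CLIQUE_IGNORE_TOPO"] = "1"

    return env


if __name__ == "__main__":
    # Print only the clique-related settings that would be passed to a child process.
    env = clique_env(nchannels=4, ignore_topo=True)
    for key in sorted(k for k in env if k.startswith("RCCL_")):
        print(f"{key}={env[key]}")
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the application. Suitable channel counts depend on the device architecture and workload, so the value used here is a placeholder to be tuned by measurement.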
Fixed
- Invoking the install script with only the '-r' flag no longer deletes existing builds.
Known issues
- Managed memory is not currently supported for clique-based kernels