Releases: ROCm/rccl
rccl 2.13.4 for ROCm 5.4.1
RCCL code for ROCm 5.4.1 did not change. The library was rebuilt for the updated ROCm 5.4.1 stack.
RCCL 2.13.4 for ROCm 5.4.0
Changed
- Compatibility with NCCL 2.13.4
- Improvements to RCCL when running with hipGraphs
- RCCL_ENABLE_HIPGRAPH environment variable is no longer necessary to enable hipGraph support
- Minor latency improvements
Fixed
- Resolved potential memory access error due to asynchronous memset
rccl 2.12.10 for ROCm 5.3.3
RCCL code for ROCm 5.3.3 did not change. The library was rebuilt for the updated ROCm 5.3.3 stack.
rccl 2.12.10 for ROCm 5.3.2
RCCL code for ROCm 5.3.2 did not change. The library was rebuilt for the updated ROCm 5.3.2 stack.
rccl 2.12.10 for ROCm 5.3.1
RCCL code for ROCm 5.3.1 did not change. The library was rebuilt for the updated ROCm 5.3.1 stack.
RCCL 2.12.10 for ROCm 5.3.0
Changed
- Improvements to LL128 algorithms
Added
- Adding initial hipGraph support via opt-in environment variable RCCL_ENABLE_HIPGRAPH
- Integrating with NPKit (https://github.com/microsoft/NPKit) profiling code
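As a rough illustration of the hipGraph opt-in, the sketch below builds a launch environment that sets RCCL_ENABLE_HIPGRAPH only for the ROCm releases where these notes describe it as opt-in (5.3.0 up to, but not including, 5.4.0). The function name `hipgraph_env`, the version-tuple convention, and the value "1" are assumptions made for this example; the notes name the variable but do not state a value.

```python
import os


def hipgraph_env(rocm_version, base_env=None):
    """Return a copy of the environment with the hipGraph opt-in applied if needed.

    rocm_version is a (major, minor, patch) tuple, e.g. (5, 3, 0).
    hipgraph_env is a hypothetical helper, not part of RCCL.
    """
    env = dict(os.environ) if base_env is None else dict(base_env)
    # Per the release notes: opt-in was introduced in ROCm 5.3.0 and is
    # no longer necessary as of ROCm 5.4.0.
    if (5, 3, 0) <= tuple(rocm_version) < (5, 4, 0):
        # Assumption: "1" enables the feature; the notes give no value.
        env["RCCL_ENABLE_HIPGRAPH"] = "1"
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching an RCCL application.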
RCCL 2.12.10 for ROCm 5.2.3
Added
- Compatibility with NCCL 2.12.10
- Packages for test and benchmark executables on all supported OSes using CPack.
- Adding custom signal handler - opt-in with RCCL_ENABLE_SIGNALHANDLER=1
  - Additional details provided if Binary File Descriptor library (BFD) is pre-installed
- Adding support for reusing ports in NET/IB channels
  - Opt-in with NCCL_IB_SOCK_CLIENT_PORT_REUSE=1 and NCCL_IB_SOCK_SERVER_PORT_REUSE=1
  - When the "Call to bind failed : Address already in use" error occurs in large-scale AlltoAll (e.g., >=64 MI200 nodes), users are advised to opt in to one or both of the options to resolve the massive port usage issue
  - Avoid using NCCL_IB_SOCK_SERVER_PORT_REUSE when NCCL_NCHANNELS_PER_NET_PEER is tuned >1
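The port-reuse guidance above can be sketched as a small helper that prepares a launch environment. This is a minimal sketch, not RCCL tooling: the function name `port_reuse_env` and its parameters are hypothetical, and treating an unset NCCL_NCHANNELS_PER_NET_PEER as 1 is an assumption made for the example.

```python
import os


def port_reuse_env(base_env=None, reuse_client=True, reuse_server=True):
    """Return a copy of the environment with the NET/IB port-reuse opt-ins set.

    port_reuse_env is a hypothetical helper, not part of RCCL.
    """
    env = dict(os.environ) if base_env is None else dict(base_env)
    # Assumption: an unset NCCL_NCHANNELS_PER_NET_PEER is treated as 1.
    channels = int(env.get("NCCL_NCHANNELS_PER_NET_PEER", "1"))
    if reuse_client:
        env["NCCL_IB_SOCK_CLIENT_PORT_REUSE"] = "1"
    # The notes advise against server port reuse when
    # NCCL_NCHANNELS_PER_NET_PEER is tuned above 1.
    if reuse_server and channels <= 1:
        env["NCCL_IB_SOCK_SERVER_PORT_REUSE"] = "1"
    return env
```

For example, `subprocess.run(cmd, env=port_reuse_env())` would launch `cmd` (any RCCL application command line) with the opt-ins applied, skipping the server-side option when multiple channels per peer are configured.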
Removed
- Removed experimental clique-based kernels
rccl 2.11.4 for ROCm 5.2.1
RCCL code for ROCm 5.2.1 did not change. The library was rebuilt for the updated ROCm 5.2.1 stack.
RCCL 2.11.4 for ROCm 5.2.0
Changed
- Unit testing framework rework
- Minor bug fixes
Known issues
- Managed memory is not currently supported for clique-based kernels
rccl 2.11.4 for ROCm 5.1.3
RCCL code for ROCm 5.1.3 did not change. The library was rebuilt for the updated ROCm 5.1.3 stack.