Releases: ROCm/rocFFT
rocFFT-1.0.10 for ROCm 4.1.0
Added
- Explicitly specify MAX_THREADS_PER_BLOCK through __launch_bounds__ for all kernels.
- Switch to new syntax for specifying AMD GPU architecture names and features.
Optimizations
- Optimized C2C/R2C 3D 64, 81, 100, 128, 200, 256 cube sizes.
- Improved performance of the standalone out-of-place transpose kernel.
- Optimized 1D length 40000 C2C case.
- Enabled radix-7 for size 336.
- New radix-11 and radix-13 kernels; used in length 11 and 13 (and some of their multiples) transforms.
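The radix entries above concern how a transform length breaks down into small prime factors. The following is a minimal sketch of that decomposition, not rocFFT's actual planning logic, which these notes do not describe. The function name `factor_into_radices` and the default radix set are assumptions chosen for illustration.

```python
def factor_into_radices(length, radices=(13, 11, 7, 5, 3, 2)):
    """Split `length` into prime factors drawn from `radices`.

    Returns a dict {radix: exponent}, or None when a factor outside
    `radices` remains. The radix set is an illustrative assumption.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    counts = {}
    remaining = length
    for r in radices:
        # Divide out each radix as many times as it fits.
        while remaining % r == 0:
            counts[r] = counts.get(r, 0) + 1
            remaining //= r
    return counts if remaining == 1 else None


if __name__ == "__main__":
    for n in (336, 40000, 143, 17):
        print(n, factor_into_radices(n))
```

Under these assumptions, 336 factors as 2^4 * 3 * 7, which is why a radix-7 kernel is relevant to it, and 40000 factors as 2^6 * 5^4. A length such as 17 returns None because its only prime factor lies outside the assumed radix set.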
Changed
- rocFFT now automatically allocates a work buffer if the plan requires one but none is provided.
- An explicit rocfft_status_invalid_work_buffer error is now returned when a work buffer of insufficient size is provided.
- Updated online documentation.
- Updated the Debian package name to separate the version with '_'.
- Adjusted accuracy test tolerances and how they are compared.
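The two work-buffer changes above amount to a small decision rule. The sketch below models that rule in plain Python as a reading aid; it does not call rocFFT. The function name `resolve_work_buffer`, the returned labels other than `rocfft_status_invalid_work_buffer`, and the handling of plans that need no work buffer are assumptions.

```python
def resolve_work_buffer(required_bytes, provided_bytes=None):
    """Model the work-buffer behavior described in the notes above.

    required_bytes: size the plan needs (0 means no work buffer needed).
    provided_bytes: size of the caller's buffer, or None if none was given.
    Returns a label for the outcome; labels are illustrative only.
    """
    if required_bytes < 0:
        raise ValueError("required_bytes must be non-negative")
    if required_bytes == 0:
        # Assumption: a plan that needs no work buffer ignores any provided one.
        return "not_needed"
    if provided_bytes is None:
        # Per the notes: the library allocates the buffer itself.
        return "auto_allocate"
    if provided_bytes < required_bytes:
        # Per the notes: an explicit error is returned for an undersized buffer.
        return "rocfft_status_invalid_work_buffer"
    return "use_provided"


if __name__ == "__main__":
    print(resolve_work_buffer(4096))         # no buffer supplied
    print(resolve_work_buffer(4096, 1024))   # undersized buffer
    print(resolve_work_buffer(4096, 8192))   # large enough buffer
```

The practical consequence is that callers who supply their own buffer should size it from the plan's reported requirement, while callers who supply nothing no longer need to handle the allocation themselves.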
Fixed
- Fixed 4x4x8192 accuracy failure.
Known Issues
- None
rocFFT-1.0.8 for ROCm 4.0.0
New Features
- No new features
Known Issues
- None
rocFFT-1.0.8 for ROCm 3.10.0
New Features
- Added deprecation warning for hipfft.h
- Optimized case 1D 10000 C2C
- Fixed SBCC/SBRC non-unit stride batch issue
- Updated README and added BUILD_CLIENTS_ALL
- Improved test infrastructure
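The non-unit stride batch fix above concerns data whose elements are not contiguous in memory. As a rough illustration of what such a layout means, the sketch below computes the flat-buffer offset of every element in a batch of 1D transforms. The parameter names `stride` and `dist` and the function names are illustrative assumptions, not rocFFT identifiers.

```python
def element_offsets(length, stride, dist, batch):
    """Flat-buffer offsets for each element of each transform in a batch.

    stride: gap between consecutive elements of one transform.
    dist:   gap between the first elements of consecutive transforms.
    """
    return [[b * dist + i * stride for i in range(length)]
            for b in range(batch)]


def gather(buffer, length, stride, dist, batch):
    """Collect each transform's elements from a strided flat buffer."""
    return [[buffer[off] for off in row]
            for row in element_offsets(length, stride, dist, batch)]


if __name__ == "__main__":
    data = list(range(16))
    # Two transforms of length 4, every second element, 8 elements apart.
    print(element_offsets(4, 2, 8, 2))
    print(gather(data, 4, 2, 8, 2))
```

With unit stride the inner offsets are consecutive; any other stride leaves gaps between elements, which is the case the fix addresses for batched transforms.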
Known Issues
- None
rocFFT-1.0.7 for ROCm 3.9.0
New Features
- Optimized 3D C2C cases
- Optimized large pow-of-2 cases
- Fused R2C/C2R even length post/pre-process with transpose kernel
- Enabled 2D single kernel for pow3 and mixed pow2/pow3
- Added radix-7 optimization
- Added 1D batch-paired R2C transform
- Added -mno-xnack -mno-sram-ecc build flags
- Fixed repo singleton destroy issue
- Fixed 1D non-unit stride issue
Known Issues
- None
rocFFT-1.0.6 for ROCm 3.8.0
New Features
- Optimized 1D C2C sizes: 6561, 10000
- Optimized 2D pow-of-2 small cases
- Optimized 2D rectangular cases: 256x264, 256x272, … 256x320, 256x4096, 256x8192
- Fixed buffer assignment bug in large 1D cases
- Improved test and sample code
Known Issues
- None
rocFFT-1.0.5 for ROCm 3.7.0
New Features
- Optimized 2D C2C middle sizes with fused 2 kernels for pow-of-2
- Changed package dependency to hip-rocclr
- Fixed build issue with C++ 17
- Improved test infrastructure
Known Issues
- None
rocFFT-1.0.4 for ROCm 3.6.0
New Features
- Fixed non-unit stride issue of 1D middle size
- Updated client package installation path
- Improved internal device memory usage check
- Improved log
- Improved test infrastructure
Known Issues
- None
rocFFT-1.0.3 for ROCm 3.5.0
New Features
- Switched to hip-clang as default compiler and deprecated hcc build
- Improved hip-nvcc build
- Support static lib build
- Improved tests
Known Issues
- None
rocFFT-1.0.2 release for ROCm 3.3
- Supported planar(split) format
- Updated build instructions in README
- Removed ROCm libs from dependency installation and updated RUNPATH
- Improved accuracy, rider, sample, CI tests, and added dyna-rider test
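The planar (split) format mentioned above stores real and imaginary parts in two separate arrays, in contrast to the interleaved layout where they alternate in one array. The sketch below converts between the two layouts using plain Python lists. The function names are illustrative assumptions and do not correspond to rocFFT calls.

```python
def interleaved_to_planar(buf):
    """Split [re0, im0, re1, im1, ...] into ([re0, re1, ...], [im0, im1, ...])."""
    if len(buf) % 2 != 0:
        raise ValueError("interleaved buffer must have an even length")
    return buf[0::2], buf[1::2]


def planar_to_interleaved(re, im):
    """Merge separate real and imaginary arrays into one interleaved array."""
    if len(re) != len(im):
        raise ValueError("real and imaginary arrays must have equal length")
    out = [0.0] * (2 * len(re))
    out[0::2] = re  # even slots hold real parts
    out[1::2] = im  # odd slots hold imaginary parts
    return out


if __name__ == "__main__":
    interleaved = [1.0, 2.0, 3.0, 4.0]  # represents 1+2j and 3+4j
    re, im = interleaved_to_planar(interleaved)
    print(re, im)
    print(planar_to_interleaved(re, im))
```

Both layouts carry the same values; the difference lies only in how the real and imaginary parts are arranged in memory.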