Releases: ROCm/rocFFT

rocFFT 1.0.20 for ROCm 5.4.1

15 Dec 18:40
9961827

Fixed

  • Fixed incorrect results on strided large 1D FFTs where batch size does not equal the stride.

rocFFT 1.0.19 for ROCm 5.4.0

30 Nov 17:38
6005bfa

Optimizations

  • Optimized some strided large 1D plans.

Added

  • Added the rocfft_plan_description_set_scale_factor API to efficiently multiply each output element of an FFT by a given scaling factor.
  • Created a rocfft_kernel_cache.db file next to the installed library. SBCC kernels are moved to this file when built with the library, and are runtime-compiled for new GPU architectures.
  • Added gfx1100 and gfx1102 to default AMDGPU_TARGETS.
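
To illustrate what the scale factor in the list above is for, here is a minimal CPU-only sketch. It does not call rocFFT: `scaled_dft` is a hypothetical helper written for this note, and it assumes the transform is otherwise unnormalized, so that a forward transform followed by an inverse transform scaled by 1/N returns the original data.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of rocFFT: a naive O(N^2) DFT that multiplies
// every output element by `scale`, mimicking a scale factor fused into the
// transform. Use direction = -1 for forward and +1 for inverse.
std::vector<std::complex<double>> scaled_dft(const std::vector<std::complex<double>>& in,
                                             int direction,
                                             double scale)
{
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Twiddle factor exp(direction * 2*pi*i * j*k / n).
            const double angle = direction * 2.0 * pi * static_cast<double>(j * k)
                                 / static_cast<double>(n);
            sum += in[j] * std::polar(1.0, angle);
        }
        // The scaling is applied as each output element is produced.
        out[k] = sum * scale;
    }
    return out;
}
```

Calling `scaled_dft(x, -1, 1.0)` and then `scaled_dft(result, +1, 1.0 / x.size())` round-trips the input. In rocFFT the same effect would presumably be requested by setting the factor on the plan description before creating the plan, instead of launching a separate scaling kernel afterwards.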

Changed

  • Moved runtime compilation cache to in-memory by default. A default on-disk cache can encounter contention problems
    on multi-node clusters with a shared filesystem. rocFFT can still be told to use an on-disk cache by setting the
    ROCFFT_RTC_CACHE_PATH environment variable.
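
As a rough sketch of opting back in to an on-disk cache from inside an application, the helper below sets ROCFFT_RTC_CACHE_PATH for the current process. The function name and the example path are hypothetical, and the sketch assumes that the variable holds the path of the cache file and that rocFFT reads it when the library initializes, so the call has to happen before any rocFFT use.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: point rocFFT's runtime-compilation cache at a file,
// ideally on node-local storage to avoid shared-filesystem contention.
// Assumption: must run before rocFFT initializes its cache.
// Returns true if the environment variable was set.
bool use_on_disk_rtc_cache(const std::string& cache_file)
{
#ifdef _WIN32
    return _putenv_s("ROCFFT_RTC_CACHE_PATH", cache_file.c_str()) == 0;
#else
    return setenv("ROCFFT_RTC_CACHE_PATH", cache_file.c_str(), /*overwrite=*/1) == 0;
#endif
}
```

For example, `use_on_disk_rtc_cache("/tmp/rocfft_kernel_cache.db")` would select a per-node file. Exporting the variable in the job script before launching the application achieves the same thing without code changes.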

rocFFT 1.0.18 for ROCm 5.3.3

17 Nov 19:21
11c649a

rocFFT code for ROCm 5.3.3 did not change. The library was rebuilt for the updated ROCm 5.3.3 stack.

rocFFT 1.0.18 for ROCm 5.3.2

10 Nov 01:07
11c649a

rocFFT code for ROCm 5.3.2 did not change. The library was rebuilt for the updated ROCm 5.3.2 stack.

rocFFT 1.0.18 for ROCm 5.3.1

28 Oct 16:59
11c649a

rocFFT code for ROCm 5.3.1 did not change. The library was rebuilt for the updated ROCm 5.3.1 stack.

rocFFT 1.0.18 for ROCm 5.3.0

30 Sep 19:25
11c649a

Changed

  • Runtime compilation cache now looks for environment variables XDG_CACHE_HOME (on Linux) and LOCALAPPDATA (on Windows) before falling back to HOME.
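
The lookup order described above can be sketched as follows. This is an illustration only, not rocFFT's actual code: the function name is hypothetical, and treating an empty variable as unset is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch of the lookup order: prefer the platform's cache
// variable, then fall back to HOME. Returns "" if neither is usable.
std::string default_cache_base_dir()
{
#ifdef _WIN32
    const char* preferred = std::getenv("LOCALAPPDATA");
#else
    const char* preferred = std::getenv("XDG_CACHE_HOME");
#endif
    // Assumption: an empty value is treated the same as an unset variable.
    if (preferred != nullptr && preferred[0] != '\0')
        return preferred;

    const char* home = std::getenv("HOME");
    if (home != nullptr && home[0] != '\0')
        return home;

    return "";
}
```

On Linux, a process with XDG_CACHE_HOME set gets that directory back, and otherwise falls through to HOME.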

Optimizations

  • Optimized 2D R2C/C2R to use 2-kernel plans where possible.
  • Improved performance of the Bluestein algorithm.
  • Optimized the SBCC kernels of length 168 and 100 by using half-LDS.

Fixed

  • Fixed occasional failures to parallelize runtime compilation of kernels.
    Failures would be retried serially and ultimately succeed, but this would take extra time.
  • Fixed failures of some R2C 3D transforms that use the unsupported TILE_UNALIGNED SBRC kernels.
    An example is 98^3 R2C out-of-place.
  • Fixed bugs in SBRC_ERC type.

rocFFT 1.0.17 for ROCm 5.2.3

18 Aug 16:59

rocFFT code for ROCm 5.2.3 did not change. The library was rebuilt for the updated ROCm 5.2.3 stack.

rocFFT 1.0.17 for ROCm 5.2.1

21 Jul 20:24

rocFFT code for ROCm 5.2.1 did not change. The library was rebuilt for the updated ROCm 5.2.1 stack.

rocFFT 1.0.17 for ROCm 5.2.0

28 Jun 18:45

Added

  • Added packages for test and benchmark executables on all supported OSes using CPack.
  • Reorganized the file/folder layout, with backward compatibility provided through rocm-cmake wrapper functions.

Changed

  • Improved reuse of twiddle memory between plans.
  • To improve performance, a default load/store callback is now set
    when only one callback type is set via the API.

Optimizations

  • Introduced a new non-linear LDS access pattern and applied it to
    SBCC kernels of length 64 to improve performance.

Fixed

  • Fixed plan creation failure in cases where SBCC kernels would need to write to non-unit-stride buffers.

rocFFT 1.0.16 for ROCm 5.1.3

20 May 17:05
15ac7c4

rocFFT code for ROCm 5.1.3 did not change. The library was rebuilt for the updated ROCm 5.1.3 stack.