Releases: ROCm/rocFFT

rocFFT-1.0.10 for ROCm 4.1.0

23 Mar 01:18
c3110db

Added

  • Explicitly specify MAX_THREADS_PER_BLOCK through __launch_bounds__ for all kernels.
  • Switch to new syntax for specifying AMD GPU architecture names and features.

Optimizations

  • Optimized C2C/R2C 3D 64, 81, 100, 128, 200, 256 cube sizes.
  • Improved performance of the standalone out-of-place transpose kernel.
  • Optimized 1D length 40000 C2C case.
  • Enabled radix-7 for size 336.
  • Added radix-11 and radix-13 kernels, used in transforms of length 11 and 13 (and some of their multiples).
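
The radix entries above refer to the small factors a transform length is broken into. As a rough illustration only (this is not rocFFT's kernel-selection logic, which these notes do not describe), the sketch below factors a length over a hypothetical radix set. `SUPPORTED_RADICES` and `factor_length` are names invented for this example.

```python
# Hypothetical radix set for illustration only; rocFFT's real kernel
# selection is more involved and is not described in these notes.
SUPPORTED_RADICES = (13, 11, 7, 5, 3, 2)


def factor_length(n, radices=SUPPORTED_RADICES):
    """Split n into factors drawn from radices, trying them in the given order.

    Returns the list of factors, or None if n has a prime factor
    outside the given radix set.
    """
    if n < 1:
        raise ValueError("length must be a positive integer")
    factors = []
    for r in radices:
        # Divide out each radix as many times as it fits.
        while n % r == 0:
            factors.append(r)
            n //= r
    return factors if n == 1 else None


if __name__ == "__main__":
    for length in (336, 11 * 16, 13 * 9, 40000, 17):
        print(length, factor_length(length))
```

Under this toy model, 336 decomposes as 7 x 3 x 2^4, which is why a radix-7 kernel is relevant to that size, and a length such as 17 has no decomposition over the assumed set.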

Changed

  • rocFFT now automatically allocates a work buffer if the plan requires one but none is provided.
  • An explicit rocfft_status_invalid_work_buffer error is now returned when a work buffer of insufficient size is provided.
  • Updated online documentation.
  • Updated the Debian package name to separate the version with '_'.
  • Adjusted accuracy test tolerances and how they are compared.
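
The two work-buffer changes above amount to a small decision rule. The sketch below is a toy model of that rule written for this note, not rocFFT code: `Status`, `resolve_work_buffer`, and the `"auto"`/`"user"`/`"none"` labels are hypothetical, and only the status names are taken from the library.

```python
from enum import Enum


class Status(Enum):
    # Values mirror rocFFT status names; this enum is a stand-in for
    # illustration, not the rocFFT C enum.
    SUCCESS = "rocfft_status_success"
    INVALID_WORK_BUFFER = "rocfft_status_invalid_work_buffer"


def resolve_work_buffer(required_bytes, provided_bytes=None):
    """Model the work-buffer rules described in the release notes.

    required_bytes: work buffer size the plan needs (0 if none).
    provided_bytes: size of the caller's buffer, or None if not supplied.
    Returns (status, source), where source says which buffer is used.
    """
    if required_bytes == 0:
        # The plan needs no work buffer at all.
        return Status.SUCCESS, "none"
    if provided_bytes is None:
        # Required but not provided: the library allocates one itself.
        return Status.SUCCESS, "auto"
    if provided_bytes < required_bytes:
        # Provided but too small: explicit error instead of proceeding.
        return Status.INVALID_WORK_BUFFER, None
    return Status.SUCCESS, "user"


if __name__ == "__main__":
    print(resolve_work_buffer(1024))        # no buffer supplied
    print(resolve_work_buffer(1024, 512))   # undersized buffer
    print(resolve_work_buffer(1024, 2048))  # sufficient buffer
```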

Fixed

  • Fixed 4x4x8192 accuracy failure.

Known Issues

  • None

rocFFT-1.0.8 for ROCm 4.0.0

18 Dec 15:23
2d35fd6

New Features

  • No new features

Known Issues

  • None

rocFFT-1.0.8 for ROCm 3.10.0

30 Nov 17:02
2d35fd6

New Features

  • Added deprecation warning for hipfft.h
  • Optimized case 1D 10000 C2C
  • Fixed SBCC/SBRC non-unit stride batch issue
  • Updated README and added BUILD_CLIENTS_ALL
  • Improved test infrastructure

Known Issues

  • None

rocFFT-1.0.7 for ROCm 3.9.0

27 Oct 20:13
e97ebb4

New Features

  • Optimized 3D C2C cases
  • Optimized large pow-of-2 cases
  • Fused R2C/C2R even length post/pre-process with transpose kernel
  • Enabled 2D single kernel for pow3 and mixed pow2/pow3
  • Added radix-7 optimization
  • Added 1D batch-paired R2C transform
  • Added -mno-xnack -mno-sram-ecc build flags
  • Fixed repo singleton destroy issue
  • Fixed 1D non-unit stride issue
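
For context on the even-length R2C post-process mentioned above: a common way to compute a real-to-complex transform of even length N is to pack the input into N/2 complex values, run one half-length complex transform, and then apply a post-processing pass. The sketch below shows that general technique with a naive DFT standing in for the FFT. It is an assumption that this matches what rocFFT's kernels do internally, the fusion with the transpose kernel is not modeled, and `dft` and `r2c_even` are names invented here.

```python
import cmath


def dft(values):
    """Naive O(n^2) forward DFT, used here in place of a real FFT."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * j * k / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def r2c_even(x):
    """Real-to-complex DFT of even length via one half-length complex DFT.

    Returns the first len(x)//2 + 1 output bins.
    """
    n = len(x)
    if n == 0 or n % 2:
        raise ValueError("length must be even and non-zero")
    m = n // 2
    # Pack even samples into the real part and odd samples into the imaginary part.
    z = [complex(x[2 * j], x[2 * j + 1]) for j in range(m)]
    zf = dft(z)
    out = []
    for k in range(m + 1):
        a = zf[k % m]
        b = zf[(m - k) % m].conjugate()
        even = (a + b) / 2    # DFT of the even-indexed samples
        odd = (a - b) / 2j    # DFT of the odd-indexed samples
        # Post-process: recombine the two halves with a twiddle factor.
        out.append(even + cmath.exp(-2j * cmath.pi * k / n) * odd)
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
    reference = dft(x)[: len(x) // 2 + 1]
    print(max(abs(p - q) for p, q in zip(r2c_even(x), reference)))
```

The C2R direction applies the inverse of this recombination as a pre-process before the half-length complex transform.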

Known Issues

  • None

rocFFT-1.0.6 for ROCm 3.8.0

21 Sep 18:51
0f7e9ba

New Features

  • Optimized 1D C2C sizes: 6561, 10000
  • Optimized small 2D pow-of-2 cases
  • Optimized 2D rectangular cases: 256x264, 256x272, … 256x320, 256x4096, 256x8192
  • Fixed buffer assignment bug in large 1D cases
  • Improved test and sample code

Known Issues

  • None

rocFFT-1.0.5 for ROCm 3.7.0

15 Aug 04:26
35f7181

New Features

  • Optimized 2D C2C middle sizes with fused 2 kernels for pow-of-2
  • Changed package dependency to hip-rocclr
  • Fixed build issue with C++ 17
  • Improved test infrastructure

Known Issues

  • None

rocFFT-1.0.4 for ROCm 3.6.0

11 Jul 00:38
8f98804

New Features

  • Fixed non-unit stride issue of 1D middle size
  • Updated client package installation path
  • Improved internal device memory usage check
  • Improved logging
  • Improved test infrastructure

Known Issues

  • None

rocFFT-1.0.3 for ROCm 3.5.0

01 Jun 19:27
da61945

New Features

  • Switched to hip-clang as the default compiler and deprecated the hcc build
  • Improved hip-nvcc build
  • Added support for static library builds
  • Improved tests

Known Issues

  • None

rocFFT-1.0.3 for ROCm 3.5.0

01 Jun 19:52
da61945

New Features

  • Switched to hip-clang as the default compiler and deprecated the hcc build
  • Improved hip-nvcc build
  • Added support for static library builds
  • Improved tests

Known Issues

  • None

rocFFT-1.0.2 release for ROCM-3.3

30 Mar 16:09
  • Added support for planar (split) format
  • Updated build instructions in README
  • Removed ROCm libs from dependency installation and updated RUNPATH
  • Improved accuracy, rider, sample, and CI tests, and added dyna-rider test