
v2.4.0

@mattmartineau mattmartineau released this 25 Oct 11:58
2b4762f

Changes:

  • Increased the maximum supported CUDA version to 12.2 and added support for HPC SDK 23.7
  • Fixed issue preventing parameters from being updated after config initialisation
  • Restructured all source files (now under src) and removed plugin feature
  • Replaced the custom memory pool with cudaMallocAsync when USE_CUDAMALLOCASYNC is defined
  • Changed the cuSPARSE SpMV algorithm choice to CUSPARSE_CSRMV_ALG1, which should improve solve performance for recent versions of cuSPARSE
  • Added a single-kernel csrmv that is invoked when the total number of rows in the local matrix falls below 3 times the number of SMs on the target GPU
  • Changes to thrust
    - Increased thrust version to 2.1.0
    - Added a specific tested version of thrust as a submodule; please use git clone --recursive to pull AmgX from v2.4.0 onwards
    - Wrapped thrust in a namespace to avoid the shared library issues described at https://github.com/NVIDIA/thrust/releases/tag/1.14.0
    - Removed many superfluous points of synchronisation introduced by thrust
  • Improved performance of writing matrices to file
  • Improved Clang compatibility
  • Added a divergence check, controlled by the new config parameter rel_div_tolerance
  • Added a compile-time definition (DISABLE_EXCEPTION_HANDLING) that disables exception handling, to make debugging easier
  • Fixed multiple synchronisation issues that can show up on newer GPU architectures (sm_70+)
  • Fixed partition reordering for block_sizes > 1
  • Fixed build issue that arose when AmgX is built as a subproject
  • Fixed issue with OpenMP and NO_MPI linking
  • Replaced some inline asm with intrinsics
  • Fixed issue with exact_coarse_solve grid sizing
  • Fixed issue with use_sum_stopping_criteria
  • Fixed SIGFPE that could occur when the initial norm is 0
  • Added a new API call, AMGX_matrix_check_symmetry, which tests whether a matrix is structurally or completely symmetric
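
To illustrate the distinction the last item draws between structural and complete symmetry, here is a minimal host-side sketch. It is not the AmgX implementation and does not call the AmgX API: CsrMatrix, SymmetryResult, check_symmetry and the tol parameter are hypothetical names introduced only for this example, and the sketch assumes a square CSR matrix with no duplicate entries. A matrix is treated as structurally symmetric when every stored entry (i, j) has a stored counterpart (j, i), and as completely symmetric when those paired values also match.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical host-side CSR container for illustration; not an AmgX type.
struct CsrMatrix {
    int n;                       // number of rows (matrix assumed square)
    std::vector<int> row_ptr;    // size n + 1
    std::vector<int> col_idx;    // size nnz
    std::vector<double> values;  // size nnz
};

// Hypothetical result type: pattern symmetry and full (value) symmetry.
struct SymmetryResult {
    bool structurally_symmetric;
    bool symmetric;
};

// Checks whether every stored entry (i, j) has a stored counterpart (j, i),
// and whether the paired values agree to within tol.
SymmetryResult check_symmetry(const CsrMatrix& A, double tol = 0.0) {
    assert(A.row_ptr.size() == static_cast<std::size_t>(A.n) + 1);

    // Index all stored entries by (row, col). Assumes no duplicate entries.
    std::map<std::pair<int, int>, double> entries;
    for (int i = 0; i < A.n; ++i) {
        for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            entries[std::make_pair(i, A.col_idx[k])] = A.values[k];
        }
    }

    SymmetryResult result = {true, true};
    for (const auto& e : entries) {
        const int i = e.first.first;
        const int j = e.first.second;
        const auto mirrored = entries.find(std::make_pair(j, i));
        if (mirrored == entries.end()) {
            // A missing mirrored entry breaks the pattern, and so the values too.
            result.structurally_symmetric = false;
            result.symmetric = false;
            return result;
        }
        if (std::fabs(mirrored->second - e.second) > tol) {
            result.symmetric = false;  // pattern may still be symmetric
        }
    }
    return result;
}
```

Under these assumptions, a matrix with entries at both (0, 1) and (1, 0) but with different values is reported as structurally symmetric and not completely symmetric. The map-based lookup keeps the sketch short; a GPU implementation would be organised quite differently.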

Tested configurations:

Linux x86-64:
-- Ubuntu 20.04, Ubuntu 22.04
-- NVHPC 23.7, GCC 9.4.0, GCC 12.1
-- OpenMPI 4.0.x
-- CUDA 11.2, 11.8, 12.2
-- A100, H100

Note that while AmgX supports building on Windows, testing on Windows is very limited.