Releases: rapidsai/rmm
v24.12.00
🚨 Breaking Changes
🐛 Bug Fixes
- Query total memory in failure_callback_resource_adaptor tests (#1734) @harrism
- Treat deprecation warnings as errors and fix deprecation warnings in replay benchmark (#1728) @harrism
- Disallow cuda-python 12.6.1 and 11.8.4 (#1720) @bdice
- Fix typos in .gitignore (#1697) @charlesbluca
- Fix `rmm._lib` imports (#1693) @Matt711
📖 Documentation
🚀 New Features
- Correct rmm tests for validity of device pointers (#1714) @robertmaynard
- Update rmm tests to use rapids_cmake_support_conda_env (#1707) @robertmaynard
- adding telemetry (#1692) @msarahan
- Make `cudaMallocAsync` logic non-optional as we require CUDA 11.2+ (#1667) @robertmaynard
🛠️ Improvements
- enforce wheel size limits, README formatting in CI (#1726) @jameslamb
- Remove all explicit usage of fmtlib (#1724) @harrism
- WIP: put a ceiling on cuda-python (#1723) @jameslamb
- use rapids-generate-pip-constraints to pin to oldest dependencies in CI (#1716) @jameslamb
- Deprecate `rmm._lib` (#1713) @Matt711
- print sccache stats in builds (#1712) @jameslamb
- [fea] Expose the arena mr to the Python interface. (#1711) @trivialfis
- devcontainer: replace `VAULT_HOST` with `AWS_ROLE_ARN` (#1708) @jjacobelli
- make conda installs in CI stricter (part 2) (#1703) @jameslamb
- Add BUILD_SHARED_LIBS option defaulting to ON (#1702) @wence-
- make conda installs in CI stricter (#1696) @jameslamb
- Prune workflows based on changed files (#1695) @KyleFromNVIDIA
- Deprecate support for directly accessing logger (#1690) @vyasr
- Use `rmm::percent_of_free_device_memory` in arena test (#1689) @wence-
- exclude 'gcovr' from list of development pip packages (#1688) @jameslamb
- [Improvement] Reorganize Cython to separate C++ bindings and make Cython classes public (#1676) @Matt711
[NIGHTLY] v25.02.00
🚨 Breaking Changes
- Switch to using separate rapids-logger repo (#1774) @vyasr
- Remove deprecated factory functions from resource adaptors. (#1767) @bdice
- Remove `rmm._lib` (#1765) @Matt711
- Remove memory access flags from cuda_async_memory_resource (#1754) @abellina
- Create logger wrapper around spdlog that can be easily reused in other libraries (#1722) @vyasr
🐛 Bug Fixes
- Add missing array header include (#1771) @robertmaynard
- Remove memory access flags from cuda_async_memory_resource (#1754) @abellina
- Update build.sh (#1749) @vyasr
- Fix some logger issues (#1739) @vyasr
- Use consistent signature for target_link_libraries (#1738) @vyasr
📖 Documentation
🚀 New Features
- Remove deprecated factory functions from resource adaptors. (#1767) @bdice
- Remove `rmm._lib` (#1765) @Matt711
- Reduce dependencies on numba. (#1761) @bdice
- Use ruff, remove isort and black. (#1759) @bdice
- Use bindings layout for all cuda-python imports. (#1756) @bdice
- Add configuration for pre-commit.ci, update pre-commit hooks (#1746) @bdice
- Adds fabric handle and memory protection flags to cuda_async_memory_resource (#1743) @abellina
- Remove upper bounds on cuda-python to allow 12.6.2 and 11.8.5 (#1729) @bdice
🛠️ Improvements
- [pre-commit.ci] pre-commit autoupdate (#1778) @pre-commit-ci[bot]
- Use rapids-cmake for the logger (#1776) @vyasr
- Switch to using separate rapids-logger repo (#1774) @vyasr
- Check if nightlies have succeeded recently enough (#1772) @vyasr
- Fix codespell behavior. (#1769) @bdice
- Remove ignored cuda-python deprecation warning. (#1768) @bdice
- Forward-merge branch-24.12 to branch-25.02 (#1766) @bdice
- Update version references in workflow (#1757) @AyodeAwe
- gate telemetry dispatch calls on TELEMETRY_ENABLED env var (#1752) @msarahan
- Update cuda-python lower bounds to 12.6.2 / 11.8.5 (#1751) @bdice
- remove certs and simplify telemetry summarize (#1750) @msarahan
- stop installing 'wheel' in wheel-building script (#1748) @jameslamb
- Require approval to run CI on draft PRs (#1737) @bdice
- Create logger wrapper around spdlog that can be easily reused in other libraries (#1722) @vyasr
- Add breaking change workflow trigger (#1719) @AyodeAwe
v24.10.00
🚨 Breaking Changes
- Inline functions that return static references must have default visibility (#1653) @wence-
- Hide visibility of non-public symbols (#1644) @jameslamb
- Deprecate adaptor factories. (#1626) @bdice
🐛 Bug Fixes
- Add missing include to `resource_ref.hpp` (#1677) @miscco
- Remove the friend declaration with an attribute (#1669) @kingcrimsontianyu
- Fix `build.sh clean` to delete python build directory (#1658) @rongou
- Stream synchronize before deallocating SAM (#1655) @rongou
- Explicitly mark RMM headers with `RMM_EXPORT` (#1654) @robertmaynard
- Inline functions that return static references must have default visibility (#1653) @wence-
- Use `tool.scikit-build.cmake.version` (#1637) @KyleFromNVIDIA
📖 Documentation
- Recommend `miniforge` for conda install. (#1681) @bdice
- Fix docs cross reference in DeviceBuffer.prefetch (#1636) @bdice
🚀 New Features
- [FEA] Allow setting `*_pool_size` with human-readable string (#1670) @Matt711
- Update RMM adaptors, containers and tests to use get/set_current_device_resource_ref() (#1661) @harrism
- Deprecate adaptor factories. (#1626) @bdice
- Allow testing of earliest/latest dependencies (#1613) @seberg
- Add resource_ref versions of get/set_current_device_resource (#1598) @harrism
🛠️ Improvements
- Update update-version.sh to use packaging lib (#1685) @AyodeAwe
- Use CI workflow branch 'branch-24.10' again (#1683) @jameslamb
- Update fmt (to 11.0.2) and spdlog (to 1.14.1). (#1678) @jameslamb
- Attempt to address oom failures in test suite (#1672) @wence-
- Add support for Python 3.12 (#1666) @jameslamb
- Update rapidsai/pre-commit-hooks (#1663) @KyleFromNVIDIA
- Drop Python 3.9 support (#1659) @jameslamb
- Remove NumPy <2 pin (#1650) @seberg
- Hide visibility of non-public symbols (#1644) @jameslamb
- Update pre-commit hooks (#1643) @KyleFromNVIDIA
- Improve update-version.sh (#1640) @bdice
- Install headers into `${CMAKE_INSTALL_INCLUDEDIR}` (#1633) @KyleFromNVIDIA
- Merge branch-24.08 into branch-24.10 (#1631) @jameslamb
v24.08.00
🚨 Breaking Changes
🐛 Bug Fixes
- Rename `.devcontainer`s for CUDA 12.5 (#1615) @jakirkham
- Avoid accessing statistics_resource_adaptor stack top if it is empty (#1588) @harrism
- Avoid `--find-links`. (#1583) @bdice
- Fix test_python matrix (#1579) @KyleFromNVIDIA
- Allow anonymous user in devcontainer name (#1576) @bdice
📖 Documentation
- Instruct to create associated issue in PR template. (#1624) @harrism
- add rapids-build-backend to docs (#1614) @jameslamb
- Revert "Remove HTML builds of librmm (#1415)" (#1604) @bdice
- Add documentation for CPM usage (#1600) @pauleonix
- Update Thrust CMake Guide link in README.md (#1593) @pauleonix
🚀 New Features
- Prefetch resource adaptor (#1608) @bdice
- Add python wrapper for system memory resource (#1605) @rongou
- Refactor mr_ref_tests to not depend on MR base classes (#1589) @harrism
- Add system memory resource (#1581) @rongou
- Add rmm::prefetch() and DeviceBuffer.prefetch() (#1573) @harrism
🛠️ Improvements
- split up CUDA-suffixed dependencies in dependencies.yaml (#1627) @jameslamb
- Remove prefetch factory. (#1625) @bdice
- Use workflow branch 24.08 again (#1617) @KyleFromNVIDIA
- Build and test with CUDA 12.5.1 (#1607) @KyleFromNVIDIA
- skip CMake 3.30.0 (#1603) @jameslamb
- Add RMM_USE_NVTX cmake option to provide localized control of NVTX for RMM (#1602) @jlowe
- Use verify-alpha-spec hook (#1601) @KyleFromNVIDIA
- Avoid --find-links in wheel jobs (#1586) @jameslamb
- resolve dependency-file-generator warning, remove unnecessary rapids-build-backend configuration (#1582) @jameslamb
- Remove THRUST_WRAPPED_NAMESPACE and tests (#1578) @harrism
- Remove text builds of documentation (#1575) @vyasr
- ensure update-version.sh preserves alpha specs (#1572) @jameslamb
- Add `available_device_memory` to fetch free amount of memory on a GPU (#1567) @galipremsagar
- Add a stack to the statistics resource (#1563) @madsbk
- Use rapids-build-backend. (#1502) @bdice
v24.06.00
🚨 Breaking Changes
- Refactor polymorphic allocator to use device_async_resource_ref (#1555) @harrism
- Remove deprecated functionality (#1537) @harrism
- Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
- Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism
🐛 Bug Fixes
- rmm needs to link to nvtx3::nvtx3-cpp to support installed nvtx3 (#1569) @robertmaynard
- Make sure rmm wheel dependency on librmm is updated [skip ci] (#1565) @raydouglass
- Don't ignore GCC-specific warning under Clang (#1557) @aaronmondal
- Add publish jobs for C++ wheels (#1554) @vyasr
- Explicitly use the current device resource in DeviceBuffer (#1514) @wence-
📖 Documentation
- Allow specifying mr in DeviceBuffer construction, and document ownership requirements in Python/C++ interfacing (#1552) @wence-
- Fix Python install instruction (#1547) @wence-
- Update multi-gpu discussion for device_buffer and device_vector dtors (#1524) @wence-
- Fix ordering / heading levels in README.md and python example in guide.md (#1513) @harrism
🚀 New Features
- Add NVTX support and RMM_FUNC_RANGE() macro (#1558) @harrism
- Always use a static gtest (#1532) @robertmaynard
- Build C++ wheel (#1529) @vyasr
- Remove deprecated supports_streams and get_mem_info methods. (#1519) @harrism
🛠️ Improvements
- update copyright dates (#1564) @jameslamb
- Overhaul ops-codeowners (#1561) @raydouglass
- Adding support for cupy.cuda.stream.ExternalStream (#1559) @lilohuang
- Refactor polymorphic allocator to use device_async_resource_ref (#1555) @harrism
- add RAPIDS copyright pre-commit hook (#1553) @jameslamb
- Enable warnings as errors for Python tests (#1551) @mroeschke
- Remove header existence tests. (#1550) @bdice
- Only use functions in the limited API (#1545) @vyasr
- Migrate to `{{ stdlib("c") }}` (#1543) @hcho3
- Fix `cuda11.8` nvcc dependency (#1542) @trxcllnt
- add --rm and --name to devcontainer run args (#1539) @trxcllnt
- Remove deprecated functionality (#1537) @harrism
- Remove deprecated cuda_async_memory_resource constructor that takes thrust::optional parameters (#1535) @harrism
- Make thrust_allocator deallocate safe in multi-device setting (#1533) @wence-
- Move rmm Python package to subdirectory (#1526) @vyasr
- Remove a file not being used (#1521) @galipremsagar
- Remove unneeded `update-version.sh` update (#1520) @AyodeAwe
- Enable all tests for `arm` arch (#1510) @galipremsagar
v24.04.00
🚨 Breaking Changes
- Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
- Replace all internal usage of `get_upstream` with `get_upstream_resource` (#1491) @miscco
- Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
- Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
- Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
- Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
🐛 Bug Fixes
- Fix search path for torch allocator in editable installs and ensure CUDA support is available (#1498) @vyasr
- Accept stream argument in DeviceMemoryResource allocate/deallocate (#1494) @wence-
- Run STATISTICS_TEST and TRACKING_TEST in serial to avoid OOM errors. (#1487) @bdice
📖 Documentation
🚀 New Features
- Replace all internal usage of `get_upstream` with `get_upstream_resource` (#1491) @miscco
- Add complete set of resource ref aliases (#1479) @nvdbaranec
- Automate include grouping using clang-format (#1463) @harrism
- Add `get_upstream_resource` to resource adaptors (#1456) @miscco
- Deprecate rmm::mr::device_memory_resource::supports_streams() (#1452) @harrism
- Remove duplicated memory_resource_tests (#1451) @miscco
- Change `rmm::exec_policy` to take `async_resource_ref` (#1449) @miscco
- Change `device_scalar` to take `async_resource_ref` (#1447) @miscco
- Add device_async_resource_ref convenience alias (#1441) @harrism
- Remove deprecated rmm::detail::available_device_memory (#1438) @harrism
- Make device_memory_resource::supports_streams() not pure virtual. Remove derived implementations and calls in RMM (#1437) @harrism
- Deprecate rmm::mr::device_memory_resource::get_mem_info() and supports_get_mem_info(). (#1436) @harrism
- Support CUDA 12.2 (#1419) @jameslamb
🛠️ Improvements
- Use `conda env create --yes` instead of `--force` (#1509) @bdice
- Add upper bound to prevent usage of NumPy 2 (#1501) @bdice
- Remove hard-coding of RAPIDS version where possible (#1496) @KyleFromNVIDIA
- Require NumPy 1.23+ (#1488) @jakirkham
- Use `rmm::device_async_resource_ref` in multi_stream_allocation benchmark (#1482) @miscco
- Update devcontainers to CUDA Toolkit 12.2 (#1470) @trxcllnt
- Add support for Python 3.11 (#1469) @jameslamb
- target branch-24.04 for GitHub Actions workflows (#1468) @jameslamb
- [FEA]: Use `std::optional` instead of `thrust::optional` (#1464) @miscco
- Add environment-agnostic scripts for running ctests and pytests (#1462) @trxcllnt
- Ensure that `ctest` is called with `--no-tests=error`. (#1460) @bdice
- Update ops-bot.yaml (#1458) @AyodeAwe
- Adopt the `rmm::device_async_resource_ref` alias (#1454) @miscco
- Refactor error.hpp out of detail (#1439) @lamarrr
v24.02.00
🚨 Breaking Changes
- Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
- Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
- Require explicit pool size in `pool_memory_resource` and move some things out of detail namespace (#1417) @harrism
- Remove HTML builds of librmm (#1415) @vyasr
- Update to CCCL 2.2.0. (#1404) @bdice
- Switch to scikit-build-core (#1287) @vyasr
🐛 Bug Fixes
- Exclude tests from builds (#1459) @vyasr
- Update CODEOWNERS (#1410) @raydouglass
- Correct signatures for torch allocator plug in (#1407) @wence-
- Fix Arena MR to support simultaneous access by PTDS and other streams (#1395) @tgravescs
- Fix else-after-throw clang tidy error (#1391) @harrism
📖 Documentation
- remove references to setup.py in docs (#1420) @jameslamb
- Remove HTML builds of librmm (#1415) @vyasr
- Update GPU support docs to drop Pascal (#1413) @harrism
🚀 New Features
- Make device_memory_resource::do_get_mem_info() and supports_get_mem_info() not pure virtual. Remove derived implementations and calls in RMM (#1430) @harrism
- Deprecate detail::available_device_memory, most detail/aligned.hpp utilities, and optional pool_memory_resource initial size (#1424) @harrism
- Add a host-pinned memory resource that can be used as upstream for `pool_memory_resource`. (#1392) @harrism
🛠️ Improvements
- Remove usages of rapids-env-update (#1423) @KyleFromNVIDIA
- Refactor CUDA versions in dependencies.yaml. (#1422) @bdice
- Require explicit pool size in `pool_memory_resource` and move some things out of detail namespace (#1417) @harrism
- Update dependencies.yaml to support CUDA 12.*. (#1414) @bdice
- Define python dependency range as a matrix fallback. (#1409) @bdice
- Use latest cuda-python within CUDA major version. (#1406) @bdice
- Update to CCCL 2.2.0. (#1404) @bdice
- Remove RMM_BUILD_WHEELS and standardize Python builds (#1401) @vyasr
- Update to fmt 10.1.1 and spdlog 1.12.0. (#1374) @bdice
- Switch to scikit-build-core (#1287) @vyasr
v23.12.00
🚨 Breaking Changes
- Document minimum CUDA version of 11.4 (#1385) @harrism
- Store and set the correct CUDA device in device_buffer (#1370) @harrism
- Use `cuda::mr::memory_resource` instead of raw `device_memory_resource` (#1095) @miscco
🐛 Bug Fixes
- Update actions/labeler to v4 (#1397) @raydouglass
- Backport arena MR fix for simultaneous access by PTDS and other streams (#1396) @bdice
- Deliberately leak PTDS thread_local events in stream ordered mr (#1375) @wence-
- Add missing CUDA 12 dependencies and fix dlopen library names (#1366) @vyasr
📖 Documentation
- Document minimum CUDA version of 11.4 (#1385) @harrism
- Fix more doxygen issues (#1367) @vyasr
- Add groups to the doxygen docs (#1358) @vyasr
- Enable doxygen XML and fix issues (#1348) @vyasr
🚀 New Features
- Make internally stored default argument values public (#1373) @vyasr
- Store and set the correct CUDA device in device_buffer (#1370) @harrism
- Update rapids-cmake functions to non-deprecated signatures (#1357) @robertmaynard
- Generate unified Python/C++ docs (#1324) @vyasr
- Use `cuda::mr::memory_resource` instead of raw `device_memory_resource` (#1095) @miscco
🛠️ Improvements
- Silence false gcc warning (#1381) @miscco
- Build concurrency for nightly and merge triggers (#1380) @bdice
- Update `shared-action-workflows` references (#1363) @AyodeAwe
- Use branch-23.12 workflows. (#1360) @bdice
- Update devcontainers to 23.12 (#1355) @raydouglass
- Generate proper, consistent nightly versions for pip and conda packages (#1347) @vyasr
- RMM: Build CUDA 12.0 ARM conda packages. (#1330) @bdice
v23.10.00
🚨 Breaking Changes
🐛 Bug Fixes
- Compile cdef public functions from torch_allocator with C ABI (#1350) @wence-
- Make doxygen only a conda dependency. (#1344) @bdice
- Use `conda mambabuild` not `mamba mambabuild` (#1338) @wence-
- Fix stream_ordered_memory_resource attempt to record event in stream from another device (#1333) @harrism
📖 Documentation
- Clean up headers in CMakeLists.txt. (#1341) @bdice
- Add pre-commit hook to validate doxygen (#1334) @vyasr
- Fix doxygen warnings (#1317) @vyasr
- Treat warnings as errors in Python documentation (#1316) @vyasr
🚀 New Features
🛠️ Improvements
- Update image names (#1346) @AyodeAwe
- Update to clang 16.0.6. (#1343) @bdice
- Update doxygen to 1.9.1 (#1337) @vyasr
- Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#1335) @divyegala
- Use `copy-pr-bot` (#1329) @ajschmidt8
- Add RMM devcontainers (#1328) @trxcllnt
- Add Python bindings for `limiting_resource_adaptor` (#1327) @pentschev
- Fix missing jQuery error in docs (#1321) @AyodeAwe
- Use fetch_rapids.cmake. (#1319) @bdice
- Update to Cython 3.0.0 (#1313) @vyasr
- Branch 23.10 merge 23.08 (#1312) @vyasr
- Branch 23.10 merge 23.08 (#1309) @vyasr
v23.08.00
🚨 Breaking Changes
- Stop invoking setup.py (#1300) @vyasr
- Remove now-deprecated top-level allocator functions (#1281) @wence-
- Remove padding from device_memory_resource (#1278) @vyasr
🐛 Bug Fixes
- Fix typo in wheels-test.yaml. (#1310) @bdice
- Add a missing '#include <array>' in logger.hpp (#1295) @valgur
- Use gbench `thread_index()` accessor to fix replay bench compilation (#1293) @harrism
- Ensure logger tests don't generate temp directories in build dir (#1289) @robertmaynard
🚀 New Features
🛠️ Improvements
- Switch to new CI wheel building pipeline (#1305) @vyasr
- Revert CUDA 12.0 CI workflows to branch-23.08. (#1303) @bdice
- Update linters: remove flake8, add ruff, update cython-lint (#1302) @vyasr
- Adding identify minimum version requirement (#1301) @hyperbolic2346
- Stop invoking setup.py (#1300) @vyasr
- Use cuda-version to constrain cudatoolkit. (#1296) @bdice
- Update to CMake 3.26.4 (#1291) @vyasr
- use rapids-upload-docs script (#1288) @AyodeAwe
- Reorder parameters in RMM_EXPECTS (#1286) @vyasr
- Remove documentation build scripts for Jenkins (#1285) @ajschmidt8
- Remove padding from device_memory_resource (#1278) @vyasr
- Unpin scikit-build upper bound (#1275) @vyasr
- RMM: Build CUDA 12 packages (#1223) @bdice