[on hold] Replace cub::Traits by numeric_limits and deprecate it #3384

Draft
wants to merge 1 commit into
base: main
from

Conversation

bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented Jan 14, 2025

Fixes: #3381

@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Jan 14, 2025
@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber
Contributor Author

bernhardmgruber commented Jan 14, 2025

@miscco I would love to deprecate cub::Traits in favor of standard facilities in libcu++. As it currently stands, we would still need:

  • support for FP16, BF16 and FP8 types by cuda::std::is_floating_point
  • support for FP16, BF16 and FP8 types by cuda::std::numeric_limits (only min and lowest)

Do you think it's possible we can have this support soonish?
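To make the request above concrete, here is a minimal host-only sketch of the kind of generic code that could rely on standard facilities instead of a library-specific trait. It uses `std::numeric_limits` and `std::is_floating_point` as stand-ins, on the assumption that the `cuda::std` counterparts expose the same interface. The helper names `max_reduction_identity` and `max_of` are hypothetical and not part of CUB or libcu++.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical helper (not part of CUB): identity element for a max-reduction.
// numeric_limits<T>::lowest() supplies the value a library-specific trait
// would otherwise have to provide.
template <typename T>
constexpr T max_reduction_identity()
{
  static_assert(std::numeric_limits<T>::is_specialized,
                "numeric_limits must be specialized for T");
  return std::numeric_limits<T>::lowest();
}

// Hypothetical helper: maximum of an array, seeded with the identity above.
template <typename T>
T max_of(const T* data, std::size_t n)
{
  assert(data != nullptr || n == 0);
  T result = max_reduction_identity<T>();
  for (std::size_t i = 0; i < n; ++i)
  {
    result = data[i] > result ? data[i] : result;
  }
  return result;
}

// For floating-point types, min() is the smallest positive normal value,
// whereas lowest() is the most negative finite value.
static_assert(std::is_floating_point<float>::value, "float is a floating-point type");
static_assert(std::numeric_limits<float>::min() > 0.0f, "min() is positive for float");
static_assert(std::numeric_limits<float>::lowest() == -std::numeric_limits<float>::max(),
              "lowest() is the negated max() for float");
// For integer types, min() and lowest() agree.
static_assert(std::numeric_limits<int>::lowest() == std::numeric_limits<int>::min(),
              "min() and lowest() coincide for int");
```

The same pattern only works for FP16, BF16 and FP8 once `numeric_limits` and `is_floating_point` recognize those types, which is exactly the missing support listed above.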

@bernhardmgruber bernhardmgruber force-pushed the depr_cub_traits branch 7 times, most recently from cdf13ed to ac81fd5 Compare January 22, 2025 15:50

copy-pr-bot bot commented Jan 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber bernhardmgruber changed the title Deprecate cub::Traits Replace cub::Traits by numeric_limits and deprecate it Jan 22, 2025
@bernhardmgruber bernhardmgruber changed the title Replace cub::Traits by numeric_limits and deprecate it Replace cub::Traits by numeric_limits and deprecate it Jan 22, 2025
@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber bernhardmgruber marked this pull request as ready for review January 22, 2025 19:28
@bernhardmgruber bernhardmgruber requested review from a team as code owners January 22, 2025 19:28
Contributor

🟨 CI finished in 4h 49m: Pass: 91%/78 | Total: 2d 06h | Avg: 41m 37s | Max: 1h 14m | Hits: 183%/11826
  • 🟨 cub: Pass: 81%/38 | Total: 1d 08h | Avg: 51m 44s | Max: 1h 14m | Hits: 81%/2646

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  80%/36  | Total:  1d 06h | Avg: 50m 51s | Max:  1h 14m | Hits:  81%/2646  
      🟩 arm64              Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🔍 nvcc               Pass:  80%/36  | Total:  1d 06h | Avg: 51m 16s | Max:  1h 14m | Hits:  81%/2646  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 45m 20s | Avg: 22m 40s | Max: 25m 54s
      🔍 v100               Pass:  80%/36  | Total:  1d 08h | Avg: 53m 21s | Max:  1h 14m | Hits:  81%/2646  
    🟨 ctk
      🟥 12.0               Pass:   0%/5   | Total:  3h 18m | Avg: 39m 43s | Max:  1h 01m
      🟩 12.5               Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
      🟨 12.6               Pass:  93%/31  | Total:  1d 03h | Avg: 52m 15s | Max:  1h 11m | Hits:  81%/2646  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟥 nvcc12.0           Pass:   0%/5   | Total:  3h 18m | Avg: 39m 43s | Max:  1h 01m
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
      🟨 nvcc12.6           Pass:  93%/29  | Total:  1d 00h | Avg: 51m 42s | Max:  1h 11m | Hits:  81%/2646  
    🟨 cxx
      🟨 Clang14            Pass:  50%/4   | Total:  3h 10m | Avg: 47m 32s | Max:  1h 02m
      🟩 Clang15            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang16            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang17            Pass: 100%/1   | Total: 59m 06s | Avg: 59m 06s | Max: 59m 06s
      🟨 Clang18            Pass:  85%/7   | Total:  6h 20m | Avg: 54m 19s | Max:  1h 09m
      🟨 GCC7               Pass:  50%/2   | Total:  1h 35m | Avg: 47m 52s | Max:  1h 02m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟨 GCC9               Pass:  50%/2   | Total:  1h 35m | Avg: 47m 35s | Max:  1h 01m
      🟩 GCC10              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
      🟩 GCC11              Pass: 100%/1   | Total: 56m 34s | Avg: 56m 34s | Max: 56m 34s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 47m | Avg: 35m 52s | Max:  1h 02m
      🟨 GCC13              Pass:  87%/8   | Total:  5h 12m | Avg: 39m 06s | Max:  1h 06m
      🟨 MSVC14.29          Pass:  50%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 10m | Hits:  84%/882   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 11m | Hits:  80%/1764  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
    🟨 cxx_family
      🟨 Clang              Pass:  78%/14  | Total: 12h 33m | Avg: 53m 50s | Max:  1h 09m
      🟨 GCC                Pass:  83%/18  | Total: 13h 14m | Avg: 44m 07s | Max:  1h 06m
      🟨 MSVC               Pass:  75%/4   | Total:  4h 30m | Avg:  1h 07m | Max:  1h 11m | Hits:  81%/2646  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
    🟨 jobs
      🟨 Build              Pass:  83%/31  | Total:  1d 05h | Avg: 57m 03s | Max:  1h 14m | Hits:  81%/2646  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 29m 04s | Avg: 29m 04s | Max: 29m 04s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 49s | Avg: 17m 49s | Max: 17m 49s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 18m | Avg: 26m 15s | Max: 30m 00s
      🟥 TestGPU            Pass:   0%/2   | Total:  1h 12m | Avg: 36m 02s | Max: 45m 58s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 20s | Avg: 22m 40s | Max: 25m 54s
      🟩 90a                Pass: 100%/1   | Total: 26m 26s | Avg: 26m 26s | Max: 26m 26s
    🟨 std
      🟨 17                 Pass:  71%/14  | Total: 13h 18m | Avg: 57m 02s | Max:  1h 13m | Hits:  84%/1764  
      🟨 20                 Pass:  87%/24  | Total: 19h 27m | Avg: 48m 39s | Max:  1h 14m | Hits:  77%/882   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 20h 24m | Avg: 33m 05s | Max: 1h 03m | Hits: 212%/9180

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 31s | Avg: 20m 45s | Max: 27m 07s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total: 19h 24m | Avg: 33m 16s | Max:  1h 03m | Hits: 212%/9180  
      🟩 arm64              Pass: 100%/2   | Total: 59m 22s | Avg: 29m 41s | Max: 31m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 03m | Avg: 36m 46s | Max: 53m 41s | Hits: 173%/1836  
      🟩 12.5               Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
      🟩 12.6               Pass: 100%/30  | Total: 15h 23m | Avg: 30m 47s | Max:  1h 03m | Hits: 221%/7344  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 52m 56s | Avg: 26m 28s | Max: 26m 57s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 03m | Avg: 36m 46s | Max: 53m 41s | Hits: 173%/1836  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
      🟩 nvcc12.6           Pass: 100%/28  | Total: 14h 30m | Avg: 31m 05s | Max:  1h 03m | Hits: 221%/7344  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 52m 56s | Avg: 26m 28s | Max: 26m 57s
      🟩 nvcc               Pass: 100%/35  | Total: 19h 31m | Avg: 33m 28s | Max:  1h 03m | Hits: 212%/9180  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 07m | Avg: 31m 55s | Max: 33m 27s
      🟩 Clang15            Pass: 100%/1   | Total: 32m 16s | Avg: 32m 16s | Max: 32m 16s
      🟩 Clang16            Pass: 100%/1   | Total: 31m 34s | Avg: 31m 34s | Max: 31m 34s
      🟩 Clang17            Pass: 100%/1   | Total: 29m 57s | Avg: 29m 57s | Max: 29m 57s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 40m | Avg: 22m 52s | Max: 30m 12s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 07s | Max: 32m 36s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 55s | Avg: 32m 55s | Max: 32m 55s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 19s | Max: 35m 45s
      🟩 GCC10              Pass: 100%/1   | Total: 35m 15s | Avg: 35m 15s | Max: 35m 15s
      🟩 GCC11              Pass: 100%/1   | Total: 36m 31s | Avg: 36m 31s | Max: 36m 31s
      🟩 GCC12              Pass: 100%/1   | Total: 36m 03s | Avg: 36m 03s | Max: 36m 03s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 01m | Avg: 22m 42s | Max: 38m 59s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 54s | Max: 54m 08s | Hits: 173%/3672  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 42m | Avg: 54m 17s | Max:  1h 03m | Hits: 237%/5508  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  6h 21m | Avg: 27m 15s | Max: 33m 27s
      🟩 GCC                Pass: 100%/16  | Total:  7h 35m | Avg: 28m 27s | Max: 38m 59s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 30m | Avg: 54m 08s | Max:  1h 03m | Hits: 212%/9180  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 20h 24m | Avg: 33m 05s | Max:  1h 03m | Hits: 212%/9180  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total: 18h 54m | Avg: 36m 36s | Max:  1h 03m | Hits: 173%/7344  
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 57s | Avg: 16m 59s | Max: 35m 42s | Hits: 365%/1836  
      🟩 TestGPU            Pass: 100%/3   | Total: 38m 20s | Avg: 12m 46s | Max: 14m 24s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 18m 32s | Avg: 18m 32s | Max: 18m 32s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  9h 05m | Avg: 38m 59s | Max:  1h 03m | Hits: 173%/5508  
      🟩 20                 Pass: 100%/21  | Total: 10h 36m | Avg: 30m 19s | Max:  1h 03m | Hits: 269%/3672  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 40s | Avg: 4m 50s | Max: 7m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s
      🟩 Test               Pass: 100%/1   | Total:  7m 28s | Avg:  7m 28s | Max:  7m 28s
    
  • 🟩 python: Pass: 100%/1 | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
+/- Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber
Contributor Author

It increasingly seems that replacing cub::Traits will break a lot of behavior in CUB, since users would need to move over to using and specializing numeric_limits. We should probably split this PR into the pure deprecation, which we backport to 2.8, and the replacement, which should target 3.0.
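As a rough illustration of what that migration could mean for users, the sketch below specializes `std::numeric_limits` for a user-defined type and marks a legacy trait as deprecated. Everything here is hypothetical: `Celsius`, `LegacyTraits`, and `check_celsius_limits` are made-up names, `std::numeric_limits` stands in for `cuda::std::numeric_limits`, and only a few members of the specialization are shown.

```cpp
#include <cassert>
#include <limits>

// Hypothetical user-defined type, used only for illustration.
struct Celsius
{
  float value;
};

// The C++ standard permits specializing std::numeric_limits for
// program-defined types. Only a subset of members is defined here;
// a complete specialization would provide all of them.
namespace std
{
template <>
class numeric_limits<Celsius>
{
public:
  static constexpr bool is_specialized = true;
  static constexpr Celsius min() noexcept { return Celsius{std::numeric_limits<float>::min()}; }
  static constexpr Celsius max() noexcept { return Celsius{std::numeric_limits<float>::max()}; }
  static constexpr Celsius lowest() noexcept { return Celsius{std::numeric_limits<float>::lowest()}; }
};
} // namespace std

// Hypothetical legacy trait (not the real cub::Traits). The attribute makes
// compilers warn at every use while existing code keeps compiling, which is
// the "pure deprecation" half of the split.
template <typename T>
struct [[deprecated("use numeric_limits<T> instead")]] LegacyTraits
{
  static constexpr T Lowest() { return std::numeric_limits<T>::lowest(); }
  static constexpr T Max() { return std::numeric_limits<T>::max(); }
};

// Runtime sanity check for the specialization: lowest() must not exceed max().
inline void check_celsius_limits()
{
  assert(std::numeric_limits<Celsius>::lowest().value
         <= std::numeric_limits<Celsius>::max().value);
}
```

A deprecation attribute alone does not change behavior, which is why it is a plausible candidate for a backport, whereas requiring users to write a specialization like the one above is a breaking change.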

@bernhardmgruber bernhardmgruber force-pushed the depr_cub_traits branch 2 times, most recently from cc83a5c to 3b27583 Compare January 26, 2025 19:59
@bernhardmgruber bernhardmgruber marked this pull request as draft February 4, 2025 23:39

copy-pr-bot bot commented Feb 4, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bernhardmgruber
Contributor Author

/ok to test

@miscco miscco self-assigned this Feb 5, 2025
@miscco
Collaborator

miscco commented Feb 5, 2025

/ok to test

@miscco miscco marked this pull request as ready for review February 5, 2025 08:07
@miscco
Collaborator

miscco commented Feb 5, 2025

Huge shout-out to @davebayer for implementing the extended floating-point support for numeric_limits.

Contributor

github-actions bot commented Feb 5, 2025

🟨 CI finished in 1h 58m: Pass: 98%/151 | Total: 3d 20h | Avg: 36m 33s | Max: 1h 29m | Hits: 221%/24193
  • 🟨 cub: Pass: 95%/44 | Total: 1d 18h | Avg: 57m 44s | Max: 1h 29m | Hits: 30%/4168

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/42  | Total:  1d 16h | Avg: 57m 19s | Max:  1h 29m | Hits:  30%/4168  
      🟩 arm64              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 08m | Hits:  30%/1042  
      🟩 12.5               Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
      🔍 12.8               Pass:  94%/37  | Total:  1d 10h | Avg: 56m 12s | Max:  1h 29m | Hits:  30%/3126  
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 08m | Hits:  30%/1042  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
      🔍 nvcc12.8           Pass:  94%/35  | Total:  1d 08h | Avg: 55m 47s | Max:  1h 29m | Hits:  30%/3126  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🔍 nvcc               Pass:  95%/42  | Total:  1d 16h | Avg: 57m 27s | Max:  1h 29m | Hits:  30%/4168  
    🔍 gpu: rtxa6000 🔍
      🟩 h100               Pass: 100%/2   | Total: 55m 58s | Avg: 27m 59s | Max: 30m 47s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 13h | Avg:  1h 05m | Max:  1h 29m | Hits:  30%/4168  
      🔍 rtxa6000           Pass:  75%/8   | Total:  4h 22m | Avg: 32m 49s | Max:  1h 06m
    🚨 jobs: TestGPU 🚨
      🟩 Build              Pass: 100%/37  | Total:  1d 15h | Avg:  1h 04m | Max:  1h 29m | Hits:  30%/4168  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 18s | Avg: 20m 18s | Max: 20m 18s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 33s | Avg: 16m 33s | Max: 16m 33s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 14m | Avg: 24m 54s | Max: 25m 11s
      🔥 TestGPU            Pass:   0%/2   | Total: 44m 45s | Avg: 22m 22s | Max: 22m 31s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 21h 24m | Avg:  1h 04m | Max:  1h 21m | Hits:  30%/3126  
      🔍 20                 Pass:  91%/24  | Total: 20h 56m | Avg: 52m 20s | Max:  1h 29m | Hits:  30%/1042  
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 05m | Avg:  1h 01m | Max:  1h 07m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 10s | Max: 59m 47s
      🟩 Clang16            Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 03m
      🟩 Clang17            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 28s | Max:  1h 00m
      🟨 Clang18            Pass:  85%/7   | Total:  6h 04m | Avg: 52m 05s | Max:  1h 05m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 58m | Avg: 59m 24s | Max:  1h 00m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC9               Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 05m
      🟩 GCC10              Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m
      🟩 GCC12              Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 09m
      🟨 GCC13              Pass:  90%/10  | Total:  6h 50m | Avg: 41m 03s | Max:  1h 16m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 18m | Hits:  30%/2084  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 50m | Avg:  1h 25m | Max:  1h 29m | Hits:  30%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 16h 11m | Avg: 57m 08s | Max:  1h 07m
      🟨 GCC                Pass:  95%/21  | Total: 18h 23m | Avg: 52m 31s | Max:  1h 16m
      🟩 MSVC               Pass: 100%/4   | Total:  5h 17m | Avg:  1h 19m | Max:  1h 29m | Hits:  30%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 55m 58s | Avg: 27m 59s | Max: 30m 47s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 16m | Avg:  1h 16m | Max:  1h 16m
    
  • 🟩 thrust: Pass: 100%/43 | Total: 1d 03h | Avg: 38m 41s | Max: 1h 19m | Hits: 126%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 42m 30s | Avg: 21m 15s | Max: 31m 21s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  1d 02h | Avg: 38m 53s | Max:  1h 19m | Hits: 126%/9230  
      🟩 arm64              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 46s | Max: 36m 09s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 32m | Avg: 42m 28s | Max:  1h 05m | Hits:  90%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
      🟩 12.8               Pass: 100%/36  | Total: 21h 41m | Avg: 36m 08s | Max:  1h 19m | Hits: 135%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 03m | Avg: 31m 51s | Max: 32m 02s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 32m | Avg: 42m 28s | Max:  1h 05m | Hits:  90%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
      🟩 nvcc12.8           Pass: 100%/34  | Total: 20h 37m | Avg: 36m 23s | Max:  1h 19m | Hits: 135%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 51s | Max: 32m 02s
      🟩 nvcc               Pass: 100%/41  | Total:  1d 02h | Avg: 39m 01s | Max:  1h 19m | Hits: 126%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 21m | Avg: 35m 23s | Max: 35m 54s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 15m | Avg: 37m 41s | Max: 37m 48s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 13m | Avg: 36m 55s | Max: 38m 40s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 14m | Avg: 37m 21s | Max: 37m 26s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 13m | Avg: 27m 35s | Max: 39m 19s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 56s | Max: 36m 03s
      🟩 GCC8               Pass: 100%/1   | Total: 36m 06s | Avg: 36m 06s | Max: 36m 06s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 19m | Avg: 39m 58s | Max: 40m 03s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 14m | Avg: 37m 22s | Max: 37m 37s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 17m | Avg: 38m 48s | Max: 40m 01s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 25m | Avg: 42m 31s | Max: 47m 25s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 37m | Avg: 27m 14s | Max: 41m 37s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m | Hits:  74%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 01m | Avg:  1h 00m | Max:  1h 19m | Hits: 161%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  9h 18m | Avg: 32m 51s | Max: 39m 19s
      🟩 GCC                Pass: 100%/19  | Total: 10h 43m | Avg: 33m 51s | Max: 47m 25s
      🟩 MSVC               Pass: 100%/5   | Total:  5h 11m | Avg:  1h 02m | Max:  1h 19m | Hits: 126%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 23h 13m | Avg: 42m 13s | Max:  1h 18m | Hits:  69%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  4h 30m | Avg: 27m 03s | Max:  1h 19m | Hits: 212%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 02h | Avg: 42m 45s | Max:  1h 19m | Hits:  66%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 06s | Avg: 16m 22s | Max: 33m 04s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 39s | Avg: 10m 53s | Max: 11m 23s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 39m 35s | Avg: 39m 35s | Max: 39m 35s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 14h 28m | Avg: 43m 24s | Max:  1h 12m | Hits:  69%/5538  
      🟩 20                 Pass: 100%/21  | Total: 12h 33m | Avg: 35m 52s | Max:  1h 19m | Hits: 212%/3692  
    
  • 🟩 libcudacxx: Pass: 100%/41 | Total: 16h 48m | Avg: 24m 36s | Max: 50m 30s | Hits: 392%/10273

    🟩 cpu
      🟩 amd64              Pass: 100%/39  | Total: 16h 02m | Avg: 24m 40s | Max: 50m 30s | Hits: 392%/10273 
      🟩 arm64              Pass: 100%/2   | Total: 46m 21s | Avg: 23m 10s | Max: 23m 13s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 09m | Avg: 13m 54s | Max: 31m 21s | Hits: 393%/2523  
      🟩 12.5               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
      🟩 12.8               Pass: 100%/34  | Total: 14h 26m | Avg: 25m 29s | Max: 50m 30s | Hits: 391%/7750  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 44m 59s | Avg: 22m 29s | Max: 23m 48s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 09m | Avg: 13m 54s | Max: 31m 21s | Hits: 393%/2523  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
      🟩 nvcc12.8           Pass: 100%/32  | Total: 13h 41m | Avg: 25m 41s | Max: 50m 30s | Hits: 391%/7750  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 59s | Avg: 22m 29s | Max: 23m 48s
      🟩 nvcc               Pass: 100%/39  | Total: 16h 04m | Avg: 24m 43s | Max: 50m 30s | Hits: 392%/10273 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 59m 24s | Avg: 14m 51s | Max: 24m 08s
      🟩 Clang15            Pass: 100%/2   | Total: 50m 42s | Avg: 25m 21s | Max: 26m 43s
      🟩 Clang16            Pass: 100%/2   | Total: 49m 52s | Avg: 24m 56s | Max: 27m 45s
      🟩 Clang17            Pass: 100%/2   | Total: 50m 07s | Avg: 25m 03s | Max: 25m 33s
      🟩 Clang18            Pass: 100%/6   | Total:  2h 44m | Avg: 27m 29s | Max: 46m 56s
      🟩 GCC7               Pass: 100%/2   | Total: 29m 07s | Avg: 14m 33s | Max: 22m 30s
      🟩 GCC8               Pass: 100%/1   | Total: 22m 34s | Avg: 22m 34s | Max: 22m 34s
      🟩 GCC9               Pass: 100%/2   | Total: 43m 33s | Avg: 21m 46s | Max: 23m 11s
      🟩 GCC10              Pass: 100%/2   | Total: 47m 32s | Avg: 23m 46s | Max: 24m 26s
      🟩 GCC11              Pass: 100%/2   | Total: 47m 09s | Avg: 23m 34s | Max: 24m 33s
      🟩 GCC12              Pass: 100%/2   | Total: 47m 33s | Avg: 23m 46s | Max: 25m 30s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 04m | Avg: 23m 07s | Max: 50m 30s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 05m | Avg: 32m 45s | Max: 34m 09s | Hits: 392%/5056  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 13m | Avg: 36m 45s | Max: 40m 51s | Hits: 391%/5217  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  6h 15m | Avg: 23m 26s | Max: 46m 56s
      🟩 GCC                Pass: 100%/19  | Total:  7h 02m | Avg: 22m 13s | Max: 50m 30s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 19m | Avg: 34m 45s | Max: 40m 51s | Hits: 392%/10273 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/41  | Total: 16h 48m | Avg: 24m 36s | Max: 50m 30s | Hits: 392%/10273 
    🟩 jobs
      🟩 Build              Pass: 100%/36  | Total: 14h 38m | Avg: 24m 24s | Max: 40m 51s | Hits: 392%/10273 
      🟩 NVRTC              Pass: 100%/2   | Total: 30m 43s | Avg: 15m 21s | Max: 15m 41s
      🟩 Test               Pass: 100%/2   | Total:  1h 37m | Avg: 48m 43s | Max: 50m 30s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 30m 43s | Avg: 15m 21s | Max: 15m 41s
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  7h 56m | Avg: 22m 42s | Max: 34m 26s | Hits: 392%/7589  
      🟩 20                 Pass: 100%/19  | Total:  8h 50m | Avg: 27m 53s | Max: 50m 30s | Hits: 391%/2684  
    
  • 🟩 cudax: Pass: 100%/20 | Total: 4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits: 78%/522

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 39m | Avg: 13m 42s | Max: 18m 53s | Hits:  78%/522   
      🟩 arm64              Pass: 100%/4   | Total: 54m 09s | Avg: 13m 32s | Max: 14m 29s
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 12.5               Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
      🟩 12.8               Pass: 100%/17  | Total:  4h 05m | Avg: 14m 24s | Max: 18m 53s | Hits:  75%/261   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
      🟩 nvcc12.8           Pass: 100%/17  | Total:  4h 05m | Avg: 14m 24s | Max: 18m 53s | Hits:  75%/261   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits:  78%/522   
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 14m 32s | Avg: 14m 32s | Max: 14m 32s
      🟩 Clang15            Pass: 100%/1   | Total: 17m 30s | Avg: 17m 30s | Max: 17m 30s
      🟩 Clang16            Pass: 100%/1   | Total: 16m 32s | Avg: 16m 32s | Max: 16m 32s
      🟩 Clang17            Pass: 100%/1   | Total: 17m 15s | Avg: 17m 15s | Max: 17m 15s
      🟩 Clang18            Pass: 100%/4   | Total: 54m 40s | Avg: 13m 40s | Max: 16m 00s
      🟩 GCC10              Pass: 100%/1   | Total: 15m 58s | Avg: 15m 58s | Max: 15m 58s
      🟩 GCC11              Pass: 100%/1   | Total: 15m 41s | Avg: 15m 41s | Max: 15m 41s
      🟩 GCC12              Pass: 100%/2   | Total: 31m 15s | Avg: 15m 37s | Max: 18m 53s
      🟩 GCC13              Pass: 100%/4   | Total: 51m 02s | Avg: 12m 45s | Max: 14m 29s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 35s | Avg: 10m 35s | Max: 10m 35s | Hits:  75%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  2h 00m | Avg: 15m 03s | Max: 17m 30s
      🟩 GCC                Pass: 100%/8   | Total:  1h 53m | Avg: 14m 14s | Max: 18m 53s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 49s | Avg: 10m 24s | Max: 10m 35s | Hits:  78%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits:  78%/522   
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  4h 08m | Avg: 13m 49s | Max: 18m 53s | Hits:  78%/522   
      🟩 Test               Pass: 100%/2   | Total: 24m 29s | Avg: 12m 14s | Max: 12m 22s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
      🟩 90a                Pass: 100%/1   | Total: 12m 07s | Avg: 12m 07s | Max: 12m 07s
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 46m 03s | Avg: 11m 30s | Max: 13m 07s
      🟩 20                 Pass: 100%/16  | Total:  3h 47m | Avg: 14m 12s | Max: 18m 53s | Hits:  78%/522   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 36s | Avg: 3m 48s | Max: 5m 18s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 18s | Avg:  2m 18s | Max:  2m 18s
      🟩 Test               Pass: 100%/1   | Total:  5m 18s | Avg:  5m 18s | Max:  5m 18s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
+/- Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 151)

# Runner
108 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1

@miscco
Collaborator

miscco commented Feb 15, 2025

/pre-commit.ci autofix

@bernhardmgruber bernhardmgruber force-pushed the depr_cub_traits branch 3 times, most recently from 6971035 to 58b8d30 Compare February 19, 2025 20:33
@bernhardmgruber
Contributor Author

/ok to test

Contributor

🟨 CI finished in 1h 30m: Pass: 47%/158 | Total: 1d 06h | Avg: 11m 24s | Max: 53m 08s | Hits: 80%/145970
  • 🟥 cub: Pass: 0%/45 | Total: 6h 57m | Avg: 9m 16s | Max: 53m 08s

    🟥 cpu
      🟥 amd64              Pass:   0%/43  | Total:  6h 43m | Avg:  9m 23s | Max: 53m 08s
      🟥 arm64              Pass:   0%/2   | Total: 13m 24s | Avg:  6m 42s | Max:  6m 45s
    🟥 ctk
      🟥 12.0               Pass:   0%/5   | Total:  1h 08m | Avg: 13m 47s | Max: 44m 07s
      🟥 12.5               Pass:   0%/2   | Total: 18m 29s | Avg:  9m 14s | Max:  9m 27s
      🟥 12.8               Pass:   0%/38  | Total:  5h 29m | Avg:  8m 40s | Max: 53m 08s
    🟥 cudacxx
      🟥 ClangCUDA18        Pass:   0%/2   | Total: 30m 58s | Avg: 15m 29s | Max: 16m 21s
      🟥 nvcc12.0           Pass:   0%/5   | Total:  1h 08m | Avg: 13m 47s | Max: 44m 07s
      🟥 nvcc12.5           Pass:   0%/2   | Total: 18m 29s | Avg:  9m 14s | Max:  9m 27s
      🟥 nvcc12.8           Pass:   0%/36  | Total:  4h 58m | Avg:  8m 18s | Max: 53m 08s
    🟥 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total: 30m 58s | Avg: 15m 29s | Max: 16m 21s
      🟥 nvcc               Pass:   0%/43  | Total:  6h 26m | Avg:  8m 58s | Max: 53m 08s
    🟥 cxx
      🟥 Clang14            Pass:   0%/4   | Total: 25m 00s | Avg:  6m 15s | Max:  6m 38s
      🟥 Clang15            Pass:   0%/2   | Total: 12m 40s | Avg:  6m 20s | Max:  6m 24s
      🟥 Clang16            Pass:   0%/2   | Total: 12m 56s | Avg:  6m 28s | Max:  6m 29s
      🟥 Clang17            Pass:   0%/2   | Total: 12m 18s | Avg:  6m 09s | Max:  6m 11s
      🟥 Clang18            Pass:   0%/7   | Total: 50m 07s | Avg:  7m 09s | Max: 16m 21s
      🟥 GCC7               Pass:   0%/2   | Total: 11m 58s | Avg:  5m 59s | Max:  6m 01s
      🟥 GCC8               Pass:   0%/1   | Total:  6m 32s | Avg:  6m 32s | Max:  6m 32s
      🟥 GCC9               Pass:   0%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 24s
      🟥 GCC10              Pass:   0%/2   | Total: 12m 01s | Avg:  6m 00s | Max:  6m 16s
      🟥 GCC11              Pass:   0%/2   | Total: 11m 56s | Avg:  5m 58s | Max:  6m 06s
      🟥 GCC12              Pass:   0%/2   | Total: 11m 49s | Avg:  5m 54s | Max:  5m 56s
      🟥 GCC13              Pass:   0%/11  | Total: 30m 59s | Avg:  2m 49s | Max:  7m 06s
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 30m | Avg: 45m 12s | Max: 46m 18s
      🟥 MSVC14.42          Pass:   0%/2   | Total:  1h 37m | Avg: 48m 48s | Max: 53m 08s
      🟥 NVHPC24.7          Pass:   0%/2   | Total: 18m 29s | Avg:  9m 14s | Max:  9m 27s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/17  | Total:  1h 53m | Avg:  6m 38s | Max: 16m 21s
      🟥 GCC                Pass:   0%/22  | Total:  1h 37m | Avg:  4m 26s | Max:  7m 06s
      🟥 MSVC               Pass:   0%/4   | Total:  3h 08m | Avg: 47m 00s | Max: 53m 08s
      🟥 NVHPC              Pass:   0%/2   | Total: 18m 29s | Avg:  9m 14s | Max:  9m 27s
    🟥 gpu
      🟥 h100               Pass:   0%/3   | Total:  3m 57s | Avg:  1m 19s | Max:  3m 57s
      🟥 rtx2080            Pass:   0%/34  | Total:  6h 40m | Avg: 11m 45s | Max: 53m 08s
      🟥 rtxa6000           Pass:   0%/8   | Total: 13m 15s | Avg:  1m 39s | Max:  6m 48s
    🟥 jobs
      🟥 Build              Pass:   0%/37  | Total:  6h 57m | Avg: 11m 16s | Max: 53m 08s
      🟥 DeviceLaunch       Pass:   0%/1  
      🟥 GraphCapture       Pass:   0%/1  
      🟥 HostLaunch         Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  3m 57s | Avg:  1m 19s | Max:  3m 57s
      🟥 90;90a;100         Pass:   0%/1   | Total:  7m 06s | Avg:  7m 06s | Max:  7m 06s
    🟥 std
      🟥 17                 Pass:   0%/20  | Total:  4h 11m | Avg: 12m 35s | Max: 46m 18s
      🟥 20                 Pass:   0%/25  | Total:  2h 45m | Avg:  6m 37s | Max: 53m 08s
    
  • 🟨 thrust: Pass: 46%/45 | Total: 13h 20m | Avg: 17m 47s | Max: 52m 58s | Hits: 76%/37389

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 57s | Avg: 22m 28s | Max: 22m 59s | Hits:  76%/3562  
      🔍 nvcc               Pass:  44%/43  | Total: 12h 35m | Avg: 17m 34s | Max: 52m 58s | Hits:  76%/33827 
    🟨 ctk
      🟨 12.0               Pass:  40%/5   | Total:  1h 43m | Avg: 20m 46s | Max: 39m 47s | Hits:  76%/3562  
      🟩 12.5               Pass: 100%/2   | Total:  1h 31m | Avg: 45m 41s | Max: 45m 44s | Hits:  63%/3562  
      🟨 12.8               Pass:  44%/38  | Total: 10h 05m | Avg: 15m 55s | Max: 52m 58s | Hits:  77%/30265 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 44m 57s | Avg: 22m 28s | Max: 22m 59s | Hits:  76%/3562  
      🟨 nvcc12.0           Pass:  40%/5   | Total:  1h 43m | Avg: 20m 46s | Max: 39m 47s | Hits:  76%/3562  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 31m | Avg: 45m 41s | Max: 45m 44s | Hits:  63%/3562  
      🟨 nvcc12.8           Pass:  41%/36  | Total:  9h 20m | Avg: 15m 33s | Max: 52m 58s | Hits:  78%/26703 
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 48m | Avg: 27m 04s | Max: 28m 11s | Hits:  76%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 56m 34s | Avg: 28m 17s | Max: 30m 23s | Hits:  76%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 58m 17s | Avg: 29m 08s | Max: 31m 17s | Hits:  76%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 55m 56s | Avg: 27m 58s | Max: 28m 01s | Hits:  76%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 22m | Avg: 20m 25s | Max: 29m 51s | Hits:  83%/12467 
      🟥 GCC7               Pass:   0%/2   | Total:  9m 15s | Avg:  4m 37s | Max:  4m 46s
      🟥 GCC8               Pass:   0%/1   | Total:  4m 23s | Avg:  4m 23s | Max:  4m 23s
      🟥 GCC9               Pass:   0%/2   | Total:  8m 56s | Avg:  4m 28s | Max:  4m 33s
      🟥 GCC10              Pass:   0%/2   | Total:  8m 56s | Avg:  4m 28s | Max:  4m 42s
      🟥 GCC11              Pass:   0%/2   | Total:  8m 53s | Avg:  4m 26s | Max:  4m 31s
      🟥 GCC12              Pass:   0%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 50s
      🟥 GCC13              Pass:   0%/10  | Total: 26m 17s | Avg:  2m 37s | Max:  5m 03s
      🟥 MSVC14.29          Pass:   0%/2   | Total:  1h 25m | Avg: 42m 32s | Max: 45m 17s
      🟨 MSVC14.42          Pass:  66%/3   | Total:  2h 05m | Avg: 41m 58s | Max: 52m 58s | Hits:  62%/3550  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 31m | Avg: 45m 41s | Max: 45m 44s | Hits:  63%/3562  
    🟨 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 02m | Avg: 24m 49s | Max: 31m 17s | Hits:  79%/30277 
      🟥 GCC                Pass:   0%/21  | Total:  1h 16m | Avg:  3m 37s | Max:  5m 03s
      🟨 MSVC               Pass:  40%/5   | Total:  3h 30m | Avg: 42m 11s | Max: 52m 58s | Hits:  62%/3550  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 31m | Avg: 45m 41s | Max: 45m 44s | Hits:  63%/3562  
    🟥 cmake_options
      🟥 -DTHRUST_DISPATCH_TYPE=Force32bit Pass:   0%/2   | Total:  3m 40s | Avg:  1m 50s | Max:  3m 40s
    🟨 cpu
      🟨 amd64              Pass:  46%/43  | Total: 12h 51m | Avg: 17m 55s | Max: 52m 58s | Hits:  76%/35608 
      🟨 arm64              Pass:  50%/2   | Total: 29m 29s | Avg: 14m 44s | Max: 24m 26s | Hits:  76%/1781  
    🟨 gpu
      🟥 h100               Pass:   0%/2   | Total:  3m 23s | Avg:  1m 41s | Max:  3m 23s
      🟨 rtx2080            Pass:  48%/33  | Total: 10h 57m | Avg: 19m 54s | Max: 45m 44s | Hits:  75%/28496 
      🟨 rtx4090            Pass:  50%/10  | Total:  2h 20m | Avg: 14m 00s | Max: 52m 58s | Hits:  80%/8893  
    🟨 jobs
      🟨 Build              Pass:  47%/38  | Total: 12h 31m | Avg: 19m 46s | Max: 52m 58s | Hits:  74%/32052 
      🟨 TestCPU            Pass:  66%/3   | Total: 38m 32s | Avg: 12m 50s | Max: 30m 38s | Hits:  85%/3556  
      🟨 TestGPU            Pass:  25%/4   | Total: 10m 15s | Avg:  2m 33s | Max: 10m 15s | Hits: 100%/1781  
    🟥 sm
      🟥 90                 Pass:   0%/2   | Total:  3m 23s | Avg:  1m 41s | Max:  3m 23s
      🟥 90;90a;100         Pass:   0%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
    🟨 std
      🟨 17                 Pass:  40%/20  | Total:  6h 38m | Avg: 19m 54s | Max: 45m 44s | Hits:  75%/14248 
      🟨 20                 Pass:  56%/23  | Total:  6h 38m | Avg: 17m 19s | Max: 52m 58s | Hits:  77%/23141 
    
  • 🟨 cudax: Pass: 54%/22 | Total: 1h 43m | Avg: 4m 42s | Max: 11m 48s | Hits: 92%/5692

    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/1   | Total:  9m 40s | Avg:  9m 40s | Max:  9m 40s | Hits:  53%/262   
      🟩 12.5               Pass: 100%/2   | Total: 13m 43s | Avg:  6m 51s | Max:  6m 56s | Hits:  84%/710   
      🔍 12.8               Pass:  47%/19  | Total:  1h 20m | Avg:  4m 13s | Max: 11m 48s | Hits:  96%/4720  
    🔍 cudacxx: nvcc12.8 🔍
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 40s | Avg:  9m 40s | Max:  9m 40s | Hits:  53%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 13m 43s | Avg:  6m 51s | Max:  6m 56s | Hits:  84%/710   
      🔍 nvcc12.8           Pass:  47%/19  | Total:  1h 20m | Avg:  4m 13s | Max: 11m 48s | Hits:  96%/4720  
    🚨 cxx_family: GCC 🚨
      🟩 Clang              Pass: 100%/8   | Total: 39m 26s | Avg:  4m 55s | Max: 11m 48s | Hits:  98%/4458  
      🔥 GCC                Pass:   0%/10  | Total: 30m 20s | Avg:  3m 02s | Max:  4m 07s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 22s | Hits:  53%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 13m 43s | Avg:  6m 51s | Max:  6m 56s | Hits:  84%/710   
    🟨 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 09s | Avg:  4m 09s | Max:  4m 09s | Hits:  98%/559   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s | Hits:  98%/557   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s | Hits:  98%/557   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s | Hits:  98%/557   
      🟩 Clang18            Pass: 100%/4   | Total: 22m 55s | Avg:  5m 43s | Max: 11m 48s | Hits:  98%/2228  
      🟥 GCC10              Pass:   0%/1   | Total:  4m 07s | Avg:  4m 07s | Max:  4m 07s
      🟥 GCC11              Pass:   0%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟥 GCC12              Pass:   0%/2   | Total:  4m 04s | Avg:  2m 02s | Max:  4m 04s
      🟥 GCC13              Pass:   0%/6   | Total: 18m 12s | Avg:  3m 02s | Max:  3m 55s
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 40s | Avg:  9m 40s | Max:  9m 40s | Hits:  53%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 22s | Avg: 10m 22s | Max: 10m 22s | Hits:  53%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 13m 43s | Avg:  6m 51s | Max:  6m 56s | Hits:  84%/710   
    🟨 cudacxx_family
      🟨 nvcc               Pass:  54%/22  | Total:  1h 43m | Avg:  4m 42s | Max: 11m 48s | Hits:  92%/5692  
    🟨 cpu
      🟨 amd64              Pass:  55%/18  | Total:  1h 28m | Avg:  4m 55s | Max: 11m 48s | Hits:  91%/4578  
      🟨 arm64              Pass:  50%/4   | Total: 14m 57s | Avg:  3m 44s | Max:  3m 55s | Hits:  98%/1114  
    🟨 gpu
      🟥 h100               Pass:   0%/2   | Total:  3m 27s | Avg:  1m 43s | Max:  3m 27s
      🟨 rtx2080            Pass:  60%/20  | Total:  1h 40m | Avg:  5m 00s | Max: 11m 48s | Hits:  92%/5692  
    🟨 jobs
      🟨 Build              Pass:  57%/19  | Total:  1h 31m | Avg:  4m 49s | Max: 10m 22s | Hits:  92%/5135  
      🟨 Test               Pass:  33%/3   | Total: 11m 48s | Avg:  3m 56s | Max: 11m 48s | Hits: 100%/557   
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  6m 54s | Avg:  2m 18s | Max:  3m 27s
      🟥 90a                Pass:   0%/1   | Total:  3m 34s | Avg:  3m 34s | Max:  3m 34s
    🟨 std
      🟨 17                 Pass:  50%/4   | Total: 17m 46s | Avg:  4m 26s | Max:  6m 56s | Hits:  93%/912   
      🟨 20                 Pass:  55%/18  | Total:  1h 25m | Avg:  4m 45s | Max: 11m 48s | Hits:  92%/4780  
    
  • 🟥 cccl_c_parallel: Pass: 0%/2 | Total: 2m 23s | Avg: 1m 11s | Max: 2m 23s

    🟥 cpu
      🟥 amd64              Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 ctk
      🟥 12.8               Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 cudacxx
      🟥 nvcc12.8           Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 cxx
      🟥 GCC13              Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 gpu
      🟥 rtx2080            Pass:   0%/2   | Total:  2m 23s | Avg:  1m 11s | Max:  2m 23s
    🟥 jobs
      🟥 Build              Pass:   0%/1   | Total:  2m 23s | Avg:  2m 23s | Max:  2m 23s
      🟥 Test               Pass:   0%/1  
    
  • 🟨 libcudacxx: Pass: 97%/43 | Total: 7h 55m | Avg: 11m 03s | Max: 31m 14s | Hits: 81%/102889

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total:  7h 42m | Avg: 11m 16s | Max: 31m 14s | Hits:  81%/97244 
      🟩 arm64              Pass: 100%/2   | Total: 12m 57s | Avg:  6m 28s | Max:  9m 05s | Hits:  92%/5645  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total: 37m 19s | Avg:  7m 27s | Max: 22m 06s | Hits:  99%/13652 
      🟩 12.5               Pass: 100%/2   | Total: 34m 20s | Avg: 17m 10s | Max: 26m 10s | Hits:  72%/5590  
      🔍 12.8               Pass:  97%/36  | Total:  6h 43m | Avg: 11m 13s | Max: 31m 14s | Hits:  79%/83647 
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 40m 34s | Avg: 20m 17s | Max: 21m 51s | Hits:  26%/5610  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 37m 19s | Avg:  7m 27s | Max: 22m 06s | Hits:  99%/13652 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 34m 20s | Avg: 17m 10s | Max: 26m 10s | Hits:  72%/5590  
      🔍 nvcc12.8           Pass:  97%/34  | Total:  6h 03m | Avg: 10m 41s | Max: 31m 14s | Hits:  83%/78037 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 40m 34s | Avg: 20m 17s | Max: 21m 51s | Hits:  26%/5610  
      🔍 nvcc               Pass:  97%/41  | Total:  7h 14m | Avg: 10m 36s | Max: 31m 14s | Hits:  84%/97279 
    🔍 cxx: GCC13 🔍
      🟩 Clang14            Pass: 100%/4   | Total: 33m 56s | Avg:  8m 29s | Max: 21m 25s | Hits:  82%/11184 
      🟩 Clang15            Pass: 100%/2   | Total: 26m 36s | Avg: 13m 18s | Max: 22m 01s | Hits:  67%/5602  
      🟩 Clang16            Pass: 100%/2   | Total:  9m 10s | Avg:  4m 35s | Max:  4m 54s | Hits:  99%/5602  
      🟩 Clang17            Pass: 100%/2   | Total: 28m 11s | Avg: 14m 05s | Max: 23m 28s | Hits:  65%/5602  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 02m | Avg: 10m 20s | Max: 21m 51s | Hits:  70%/14034 
      🟩 GCC7               Pass: 100%/2   | Total:  7m 13s | Avg:  3m 36s | Max:  3m 42s | Hits:  99%/5540  
      🟩 GCC8               Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s | Hits:  99%/2780  
      🟩 GCC9               Pass: 100%/2   | Total: 25m 02s | Avg: 12m 31s | Max: 21m 16s | Hits:  66%/5552  
      🟩 GCC10              Pass: 100%/2   | Total: 17m 38s | Avg:  8m 49s | Max: 13m 25s | Hits:  81%/5608  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 05s | Avg:  4m 02s | Max:  4m 11s | Hits:  99%/5604  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 15s | Avg:  4m 07s | Max:  4m 09s | Hits:  99%/5604  
      🔍 GCC13              Pass:  90%/10  | Total:  1h 52m | Avg: 11m 14s | Max: 31m 14s | Hits:  82%/14271 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 49m 32s | Avg: 24m 46s | Max: 27m 26s | Hits:  65%/5078  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 49m 18s | Avg: 24m 39s | Max: 26m 31s | Hits:  98%/5238  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 34m 20s | Avg: 17m 10s | Max: 26m 10s | Hits:  72%/5590  
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/16  | Total:  2h 39m | Avg:  9m 59s | Max: 23m 28s | Hits:  76%/42024 
      🔍 GCC                Pass:  95%/21  | Total:  3h 02m | Avg:  8m 41s | Max: 31m 14s | Hits:  87%/44959 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 38m | Avg: 24m 42s | Max: 27m 26s | Hits:  82%/10316 
      🟩 NVHPC              Pass: 100%/2   | Total: 34m 20s | Avg: 17m 10s | Max: 26m 10s | Hits:  72%/5590  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 17m 37s | Avg:  8m 48s | Max: 13m 25s | Hits:  99%/2912  
      🔍 rtx2080            Pass:  97%/41  | Total:  7h 37m | Avg: 11m 10s | Max: 31m 14s | Hits:  81%/99977 
    🔍 jobs: NVRTC 🔍
      🟩 Build              Pass: 100%/37  | Total:  6h 47m | Avg: 11m 01s | Max: 31m 14s | Hits:  81%/102869
      🔍 NVRTC              Pass:  50%/2   | Total: 34m 33s | Avg: 17m 16s | Max: 19m 02s | Hits:  90%/20    
      🟩 Test               Pass: 100%/3   | Total: 31m 02s | Avg: 10m 20s | Max: 13m 25s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 13s | Avg:  2m 13s | Max:  2m 13s
    🔍 sm: 75 🔍
      🔍 75                 Pass:  50%/2   | Total: 34m 33s | Avg: 17m 16s | Max: 19m 02s | Hits:  90%/20    
      🟩 90                 Pass: 100%/2   | Total: 17m 37s | Avg:  8m 48s | Max: 13m 25s | Hits:  99%/2912  
      🟩 90;90a;100         Pass: 100%/1   | Total: 31m 14s | Avg: 31m 14s | Max: 31m 14s | Hits:  31%/2912  
    🔍 std: 17 🔍
      🔍 17                 Pass:  95%/21  | Total:  3h 48m | Avg: 10m 51s | Max: 27m 26s | Hits:  82%/54848 
      🟩 20                 Pass: 100%/21  | Total:  4h 05m | Avg: 11m 40s | Max: 31m 14s | Hits:  80%/48041 
    
  • 🟥 python: Pass: 0%/1 | Total: 3m 18s | Avg: 3m 18s | Max: 3m 18s

    🟥 cpu
      🟥 amd64              Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 ctk
      🟥 12.8               Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 cudacxx
      🟥 nvcc12.8           Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 cxx
      🟥 GCC13              Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 gpu
      🟥 rtx2080            Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    🟥 jobs
      🟥 Test               Pass:   0%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
+/- Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@bernhardmgruber
Contributor Author

#3863 got a good part of this PR merged.

* Consistently use ::cuda::std::numeric_limits in CUB

Fixes: NVIDIA#3381
@bernhardmgruber bernhardmgruber changed the title Replace cub::Traits by numeric_limits and deprecate it [on hold] Replace cub::Traits by numeric_limits and deprecate it Feb 24, 2025
@bernhardmgruber
Contributor Author

bernhardmgruber commented Feb 24, 2025

I put this PR on hold because I think the remaining changes would only replace cub::Traits with a more targeted radix sort twiddle and unsigned bits facility. I don't think these remaining changes carry their weight (moving to new facilities requires all downstream users to change), because we can continue to live with cub::Traits, knowing it is a radix sort facility.
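
For readers unfamiliar with the term, "twiddling" here means mapping a key to an unsigned bit pattern whose unsigned order matches the key's numeric order, so a radix sort can process it digit by digit. The following is a minimal sketch in plain C++, assuming 32-bit IEEE-754 `float`; the names `twiddle_in` and `twiddle_out` are hypothetical and this is not CUB's implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not CUB's API): map a float to an unsigned key whose
// unsigned ordering matches the float's numeric ordering.
// Assumes float is a 32-bit IEEE-754 type.
inline std::uint32_t twiddle_in(float value)
{
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits)); // well-defined bit copy
  const std::uint32_t sign_bit = 0x80000000u;
  // Negative values: flip every bit, so larger magnitudes get smaller keys.
  // Non-negative values: set the sign bit, so they sort after all negatives.
  const std::uint32_t mask = (bits & sign_bit) ? 0xFFFFFFFFu : sign_bit;
  return bits ^ mask;
}

// Inverse of twiddle_in(float): recover the original float from its key.
inline float twiddle_out(std::uint32_t key)
{
  const std::uint32_t sign_bit = 0x80000000u;
  // A key with the top bit set came from a non-negative float.
  const std::uint32_t mask = (key & sign_bit) ? sign_bit : 0xFFFFFFFFu;
  const std::uint32_t bits = key ^ mask;
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Signed integers only need the sign bit flipped.
inline std::uint32_t twiddle_in(std::int32_t value)
{
  return static_cast<std::uint32_t>(value) ^ 0x80000000u;
}
```

With these helpers, comparing `twiddle_in(a) < twiddle_in(b)` as unsigned integers gives the same answer as `a < b` for ordinary (non-NaN) floats, which is the property a radix sort relies on.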

I will let this PR sit for a while in case I still want to integrate some changes, but I will eventually close it without merging.

Labels
cub For all items related to CUB
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Replace parts of cub::Traits by numeric_limits and deprecate those
2 participants