[PoC]: Implement cuda::experimental::uninitialized_async_buffer #1854

Merged 11 commits on Sep 17, 2024

miscco (Collaborator) commented on Jun 12, 2024

The uninitialized_async_buffer provides a stream-ordered allocation of N elements of type T, using a cuda::mr::async_resource to allocate the storage.

The buffer takes care of alignment and deallocation of the storage. The user must ensure that the memory resource outlives the buffer.
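
To make the ownership model concrete, here is a minimal host-only sketch of the idea. It is not the implementation from this PR and does not use any CUDA headers: `stream_handle`, `mock_async_resource` and `buffer_sketch` are hypothetical names invented for illustration, and ordinary aligned host allocation stands in for a stream-ordered device allocation. It only shows the shape of the contract described above: the buffer allocates suitably aligned storage for N elements, never constructs them, returns the storage on destruction, and holds a non-owning pointer to a resource that must outlive it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical stand-in for a CUDA stream handle (not a real CUDA type).
struct stream_handle {
  int id;
};

// Hypothetical host-only resource with the general shape of a stream-ordered
// resource: both allocation and deallocation take a stream argument.
// The stream is ignored here; a real resource would order the work on it.
struct mock_async_resource {
  int live_allocations = 0;

  void* allocate_async(std::size_t bytes, std::size_t alignment, stream_handle) {
    ++live_allocations;
    return ::operator new(bytes, static_cast<std::align_val_t>(alignment));
  }

  void deallocate_async(void* ptr, std::size_t, std::size_t alignment, stream_handle) noexcept {
    --live_allocations;
    ::operator delete(ptr, static_cast<std::align_val_t>(alignment));
  }
};

// Sketch of an uninitialized, move-only buffer of `count` elements of type T.
// Elements are never constructed or destroyed; only raw storage is managed.
template <class T>
class buffer_sketch {
  mock_async_resource* resource_; // non-owning: the resource must outlive the buffer
  stream_handle stream_;
  std::size_t count_;
  T* data_;

public:
  buffer_sketch(mock_async_resource& resource, stream_handle stream, std::size_t count)
      : resource_(&resource), stream_(stream), count_(count),
        data_(static_cast<T*>(resource.allocate_async(count * sizeof(T), alignof(T), stream))) {}

  buffer_sketch(const buffer_sketch&) = delete;
  buffer_sketch& operator=(const buffer_sketch&) = delete;

  // Moving transfers ownership of the storage and leaves the source empty.
  buffer_sketch(buffer_sketch&& other) noexcept
      : resource_(other.resource_), stream_(other.stream_),
        count_(std::exchange(other.count_, 0)),
        data_(std::exchange(other.data_, nullptr)) {}

  // Storage is handed back to the same resource on the same stream.
  ~buffer_sketch() {
    if (data_ != nullptr) {
      resource_->deallocate_async(data_, count_ * sizeof(T), alignof(T), stream_);
    }
  }

  T* data() const noexcept { return data_; }
  std::size_t size() const noexcept { return count_; }
};
```

In this sketch, declaring the resource before the buffer in the same scope is enough to satisfy the lifetime requirement, because the buffer is then destroyed first. Destroying the resource while a buffer still refers to it would leave the destructor calling into a dead object, which is why the requirement is placed on the user.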

Note: this is based on #1637.

@miscco miscco requested review from a team as code owners June 12, 2024 13:57

🟨 CI finished in 12h 37m: Pass: 99%/361 | Total: 3d 04h | Avg: 12m 40s | Max: 59m 34s | Hits: 62%/520475
  • 🟨 cub: Pass: 99%/131 | Total: 20h 13m | Avg: 9m 15s | Max: 37m 43s | Hits: 99%/108193

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total: 19h 37m | Avg:  9m 34s | Max: 37m 43s | Hits:  99%/101385
      🟩 arm64              Pass: 100%/8   | Total: 36m 03s | Avg:  4m 30s | Max:  4m 50s | Hits:  99%/6808  
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 34m | Avg:  6m 18s | Max: 26m 21s | Hits:  97%/11554 
      🟩 11.8               Pass: 100%/3   | Total: 14m 56s | Avg:  4m 58s | Max:  5m 11s | Hits:  99%/2553  
      🔍 12.4               Pass:  99%/113 | Total: 18h 23m | Avg:  9m 46s | Max: 37m 43s | Hits:  99%/94086 
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda17       Pass: 100%/2   | Total:  8m 51s | Avg:  4m 25s | Max:  4m 59s | Hits: 100%/1408  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 34m | Avg:  6m 18s | Max: 26m 21s | Hits:  97%/11554 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 14m 56s | Avg:  4m 58s | Max:  5m 11s | Hits:  99%/2553  
      🔍 nvcc12.4           Pass:  99%/111 | Total: 18h 14m | Avg:  9m 51s | Max: 37m 43s | Hits:  99%/92678 
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total:  8m 51s | Avg:  4m 25s | Max:  4m 59s | Hits: 100%/1408  
      🔍 nvcc               Pass:  99%/129 | Total: 20h 04m | Avg:  9m 20s | Max: 37m 43s | Hits:  99%/106785
    🔍 cxx_full: gcc13 🔍
      🟩 clang9             Pass: 100%/6   | Total: 29m 58s | Avg:  4m 59s | Max:  6m 02s | Hits: 100%/4884  
      🟩 clang10            Pass: 100%/3   | Total: 16m 37s | Avg:  5m 32s | Max:  5m 53s | Hits: 100%/2559  
      🟩 clang11            Pass: 100%/4   | Total: 19m 52s | Avg:  4m 58s | Max:  5m 19s | Hits: 100%/3412  
      🟩 clang12            Pass: 100%/4   | Total: 20m 18s | Avg:  5m 04s | Max:  5m 29s | Hits: 100%/3412  
      🟩 clang13            Pass: 100%/4   | Total: 19m 30s | Avg:  4m 52s | Max:  5m 06s | Hits: 100%/3412  
      🟩 clang14            Pass: 100%/4   | Total: 19m 35s | Avg:  4m 53s | Max:  5m 13s | Hits: 100%/3412  
      🟩 clang15            Pass: 100%/4   | Total: 21m 45s | Avg:  5m 26s | Max:  6m 04s | Hits: 100%/3404  
      🟩 clang16            Pass: 100%/4   | Total: 20m 04s | Avg:  5m 01s | Max:  5m 21s | Hits: 100%/3404  
      🟩 clang17            Pass: 100%/26  | Total:  6h 45m | Avg: 15m 36s | Max: 37m 43s | Hits:  99%/21832 
      🟩 gcc6               Pass: 100%/2   | Total:  7m 52s | Avg:  3m 56s | Max:  4m 08s | Hits:  99%/1550  
      🟩 gcc7               Pass: 100%/6   | Total: 26m 15s | Avg:  4m 22s | Max:  4m 59s | Hits:  99%/4887  
      🟩 gcc8               Pass: 100%/6   | Total: 49m 07s | Avg:  8m 11s | Max: 26m 21s | Hits:  93%/4887  
      🟩 gcc9               Pass: 100%/6   | Total: 26m 53s | Avg:  4m 28s | Max:  4m 37s | Hits:  99%/4887  
      🟩 gcc10              Pass: 100%/4   | Total: 19m 49s | Avg:  4m 57s | Max:  5m 37s | Hits:  99%/3412  
      🟩 gcc11              Pass: 100%/7   | Total: 33m 41s | Avg:  4m 48s | Max:  5m 11s | Hits:  99%/5957  
      🟩 gcc12              Pass: 100%/4   | Total: 21m 14s | Avg:  5m 18s | Max:  5m 57s | Hits:  99%/3404  
      🔍 gcc13              Pass:  96%/28  | Total:  6h 06m | Avg: 13m 06s | Max: 34m 22s | Hits:  99%/22977 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 17m 16s | Avg:  5m 45s | Max:  6m 15s | Hits: 100%/2331  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 23s | Avg: 13m 23s | Max: 13m 23s | Hits:  98%/695   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 22m 55s | Avg: 11m 27s | Max: 11m 52s | Hits:  98%/1390  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 34m 47s | Avg: 11m 35s | Max: 11m 44s | Hits:  98%/2085  
    🔍 cxx_name: gcc 🔍
      🟩 clang              Pass: 100%/59  | Total:  9h 33m | Avg:  9m 43s | Max: 37m 43s | Hits:  99%/49731 
      🔍 gcc                Pass:  98%/63  | Total:  9h 11m | Avg:  8m 45s | Max: 34m 22s | Hits:  98%/51961 
      🟩 Intel              Pass: 100%/3   | Total: 17m 16s | Avg:  5m 45s | Max:  6m 15s | Hits: 100%/2331  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 11m | Avg: 11m 50s | Max: 13m 23s | Hits:  98%/4170  
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/99  | Total:  8h 57m | Avg:  5m 25s | Max: 26m 21s | Hits:  99%/81812 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 48m | Avg: 21m 02s | Max: 26m 28s | Hits:  99%/6808  
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 10m | Avg: 16m 21s | Max: 21m 16s | Hits:  99%/6808  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 30m | Avg: 18m 49s | Max: 24m 23s | Hits:  99%/6808  
      🔍 TestGPU            Pass:  87%/8   | Total:  3h 45m | Avg: 28m 13s | Max: 37m 43s | Hits:  99%/5957  
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total:  1h 21m | Avg:  5m 48s | Max: 26m 21s | Hits:  97%/10859 
      🟩 ubuntu20.04        Pass: 100%/35  | Total:  2h 54m | Avg:  4m 59s | Max:  6m 02s | Hits:  99%/29855 
      🔍 ubuntu22.04        Pass:  98%/76  | Total: 14h 46m | Avg: 11m 39s | Max: 37m 43s | Hits:  99%/63309 
      🟩 windows2022        Pass: 100%/6   | Total:  1h 11m | Avg: 11m 50s | Max: 13m 23s | Hits:  98%/4170  
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/34  | Total:  4h 58m | Avg:  8m 47s | Max: 34m 38s | Hits:  99%/28503 
      🟩 14                 Pass: 100%/37  | Total:  5h 40m | Avg:  9m 11s | Max: 37m 43s | Hits:  99%/30588 
      🟩 17                 Pass: 100%/36  | Total:  5h 33m | Avg:  9m 16s | Max: 26m 21s | Hits:  98%/29822 
      🔍 20                 Pass:  95%/24  | Total:  4h 00m | Avg: 10m 01s | Max: 31m 40s | Hits:  99%/19280 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total: 20h 13m | Avg:  9m 15s | Max: 37m 43s | Hits:  99%/108193
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 14m 56s | Avg:  4m 58s | Max:  5m 11s | Hits:  99%/2553  
      🟩 90a                Pass: 100%/4   | Total: 15m 22s | Avg:  3m 50s | Max:  4m 04s | Hits:  99%/3404  
    
  • 🟨 libcudacxx: Pass: 99%/112 | Total: 1d 13h | Avg: 20m 21s | Max: 59m 34s | Hits: 47%/273016

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/104 | Total:  1d 11h | Avg: 20m 40s | Max: 59m 34s | Hits:  47%/250688
      🟩 arm64              Pass: 100%/8   | Total:  2h 09m | Avg: 16m 10s | Max: 19m 51s | Hits:  47%/22328 
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total:  5h 53m | Avg: 23m 34s | Max: 42m 30s | Hits:  45%/39735 
      🟩 11.8               Pass: 100%/3   | Total:  1h 00m | Avg: 20m 17s | Max: 22m 10s | Hits:  43%/8055  
      🔍 12.4               Pass:  98%/94  | Total:  1d 07h | Avg: 19m 50s | Max: 59m 34s | Hits:  47%/225226
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda17       Pass: 100%/2   | Total: 34m 45s | Avg: 17m 22s | Max: 17m 57s | Hits:  34%/6099  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  5h 53m | Avg: 23m 34s | Max: 42m 30s | Hits:  45%/39735 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 00m | Avg: 20m 17s | Max: 22m 10s | Hits:  43%/8055  
      🔍 nvcc12.4           Pass:  98%/92  | Total:  1d 06h | Avg: 19m 53s | Max: 59m 34s | Hits:  48%/219127
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total: 34m 45s | Avg: 17m 22s | Max: 17m 57s | Hits:  34%/6099  
      🔍 nvcc               Pass:  99%/110 | Total:  1d 13h | Avg: 20m 24s | Max: 59m 34s | Hits:  47%/266917
    🔍 cxx_full: gcc13 🔍
      🟩 clang9             Pass: 100%/6   | Total:  1h 45m | Avg: 17m 32s | Max: 29m 08s | Hits:  52%/16142 
      🟩 clang10            Pass: 100%/3   | Total:  1h 05m | Avg: 21m 53s | Max: 24m 02s | Hits:  43%/8100  
      🟩 clang11            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 39s | Max: 20m 24s | Hits:  41%/11172 
      🟩 clang12            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 41s | Max: 20m 49s | Hits:  41%/11172 
      🟩 clang13            Pass: 100%/4   | Total:  1h 11m | Avg: 17m 57s | Max: 20m 41s | Hits:  49%/11172 
      🟩 clang14            Pass: 100%/4   | Total:  1h 03m | Avg: 15m 51s | Max: 20m 58s | Hits:  56%/11172 
      🟩 clang15            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 30s | Max: 20m 49s | Hits:  54%/11164 
      🟩 clang16            Pass: 100%/4   | Total:  1h 13m | Avg: 18m 28s | Max: 21m 37s | Hits:  46%/11164 
      🟩 clang17            Pass: 100%/14  | Total:  5h 02m | Avg: 21m 35s | Max: 45m 46s | Hits:  47%/28427 
      🟩 gcc6               Pass: 100%/2   | Total: 56m 25s | Avg: 28m 12s | Max: 41m 58s | Hits:  46%/5036  
      🟩 gcc7               Pass: 100%/6   | Total:  2h 08m | Avg: 21m 24s | Max: 42m 30s | Hits:  43%/16128 
      🟩 gcc8               Pass: 100%/6   | Total:  1h 57m | Avg: 19m 30s | Max: 39m 23s | Hits:  49%/16136 
      🟩 gcc9               Pass: 100%/6   | Total:  2h 08m | Avg: 21m 26s | Max: 39m 15s | Hits:  46%/16140 
      🟩 gcc10              Pass: 100%/4   | Total:  1h 19m | Avg: 19m 49s | Max: 23m 11s | Hits:  42%/11172 
      🟩 gcc11              Pass: 100%/7   | Total:  2h 19m | Avg: 19m 55s | Max: 22m 10s | Hits:  43%/19219 
      🟩 gcc12              Pass: 100%/4   | Total:  1h 19m | Avg: 19m 51s | Max: 22m 43s | Hits:  42%/11164 
      🔍 gcc13              Pass:  95%/21  | Total:  6h 57m | Avg: 19m 53s | Max: 59m 34s | Hits:  53%/33875 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 55m 57s | Avg: 18m 39s | Max: 23m 38s | Hits:  49%/8090  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 27m 25s | Avg: 27m 25s | Max: 27m 25s | Hits:  44%/2536  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 55m 18s | Avg: 27m 39s | Max: 29m 41s | Hits:  40%/5434  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 32m | Avg: 30m 51s | Max: 35m 58s | Hits:  39%/8401  
    🔍 cxx_name: gcc 🔍
      🟩 clang              Pass: 100%/47  | Total: 15h 01m | Avg: 19m 11s | Max: 45m 46s | Hits:  48%/119685
      🔍 gcc                Pass:  98%/56  | Total: 19h 06m | Avg: 20m 28s | Max: 59m 34s | Hits:  47%/128870
      🟩 Intel              Pass: 100%/3   | Total: 55m 57s | Avg: 18m 39s | Max: 23m 38s | Hits:  49%/8090  
      🟩 MSVC               Pass: 100%/6   | Total:  2h 55m | Avg: 29m 13s | Max: 35m 58s | Hits:  40%/16371 
    🔍 jobs: Test 🔍
      🟩 Build              Pass: 100%/99  | Total:  1d 07h | Avg: 19m 10s | Max: 42m 30s | Hits:  47%/272996
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 10m | Avg: 17m 40s | Max: 19m 43s | Hits: 100%/20    
      🔍 Test               Pass:  87%/8   | Total:  5h 08m | Avg: 38m 31s | Max: 59m 34s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total:  5h 26m | Avg: 23m 18s | Max: 42m 30s | Hits:  45%/37199 
      🟩 ubuntu20.04        Pass: 100%/35  | Total: 10h 47m | Avg: 18m 29s | Max: 24m 02s | Hits:  47%/96343 
      🔍 ubuntu22.04        Pass:  98%/57  | Total: 18h 50m | Avg: 19m 50s | Max: 59m 34s | Hits:  48%/123103
      🟩 windows2022        Pass: 100%/6   | Total:  2h 55m | Avg: 29m 13s | Max: 35m 58s | Hits:  40%/16371 
    🔍 std: 17 🔍
      🟩 11                 Pass: 100%/29  | Total: 10h 18m | Avg: 21m 20s | Max: 42m 30s | Hits:  59%/57966 
      🟩 14                 Pass: 100%/32  | Total: 10h 04m | Avg: 18m 53s | Max: 45m 46s | Hits:  45%/81788 
      🔍 17                 Pass:  96%/31  | Total: 10h 37m | Avg: 20m 34s | Max: 59m 34s | Hits:  45%/84134 
      🟩 20                 Pass: 100%/19  | Total:  6h 56m | Avg: 21m 54s | Max: 42m 55s | Hits:  39%/49128 
    🟨 gpu
      🟨 v100               Pass:  99%/112 | Total:  1d 13h | Avg: 20m 21s | Max: 59m 34s | Hits:  47%/273016
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 00m | Avg: 20m 17s | Max: 22m 10s | Hits:  43%/8055  
      🟩 90a                Pass: 100%/4   | Total: 36m 19s | Avg:  9m 04s | Max: 10m 28s | Hits:  68%/11527 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 18h 05m | Avg: 9m 12s | Max: 23m 50s | Hits: 62%/139266

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 17h 05m | Avg:  9m 19s | Max: 23m 50s | Hits:  63%/129822
      🟩 arm64              Pass: 100%/8   | Total:  1h 00m | Avg:  7m 32s | Max:  7m 57s | Hits:  52%/9444  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  2h 06m | Avg:  8m 24s | Max: 23m 10s | Hits:  52%/17705 
      🟩 11.8               Pass: 100%/3   | Total: 24m 00s | Avg:  8m 00s | Max:  8m 08s | Hits:  54%/3543  
      🟩 12.4               Pass: 100%/100 | Total: 15h 35m | Avg:  9m 21s | Max: 23m 50s | Hits:  64%/118018
    🟩 cudacxx_full
      🟩 clang-cuda17       Pass: 100%/2   | Total: 17m 08s | Avg:  8m 34s | Max:  8m 47s | Hits:  52%/2360  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  2h 06m | Avg:  8m 24s | Max: 23m 10s | Hits:  52%/17705 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 24m 00s | Avg:  8m 00s | Max:  8m 08s | Hits:  54%/3543  
      🟩 nvcc12.4           Pass: 100%/98  | Total: 15h 18m | Avg:  9m 22s | Max: 23m 50s | Hits:  64%/115658
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total: 17m 08s | Avg:  8m 34s | Max:  8m 47s | Hits:  52%/2360  
      🟩 nvcc               Pass: 100%/116 | Total: 17h 48m | Avg:  9m 12s | Max: 23m 50s | Hits:  62%/136906
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total: 46m 14s | Avg:  7m 42s | Max:  8m 21s | Hits:  52%/7080  
      🟩 clang10            Pass: 100%/3   | Total: 26m 42s | Avg:  8m 54s | Max:  9m 13s | Hits:  52%/3540  
      🟩 clang11            Pass: 100%/4   | Total: 32m 13s | Avg:  8m 03s | Max:  8m 18s | Hits:  52%/4720  
      🟩 clang12            Pass: 100%/4   | Total: 32m 19s | Avg:  8m 04s | Max:  8m 37s | Hits:  52%/4720  
      🟩 clang13            Pass: 100%/4   | Total: 32m 07s | Avg:  8m 01s | Max:  8m 22s | Hits:  52%/4720  
      🟩 clang14            Pass: 100%/4   | Total: 34m 58s | Avg:  8m 44s | Max:  9m 06s | Hits:  52%/4720  
      🟩 clang15            Pass: 100%/4   | Total: 32m 40s | Avg:  8m 10s | Max:  8m 25s | Hits:  52%/4720  
      🟩 clang16            Pass: 100%/4   | Total: 34m 53s | Avg:  8m 43s | Max:  8m 52s | Hits:  52%/4720  
      🟩 clang17            Pass: 100%/18  | Total:  2h 35m | Avg:  8m 38s | Max: 14m 56s | Hits:  75%/21240 
      🟩 gcc6               Pass: 100%/2   | Total: 14m 22s | Avg:  7m 11s | Max:  7m 21s | Hits:  52%/2360  
      🟩 gcc7               Pass: 100%/6   | Total: 43m 43s | Avg:  7m 17s | Max:  7m 52s | Hits:  52%/7086  
      🟩 gcc8               Pass: 100%/6   | Total: 44m 57s | Avg:  7m 29s | Max:  7m 57s | Hits:  52%/7086  
      🟩 gcc9               Pass: 100%/6   | Total: 48m 38s | Avg:  8m 06s | Max:  9m 08s | Hits:  52%/7086  
      🟩 gcc10              Pass: 100%/4   | Total: 32m 27s | Avg:  8m 06s | Max:  8m 34s | Hits:  52%/4724  
      🟩 gcc11              Pass: 100%/7   | Total: 50m 10s | Avg:  7m 10s | Max:  9m 00s | Hits:  64%/8267  
      🟩 gcc12              Pass: 100%/4   | Total: 35m 58s | Avg:  8m 59s | Max: 10m 11s | Hits:  52%/4724  
      🟩 gcc13              Pass: 100%/20  | Total:  2h 38m | Avg:  7m 56s | Max: 18m 44s | Hits:  80%/23620 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 39m 54s | Avg: 13m 18s | Max: 14m 03s | Hits:  52%/3549  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 23m 10s | Avg: 23m 10s | Max: 23m 10s | Hits:  51%/1176  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 40m 44s | Avg: 20m 22s | Max: 20m 29s | Hits:  51%/2352  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  2h 05m | Avg: 20m 50s | Max: 23m 50s | Hits:  75%/7056  
    🟩 cxx_name
      🟩 clang              Pass: 100%/51  | Total:  7h 07m | Avg:  8m 23s | Max: 14m 56s | Hits:  60%/60180 
      🟩 gcc                Pass: 100%/55  | Total:  7h 09m | Avg:  7m 48s | Max: 18m 44s | Hits:  64%/64953 
      🟩 Intel              Pass: 100%/3   | Total: 39m 54s | Avg: 13m 18s | Max: 14m 03s | Hits:  52%/3549  
      🟩 MSVC               Pass: 100%/9   | Total:  3h 08m | Avg: 20m 59s | Max: 23m 50s | Hits:  67%/10584 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 18h 05m | Avg:  9m 12s | Max: 23m 50s | Hits:  62%/139266
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 14h 31m | Avg:  8m 47s | Max: 23m 50s | Hits:  55%/116850
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 44m | Avg:  9m 31s | Max: 19m 33s | Hits:  99%/12972 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 49m | Avg: 13m 42s | Max: 18m 44s | Hits:  99%/9444  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total:  1h 42m | Avg:  7m 21s | Max:  8m 04s | Hits:  52%/16529 
      🟩 ubuntu20.04        Pass: 100%/35  | Total:  4h 45m | Avg:  8m 09s | Max:  9m 13s | Hits:  52%/41313 
      🟩 ubuntu22.04        Pass: 100%/60  | Total:  8h 28m | Avg:  8m 28s | Max: 18m 44s | Hits:  70%/70840 
      🟩 windows2022        Pass: 100%/9   | Total:  3h 08m | Avg: 20m 59s | Max: 23m 50s | Hits:  67%/10584 
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 24m 00s | Avg:  8m 00s | Max:  8m 08s | Hits:  54%/3543  
      🟩 90a                Pass: 100%/4   | Total: 30m 33s | Avg:  7m 38s | Max:  7m 51s | Hits:  52%/4724  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 56m | Avg:  7m 52s | Max: 12m 33s | Hits:  61%/35418 
      🟩 14                 Pass: 100%/34  | Total:  5h 28m | Avg:  9m 39s | Max: 23m 10s | Hits:  61%/40122 
      🟩 17                 Pass: 100%/33  | Total:  5h 18m | Avg:  9m 39s | Max: 22m 23s | Hits:  62%/38946 
      🟩 20                 Pass: 100%/21  | Total:  3h 22m | Avg:  9m 38s | Max: 23m 50s | Hits:  66%/24780 
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
+/- CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental

🏃‍ Runner counts (total jobs: 361)

# Runner
264 linux-amd64-cpu16
52 linux-amd64-gpu-v100-latest-1
24 linux-arm64-cpu16
21 windows-amd64-cpu16

🟨 CI finished in 8h 46m: Pass: 99%/361 | Total: 2d 15h | Avg: 10m 34s | Max: 1h 03m | Hits: 77%/520475
  • 🟨 cub: Pass: 99%/131 | Total: 18h 49m | Avg: 8m 37s | Max: 34m 34s | Hits: 99%/108193

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/123 | Total: 18h 10m | Avg:  8m 51s | Max: 34m 34s | Hits:  99%/101385
      🟩 arm64              Pass: 100%/8   | Total: 39m 01s | Avg:  4m 52s | Max:  5m 06s | Hits:  99%/6808  
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total:  1h 05m | Avg:  4m 22s | Max: 13m 09s | Hits:  99%/11554 
      🟩 11.8               Pass: 100%/3   | Total: 14m 06s | Avg:  4m 42s | Max:  4m 54s | Hits:  99%/2553  
      🔍 12.4               Pass:  99%/113 | Total: 17h 29m | Avg:  9m 17s | Max: 34m 34s | Hits:  99%/94086 
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda17       Pass: 100%/2   | Total:  7m 17s | Avg:  3m 38s | Max:  3m 43s | Hits: 100%/1408  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 05m | Avg:  4m 22s | Max: 13m 09s | Hits:  99%/11554 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 14m 06s | Avg:  4m 42s | Max:  4m 54s | Hits:  99%/2553  
      🔍 nvcc12.4           Pass:  99%/111 | Total: 17h 22m | Avg:  9m 23s | Max: 34m 34s | Hits:  99%/92678 
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total:  7m 17s | Avg:  3m 38s | Max:  3m 43s | Hits: 100%/1408  
      🔍 nvcc               Pass:  99%/129 | Total: 18h 41m | Avg:  8m 41s | Max: 34m 34s | Hits:  99%/106785
    🔍 cxx_full: clang17 🔍
      🟩 clang9             Pass: 100%/6   | Total: 27m 06s | Avg:  4m 31s | Max:  5m 17s | Hits: 100%/4884  
      🟩 clang10            Pass: 100%/3   | Total: 15m 14s | Avg:  5m 04s | Max:  5m 20s | Hits: 100%/2559  
      🟩 clang11            Pass: 100%/4   | Total: 17m 42s | Avg:  4m 25s | Max:  4m 41s | Hits: 100%/3412  
      🟩 clang12            Pass: 100%/4   | Total: 17m 37s | Avg:  4m 24s | Max:  4m 40s | Hits: 100%/3412  
      🟩 clang13            Pass: 100%/4   | Total: 19m 05s | Avg:  4m 46s | Max:  5m 45s | Hits: 100%/3412  
      🟩 clang14            Pass: 100%/4   | Total: 17m 59s | Avg:  4m 29s | Max:  4m 38s | Hits: 100%/3412  
      🟩 clang15            Pass: 100%/4   | Total: 18m 59s | Avg:  4m 44s | Max:  5m 10s | Hits: 100%/3404  
      🟩 clang16            Pass: 100%/4   | Total: 18m 35s | Avg:  4m 38s | Max:  4m 56s | Hits: 100%/3404  
      🔍 clang17            Pass:  96%/26  | Total:  5h 43m | Avg: 13m 13s | Max: 30m 56s | Hits: 100%/20981 
      🟩 gcc6               Pass: 100%/2   | Total:  7m 22s | Avg:  3m 41s | Max:  3m 46s | Hits:  99%/1550  
      🟩 gcc7               Pass: 100%/6   | Total: 23m 36s | Avg:  3m 56s | Max:  4m 26s | Hits:  99%/4887  
      🟩 gcc8               Pass: 100%/6   | Total: 24m 33s | Avg:  4m 05s | Max:  4m 20s | Hits:  99%/4887  
      🟩 gcc9               Pass: 100%/6   | Total: 25m 40s | Avg:  4m 16s | Max:  5m 20s | Hits:  99%/4887  
      🟩 gcc10              Pass: 100%/4   | Total: 18m 05s | Avg:  4m 31s | Max:  4m 46s | Hits:  99%/3412  
      🟩 gcc11              Pass: 100%/7   | Total: 32m 49s | Avg:  4m 41s | Max:  4m 54s | Hits:  99%/5957  
      🟩 gcc12              Pass: 100%/4   | Total: 18m 55s | Avg:  4m 43s | Max:  4m 52s | Hits:  99%/3404  
      🟩 gcc13              Pass: 100%/28  | Total:  6h 35m | Avg: 14m 07s | Max: 34m 34s | Hits:  99%/23828 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 16m 20s | Avg:  5m 26s | Max:  5m 49s | Hits: 100%/2331  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 13m 09s | Avg: 13m 09s | Max: 13m 09s | Hits:  98%/695   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 22m 25s | Avg: 11m 12s | Max: 11m 18s | Hits:  98%/1390  
      🟩 MSVC14.39          Pass: 100%/3   | Total: 34m 42s | Avg: 11m 34s | Max: 12m 02s | Hits:  98%/2085  
    🔍 cxx_name: clang 🔍
      🔍 clang              Pass:  98%/59  | Total:  8h 16m | Avg:  8m 24s | Max: 30m 56s | Hits: 100%/48880 
      🟩 gcc                Pass: 100%/63  | Total:  9h 06m | Avg:  8m 40s | Max: 34m 34s | Hits:  99%/52812 
      🟩 Intel              Pass: 100%/3   | Total: 16m 20s | Avg:  5m 26s | Max:  5m 49s | Hits: 100%/2331  
      🟩 MSVC               Pass: 100%/6   | Total:  1h 10m | Avg: 11m 42s | Max: 13m 09s | Hits:  98%/4170  
    🔍 jobs: GraphCapture 🔍
      🟩 Build              Pass: 100%/99  | Total:  8h 24m | Avg:  5m 05s | Max: 21m 38s | Hits:  99%/81812 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 19m | Avg: 17m 22s | Max: 25m 09s | Hits:  99%/6808  
      🔍 GraphCapture       Pass:  87%/8   | Total:  2h 06m | Avg: 15m 45s | Max: 25m 45s | Hits:  99%/5957  
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 25m | Avg: 18m 14s | Max: 22m 02s | Hits:  99%/6808  
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 33m | Avg: 26m 43s | Max: 34m 34s | Hits:  99%/6808  
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total: 52m 26s | Avg:  3m 44s | Max:  4m 03s | Hits:  99%/10859 
      🟩 ubuntu20.04        Pass: 100%/35  | Total:  2h 41m | Avg:  4m 36s | Max:  5m 45s | Hits:  99%/29855 
      🔍 ubuntu22.04        Pass:  98%/76  | Total: 14h 04m | Avg: 11m 07s | Max: 34m 34s | Hits:  99%/63309 
      🟩 windows2022        Pass: 100%/6   | Total:  1h 10m | Avg: 11m 42s | Max: 13m 09s | Hits:  98%/4170  
    🔍 std: 14 🔍
      🟩 11                 Pass: 100%/34  | Total:  4h 24m | Avg:  7m 47s | Max: 22m 23s | Hits:  99%/28503 
      🔍 14                 Pass:  97%/37  | Total:  5h 05m | Avg:  8m 15s | Max: 34m 34s | Hits:  99%/29737 
      🟩 17                 Pass: 100%/36  | Total:  4h 59m | Avg:  8m 19s | Max: 31m 03s | Hits:  99%/29822 
      🟩 20                 Pass: 100%/24  | Total:  4h 18m | Avg: 10m 47s | Max: 27m 01s | Hits:  99%/20131 
    🟨 gpu
      🟨 v100               Pass:  99%/131 | Total: 18h 49m | Avg:  8m 37s | Max: 34m 34s | Hits:  99%/108193
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 14m 06s | Avg:  4m 42s | Max:  4m 54s | Hits:  99%/2553  
      🟩 90a                Pass: 100%/4   | Total: 14m 45s | Avg:  3m 41s | Max:  3m 53s | Hits:  99%/3404  
    
  • 🟨 libcudacxx: Pass: 99%/112 | Total: 1d 09h | Avg: 18m 00s | Max: 1h 03m | Hits: 58%/273016

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  99%/104 | Total:  1d 07h | Avg: 18m 13s | Max:  1h 03m | Hits:  58%/250688
      🟩 arm64              Pass: 100%/8   | Total:  2h 01m | Avg: 15m 14s | Max: 22m 07s | Hits:  51%/22328 
    🔍 ctk: 12.4 🔍
      🟩 11.1               Pass: 100%/15  | Total:  5h 02m | Avg: 20m 11s | Max: 39m 56s | Hits:  52%/39735 
      🟩 11.8               Pass: 100%/3   | Total: 59m 18s | Avg: 19m 46s | Max: 22m 08s | Hits:  46%/8055  
      🔍 12.4               Pass:  98%/94  | Total:  1d 03h | Avg: 17m 36s | Max:  1h 03m | Hits:  59%/225226
    🔍 cudacxx_full: nvcc12.4 🔍
      🟩 clang-cuda17       Pass: 100%/2   | Total: 35m 55s | Avg: 17m 57s | Max: 18m 04s | Hits:  37%/6099  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  5h 02m | Avg: 20m 11s | Max: 39m 56s | Hits:  52%/39735 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 59m 18s | Avg: 19m 46s | Max: 22m 08s | Hits:  46%/8055  
      🔍 nvcc12.4           Pass:  98%/92  | Total:  1d 02h | Avg: 17m 36s | Max:  1h 03m | Hits:  60%/219127
    🔍 cudacxx_name: nvcc 🔍
      🟩 clang-cuda         Pass: 100%/2   | Total: 35m 55s | Avg: 17m 57s | Max: 18m 04s | Hits:  37%/6099  
      🔍 nvcc               Pass:  99%/110 | Total:  1d 09h | Avg: 18m 00s | Max:  1h 03m | Hits:  58%/266917
    🔍 cxx_full: gcc13 🔍
      🟩 clang9             Pass: 100%/6   | Total:  1h 52m | Avg: 18m 46s | Max: 28m 13s | Hits:  51%/16142 
      🟩 clang10            Pass: 100%/3   | Total:  1h 04m | Avg: 21m 20s | Max: 22m 47s | Hits:  46%/8100  
      🟩 clang11            Pass: 100%/4   | Total: 50m 10s | Avg: 12m 32s | Max: 22m 31s | Hits:  74%/11172 
      🟩 clang12            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 42s | Max: 20m 27s | Hits:  60%/11172 
      🟩 clang13            Pass: 100%/4   | Total: 45m 33s | Avg: 11m 23s | Max: 19m 36s | Hits:  69%/11172 
      🟩 clang14            Pass: 100%/4   | Total:  1h 02m | Avg: 15m 33s | Max: 20m 46s | Hits:  53%/11172 
      🟩 clang15            Pass: 100%/4   | Total:  1h 00m | Avg: 15m 11s | Max: 19m 32s | Hits:  61%/11164 
      🟩 clang16            Pass: 100%/4   | Total:  1h 10m | Avg: 17m 34s | Max: 19m 32s | Hits:  53%/11164 
      🟩 clang17            Pass: 100%/14  | Total:  5h 40m | Avg: 24m 19s | Max:  1h 03m | Hits:  47%/28427 
      🟩 gcc6               Pass: 100%/2   | Total: 53m 03s | Avg: 26m 31s | Max: 39m 38s | Hits:  50%/5036  
      🟩 gcc7               Pass: 100%/6   | Total:  1h 40m | Avg: 16m 41s | Max: 36m 09s | Hits:  68%/16128 
      🟩 gcc8               Pass: 100%/6   | Total:  1h 49m | Avg: 18m 13s | Max: 39m 56s | Hits:  52%/16136 
      🟩 gcc9               Pass: 100%/6   | Total:  1h 13m | Avg: 12m 14s | Max: 20m 00s | Hits:  63%/16140 
      🟩 gcc10              Pass: 100%/4   | Total: 43m 44s | Avg: 10m 56s | Max: 19m 19s | Hits:  69%/11172 
      🟩 gcc11              Pass: 100%/7   | Total:  2h 00m | Avg: 17m 13s | Max: 22m 08s | Hits:  54%/19219 
      🟩 gcc12              Pass: 100%/4   | Total:  1h 10m | Avg: 17m 34s | Max: 22m 07s | Hits:  49%/11164 
      🔍 gcc13              Pass:  95%/21  | Total:  5h 55m | Avg: 16m 56s | Max: 42m 05s | Hits:  68%/33875 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 03m | Avg: 21m 02s | Max: 21m 46s | Hits:  46%/8090  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 24m 51s | Avg: 24m 51s | Max: 24m 51s | Hits:  50%/2536  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 52m 04s | Avg: 26m 02s | Max: 32m 29s | Hits:  56%/5434  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 22m | Avg: 27m 27s | Max: 32m 43s | Hits:  46%/8401  
    🔍 cxx_name: gcc 🔍
      🟩 clang              Pass: 100%/47  | Total: 14h 29m | Avg: 18m 29s | Max:  1h 03m | Hits:  56%/119685
      🔍 gcc                Pass:  98%/56  | Total: 15h 26m | Avg: 16m 32s | Max: 42m 05s | Hits:  61%/128870
      🟩 Intel              Pass: 100%/3   | Total:  1h 03m | Avg: 21m 02s | Max: 21m 46s | Hits:  46%/8090  
      🟩 MSVC               Pass: 100%/6   | Total:  2h 39m | Avg: 26m 32s | Max: 32m 43s | Hits:  50%/16371 
    🔍 jobs: Test 🔍
      🟩 Build              Pass: 100%/99  | Total:  1d 03h | Avg: 16m 30s | Max: 39m 56s | Hits:  58%/272996
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 27m | Avg: 21m 49s | Max: 25m 59s | Hits: 100%/20    
      🔍 Test               Pass:  87%/8   | Total:  4h 53m | Avg: 36m 44s | Max:  1h 03m
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
    🔍 os: ubuntu22.04 🔍
      🟩 ubuntu18.04        Pass: 100%/14  | Total:  4h 38m | Avg: 19m 51s | Max: 39m 56s | Hits:  52%/37199 
      🟩 ubuntu20.04        Pass: 100%/35  | Total:  8h 19m | Avg: 14m 15s | Max: 22m 47s | Hits:  63%/96343 
      🔍 ubuntu22.04        Pass:  98%/57  | Total: 18h 01m | Avg: 18m 58s | Max:  1h 03m | Hits:  56%/123103
      🟩 windows2022        Pass: 100%/6   | Total:  2h 39m | Avg: 26m 32s | Max: 32m 43s | Hits:  50%/16371 
    🔍 std: 14 🔍
      🟩 11                 Pass: 100%/29  | Total:  9h 09m | Avg: 18m 56s | Max: 39m 56s | Hits:  70%/57966 
      🔍 14                 Pass:  96%/32  | Total:  9h 04m | Avg: 17m 01s | Max: 47m 44s | Hits:  55%/81788 
      🟩 17                 Pass: 100%/31  | Total:  9h 18m | Avg: 18m 00s | Max: 32m 29s | Hits:  49%/84134 
      🟩 20                 Pass: 100%/19  | Total:  6h 03m | Avg: 19m 08s | Max:  1h 03m | Hits:  60%/49128 
    🟨 gpu
      🟨 v100               Pass:  99%/112 | Total:  1d 09h | Avg: 18m 00s | Max:  1h 03m | Hits:  58%/273016
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 59m 18s | Avg: 19m 46s | Max: 22m 08s | Hits:  46%/8055  
      🟩 90a                Pass: 100%/4   | Total: 15m 28s | Avg:  3m 52s | Max:  4m 40s | Hits:  99%/11527 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 11h 12m | Avg: 5m 42s | Max: 31m 14s | Hits: 99%/139266

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 10h 39m | Avg:  5m 48s | Max: 31m 14s | Hits:  99%/129822
      🟩 arm64              Pass: 100%/8   | Total: 33m 16s | Avg:  4m 09s | Max:  4m 57s | Hits:  99%/9444  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 01m | Avg:  4m 06s | Max: 14m 10s | Hits:  99%/17705 
      🟩 11.8               Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  4m 05s | Hits:  99%/3543  
      🟩 12.4               Pass: 100%/100 | Total:  9h 59m | Avg:  5m 59s | Max: 31m 14s | Hits:  99%/118018
    🟩 cudacxx_full
      🟩 clang-cuda17       Pass: 100%/2   | Total:  7m 46s | Avg:  3m 53s | Max:  3m 57s | Hits: 100%/2360  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 01m | Avg:  4m 06s | Max: 14m 10s | Hits:  99%/17705 
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  4m 05s | Hits:  99%/3543  
      🟩 nvcc12.4           Pass: 100%/98  | Total:  9h 52m | Avg:  6m 02s | Max: 31m 14s | Hits:  99%/115658
    🟩 cudacxx_name
      🟩 clang-cuda         Pass: 100%/2   | Total:  7m 46s | Avg:  3m 53s | Max:  3m 57s | Hits: 100%/2360  
      🟩 nvcc               Pass: 100%/116 | Total: 11h 04m | Avg:  5m 43s | Max: 31m 14s | Hits:  99%/136906
    🟩 cxx_full
      🟩 clang9             Pass: 100%/6   | Total: 24m 52s | Avg:  4m 08s | Max:  5m 12s | Hits: 100%/7080  
      🟩 clang10            Pass: 100%/3   | Total: 13m 19s | Avg:  4m 26s | Max:  4m 43s | Hits: 100%/3540  
      🟩 clang11            Pass: 100%/4   | Total: 15m 02s | Avg:  3m 45s | Max:  3m 50s | Hits: 100%/4720  
      🟩 clang12            Pass: 100%/4   | Total: 15m 07s | Avg:  3m 46s | Max:  4m 02s | Hits: 100%/4720  
      🟩 clang13            Pass: 100%/4   | Total: 15m 02s | Avg:  3m 45s | Max:  4m 08s | Hits: 100%/4720  
      🟩 clang14            Pass: 100%/4   | Total: 15m 01s | Avg:  3m 45s | Max:  4m 04s | Hits: 100%/4720  
      🟩 clang15            Pass: 100%/4   | Total: 16m 27s | Avg:  4m 06s | Max:  4m 38s | Hits: 100%/4720  
      🟩 clang16            Pass: 100%/4   | Total: 15m 43s | Avg:  3m 55s | Max:  4m 23s | Hits: 100%/4720  
      🟩 clang17            Pass: 100%/18  | Total:  2h 00m | Avg:  6m 43s | Max: 16m 33s | Hits: 100%/21240 
      🟩 gcc6               Pass: 100%/2   | Total:  6m 11s | Avg:  3m 05s | Max:  3m 08s | Hits:  99%/2360  
      🟩 gcc7               Pass: 100%/6   | Total: 48m 04s | Avg:  8m 00s | Max: 31m 14s | Hits:  86%/7086  
      🟩 gcc8               Pass: 100%/6   | Total: 22m 30s | Avg:  3m 45s | Max:  4m 04s | Hits:  99%/7086  
      🟩 gcc9               Pass: 100%/6   | Total: 20m 19s | Avg:  3m 23s | Max:  3m 47s | Hits:  99%/7086  
      🟩 gcc10              Pass: 100%/4   | Total: 14m 48s | Avg:  3m 42s | Max:  4m 00s | Hits:  99%/4724  
      🟩 gcc11              Pass: 100%/7   | Total: 26m 21s | Avg:  3m 45s | Max:  4m 05s | Hits:  99%/8267  
      🟩 gcc12              Pass: 100%/4   | Total: 16m 35s | Avg:  4m 08s | Max:  4m 33s | Hits:  99%/4724  
      🟩 gcc13              Pass: 100%/20  | Total:  1h 59m | Avg:  5m 59s | Max: 15m 28s | Hits:  99%/23620 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 14m 44s | Avg:  4m 54s | Max:  5m 05s | Hits: 100%/3549  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 14m 10s | Avg: 14m 10s | Max: 14m 10s | Hits:  98%/1176  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 23m 36s | Avg: 11m 48s | Max: 11m 52s | Hits:  98%/2352  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  1h 33m | Avg: 15m 39s | Max: 21m 22s | Hits:  98%/7056  
    🟩 cxx_name
      🟩 clang              Pass: 100%/51  | Total:  4h 11m | Avg:  4m 55s | Max: 16m 33s | Hits: 100%/60180 
      🟩 gcc                Pass: 100%/55  | Total:  4h 34m | Avg:  4m 59s | Max: 31m 14s | Hits:  98%/64953 
      🟩 Intel              Pass: 100%/3   | Total: 14m 44s | Avg:  4m 54s | Max:  5m 05s | Hits: 100%/3549  
      🟩 MSVC               Pass: 100%/9   | Total:  2h 11m | Avg: 14m 37s | Max: 21m 22s | Hits:  98%/10584 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 11h 12m | Avg:  5m 42s | Max: 31m 14s | Hits:  99%/139266
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  7h 40m | Avg:  4m 39s | Max: 31m 14s | Hits:  99%/116850
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 46m | Avg:  9m 42s | Max: 21m 22s | Hits:  99%/12972 
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 45m | Avg: 13m 08s | Max: 16m 33s | Hits:  99%/9444  
    🟩 os
      🟩 ubuntu18.04        Pass: 100%/14  | Total: 47m 34s | Avg:  3m 23s | Max:  4m 12s | Hits:  99%/16529 
      🟩 ubuntu20.04        Pass: 100%/35  | Total:  2h 42m | Avg:  4m 38s | Max: 31m 14s | Hits:  97%/41313 
      🟩 ubuntu22.04        Pass: 100%/60  | Total:  5h 30m | Avg:  5m 30s | Max: 16m 33s | Hits:  99%/70840 
      🟩 windows2022        Pass: 100%/9   | Total:  2h 11m | Avg: 14m 37s | Max: 21m 22s | Hits:  98%/10584 
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 13s | Avg:  3m 44s | Max:  4m 05s | Hits:  99%/3543  
      🟩 90a                Pass: 100%/4   | Total: 13m 11s | Avg:  3m 17s | Max:  3m 27s | Hits:  99%/4724  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 07m | Avg:  4m 15s | Max: 10m 58s | Hits:  99%/35418 
      🟩 14                 Pass: 100%/34  | Total:  3h 16m | Avg:  5m 46s | Max: 17m 12s | Hits:  99%/40122 
      🟩 17                 Pass: 100%/33  | Total:  3h 39m | Avg:  6m 38s | Max: 31m 14s | Hits:  97%/38946 
      🟩 20                 Pass: 100%/21  | Total:  2h 09m | Avg:  6m 09s | Max: 21m 22s | Hits:  99%/24780 
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
+/- CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental

🏃‍ Runner counts (total jobs: 361)

# Runner
264 linux-amd64-cpu16
52 linux-amd64-gpu-v100-latest-1
24 linux-arm64-cpu16
21 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch 4 times, most recently from b587381 to 717c085 Compare June 14, 2024 15:31
@miscco miscco changed the base branch from main to pull-request/1831 June 14, 2024 15:34
@miscco miscco changed the title from "Implement cuda::uninitialized_async_buffer" to "[PoC]: Implement cuda::experimental::uninitialized_async_buffer" Jun 14, 2024
@miscco miscco added the "feature request" (New feature or request.) and "CUDA Next" (Feature intended for the Cuda Next experimental library) labels Jun 14, 2024
@miscco miscco force-pushed the uninitialized_async_buffer branch from 717c085 to 38a2151 Compare June 14, 2024 15:39
@copy-pr-bot copy-pr-bot bot force-pushed the pull-request/1831 branch from 2f2f049 to 41ee97a Compare June 14, 2024 15:39
Contributor

🟩 CI finished in 7h 20m: Pass: 100%/55 | Total: 2h 23m | Avg: 2m 37s | Max: 8m 06s | Hits: 95%/1748
  • 🟩 cudax: Pass: 100%/55 | Total: 2h 23m | Avg: 2m 37s | Max: 8m 06s | Hits: 95%/1748

    🟩 cpu
      🟩 amd64              Pass: 100%/51  | Total:  2h 15m | Avg:  2m 39s | Max:  8m 06s | Hits:  96%/1620  
      🟩 arm64              Pass: 100%/4   | Total:  8m 08s | Avg:  2m 02s | Max:  2m 12s | Hits:  90%/128   
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 02m | Avg:  2m 42s | Max:  7m 57s | Hits:  95%/730   
      🟩 12.4               Pass: 100%/32  | Total:  1h 21m | Avg:  2m 33s | Max:  8m 06s | Hits:  96%/1018  
    🟩 cudacxx_full
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 02m | Avg:  2m 42s | Max:  7m 57s | Hits:  95%/730   
      🟩 nvcc12.4           Pass: 100%/32  | Total:  1h 21m | Avg:  2m 33s | Max:  8m 06s | Hits:  96%/1018  
    🟩 cudacxx_name
      🟩 nvcc               Pass: 100%/55  | Total:  2h 23m | Avg:  2m 37s | Max:  8m 06s | Hits:  95%/1748  
    🟩 cxx_full
      🟩 clang9             Pass: 100%/2   | Total:  4m 25s | Avg:  2m 12s | Max:  2m 22s | Hits: 100%/64    
      🟩 clang10            Pass: 100%/2   | Total:  4m 17s | Avg:  2m 08s | Max:  2m 14s | Hits: 100%/64    
      🟩 clang11            Pass: 100%/4   | Total:  8m 38s | Avg:  2m 09s | Max:  2m 24s | Hits: 100%/128   
      🟩 clang12            Pass: 100%/4   | Total:  8m 21s | Avg:  2m 05s | Max:  2m 08s | Hits: 100%/128   
      🟩 clang13            Pass: 100%/4   | Total:  8m 16s | Avg:  2m 04s | Max:  2m 07s | Hits: 100%/128   
      🟩 clang14            Pass: 100%/6   | Total: 17m 11s | Avg:  2m 51s | Max:  4m 30s | Hits: 100%/192   
      🟩 clang15            Pass: 100%/2   | Total:  4m 17s | Avg:  2m 08s | Max:  2m 11s | Hits: 100%/64    
      🟩 clang16            Pass: 100%/6   | Total: 16m 38s | Avg:  2m 46s | Max:  3m 59s | Hits:  97%/192   
      🟩 gcc9               Pass: 100%/2   | Total:  4m 00s | Avg:  2m 00s | Max:  2m 01s | Hits:  93%/64    
      🟩 gcc10              Pass: 100%/4   | Total:  8m 14s | Avg:  2m 03s | Max:  2m 09s | Hits:  93%/128   
      🟩 gcc11              Pass: 100%/4   | Total:  8m 04s | Avg:  2m 01s | Max:  2m 10s | Hits:  93%/128   
      🟩 gcc12              Pass: 100%/12  | Total: 33m 01s | Avg:  2m 45s | Max:  5m 08s | Hits:  92%/384   
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  2m 31s | Avg:  2m 31s | Max:  2m 31s | Hits: 100%/32    
      🟩 MSVC14.36          Pass: 100%/1   | Total:  7m 57s | Avg:  7m 57s | Max:  7m 57s | Hits:  61%/26    
      🟩 MSVC14.39          Pass: 100%/1   | Total:  8m 06s | Avg:  8m 06s | Max:  8m 06s | Hits:  69%/26    
    🟩 cxx_name
      🟩 clang              Pass: 100%/30  | Total:  1h 12m | Avg:  2m 24s | Max:  4m 30s | Hits:  99%/960   
      🟩 gcc                Pass: 100%/22  | Total: 53m 19s | Avg:  2m 25s | Max:  5m 08s | Hits:  93%/704   
      🟩 Intel              Pass: 100%/1   | Total:  2m 31s | Avg:  2m 31s | Max:  2m 31s | Hits: 100%/32    
      🟩 MSVC               Pass: 100%/2   | Total: 16m 03s | Avg:  8m 01s | Max:  8m 06s | Hits:  65%/52    
    🟩 gpu
      🟩 v100               Pass: 100%/55  | Total:  2h 23m | Avg:  2m 37s | Max:  8m 06s | Hits:  95%/1748  
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  1h 50m | Avg:  2m 21s | Max:  8m 06s | Hits:  95%/1492  
      🟩 Test               Pass: 100%/8   | Total: 33m 25s | Avg:  4m 10s | Max:  5m 08s | Hits:  96%/256   
    🟩 os
      🟩 ubuntu20.04        Pass: 100%/28  | Total:  1h 03m | Avg:  2m 15s | Max:  4m 30s | Hits:  98%/896   
      🟩 ubuntu22.04        Pass: 100%/25  | Total:  1h 04m | Avg:  2m 34s | Max:  5m 08s | Hits:  95%/800   
      🟩 windows2022        Pass: 100%/2   | Total: 16m 03s | Avg:  8m 01s | Max:  8m 06s | Hits:  65%/52    
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s | Hits:  93%/32    
      🟩 90a                Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s | Hits:  93%/32    
    🟩 std
      🟩 17                 Pass: 100%/31  | Total:  1h 13m | Avg:  2m 23s | Max:  5m 08s | Hits:  96%/992   
      🟩 20                 Pass: 100%/24  | Total:  1h 09m | Avg:  2m 54s | Max:  8m 06s | Hits:  94%/756   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental

🏃‍ Runner counts (total jobs: 55)

# Runner
41 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch from 38a2151 to 8981c0d Compare June 17, 2024 16:45
@copy-pr-bot copy-pr-bot bot force-pushed the pull-request/1831 branch from 41ee97a to 4cf40b9 Compare June 17, 2024 16:46
Contributor

🟩 CI finished in 1h 15m: Pass: 100%/55 | Total: 2h 25m | Avg: 2m 38s | Max: 8m 05s | Hits: 91%/1748
  • 🟩 cudax: Pass: 100%/55 | Total: 2h 25m | Avg: 2m 38s | Max: 8m 05s | Hits: 91%/1748

    🟩 cpu
      🟩 amd64              Pass: 100%/51  | Total:  2h 16m | Avg:  2m 41s | Max:  8m 05s | Hits:  91%/1620  
      🟩 arm64              Pass: 100%/4   | Total:  8m 20s | Avg:  2m 05s | Max:  2m 12s | Hits:  90%/128   
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 01m | Avg:  2m 41s | Max:  8m 04s | Hits:  90%/730   
      🟩 12.4               Pass: 100%/32  | Total:  1h 23m | Avg:  2m 36s | Max:  8m 05s | Hits:  91%/1018  
    🟩 cudacxx_full
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 01m | Avg:  2m 41s | Max:  8m 04s | Hits:  90%/730   
      🟩 nvcc12.4           Pass: 100%/32  | Total:  1h 23m | Avg:  2m 36s | Max:  8m 05s | Hits:  91%/1018  
    🟩 cudacxx_name
      🟩 nvcc               Pass: 100%/55  | Total:  2h 25m | Avg:  2m 38s | Max:  8m 05s | Hits:  91%/1748  
    🟩 cxx_full
      🟩 clang9             Pass: 100%/2   | Total:  4m 34s | Avg:  2m 17s | Max:  2m 19s | Hits:  93%/64    
      🟩 clang10            Pass: 100%/2   | Total:  4m 39s | Avg:  2m 19s | Max:  2m 24s | Hits:  93%/64    
      🟩 clang11            Pass: 100%/4   | Total:  9m 01s | Avg:  2m 15s | Max:  2m 18s | Hits:  93%/128   
      🟩 clang12            Pass: 100%/4   | Total:  8m 59s | Avg:  2m 14s | Max:  2m 19s | Hits:  93%/128   
      🟩 clang13            Pass: 100%/4   | Total:  8m 55s | Avg:  2m 13s | Max:  2m 15s | Hits:  93%/128   
      🟩 clang14            Pass: 100%/6   | Total: 16m 17s | Avg:  2m 42s | Max:  3m 50s | Hits:  95%/192   
      🟩 clang15            Pass: 100%/2   | Total:  4m 22s | Avg:  2m 11s | Max:  2m 14s | Hits:  93%/64    
      🟩 clang16            Pass: 100%/6   | Total: 17m 03s | Avg:  2m 50s | Max:  4m 17s | Hits:  95%/192   
      🟩 gcc9               Pass: 100%/2   | Total:  4m 19s | Avg:  2m 09s | Max:  2m 16s | Hits:  87%/64    
      🟩 gcc10              Pass: 100%/4   | Total:  8m 19s | Avg:  2m 04s | Max:  2m 09s | Hits:  87%/128   
      🟩 gcc11              Pass: 100%/4   | Total:  8m 22s | Avg:  2m 05s | Max:  2m 08s | Hits:  87%/128   
      🟩 gcc12              Pass: 100%/12  | Total: 31m 31s | Avg:  2m 37s | Max:  3m 49s | Hits:  89%/384   
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  2m 42s | Avg:  2m 42s | Max:  2m 42s | Hits:  93%/32    
      🟩 MSVC14.36          Pass: 100%/1   | Total:  8m 04s | Avg:  8m 04s | Max:  8m 04s | Hits:  61%/26    
      🟩 MSVC14.39          Pass: 100%/1   | Total:  8m 05s | Avg:  8m 05s | Max:  8m 05s | Hits:  61%/26    
    🟩 cxx_name
      🟩 clang              Pass: 100%/30  | Total:  1h 13m | Avg:  2m 27s | Max:  4m 17s | Hits:  94%/960   
      🟩 gcc                Pass: 100%/22  | Total: 52m 31s | Avg:  2m 23s | Max:  3m 49s | Hits:  88%/704   
      🟩 Intel              Pass: 100%/1   | Total:  2m 42s | Avg:  2m 42s | Max:  2m 42s | Hits:  93%/32    
      🟩 MSVC               Pass: 100%/2   | Total: 16m 09s | Avg:  8m 04s | Max:  8m 05s | Hits:  61%/52    
    🟩 gpu
      🟩 v100               Pass: 100%/55  | Total:  2h 25m | Avg:  2m 38s | Max:  8m 05s | Hits:  91%/1748  
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  1h 55m | Avg:  2m 27s | Max:  8m 05s | Hits:  90%/1492  
      🟩 Test               Pass: 100%/8   | Total: 30m 01s | Avg:  3m 45s | Max:  4m 17s | Hits:  96%/256   
    🟩 os
      🟩 ubuntu20.04        Pass: 100%/28  | Total:  1h 05m | Avg:  2m 19s | Max:  3m 50s | Hits:  92%/896   
      🟩 ubuntu22.04        Pass: 100%/25  | Total:  1h 04m | Avg:  2m 33s | Max:  4m 17s | Hits:  91%/800   
      🟩 windows2022        Pass: 100%/2   | Total: 16m 09s | Avg:  8m 04s | Max:  8m 05s | Hits:  61%/52    
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s | Hits:  87%/32    
      🟩 90a                Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s | Hits:  87%/32    
    🟩 std
      🟩 17                 Pass: 100%/31  | Total:  1h 14m | Avg:  2m 25s | Max:  4m 17s | Hits:  91%/992   
      🟩 20                 Pass: 100%/24  | Total:  1h 10m | Avg:  2m 55s | Max:  8m 05s | Hits:  90%/756   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental

🏃‍ Runner counts (total jobs: 55)

# Runner
41 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@copy-pr-bot copy-pr-bot bot force-pushed the pull-request/1831 branch from 4cf40b9 to e9bfaca Compare June 18, 2024 13:04
@miscco miscco force-pushed the uninitialized_async_buffer branch from 8981c0d to 061ae52 Compare June 18, 2024 13:04
@miscco miscco force-pushed the uninitialized_async_buffer branch from 061ae52 to 1a23a28 Compare July 23, 2024 16:13
@miscco miscco requested review from a team as code owners July 23, 2024 16:13
Contributor

🟨 CI finished in 7h 36m: Pass: 99%/417 | Total: 3d 15h | Avg: 12m 38s | Max: 1h 09m | Hits: 44%/34308
  • 🟨 cudax: Pass: 94%/55 | Total: 2h 39m | Avg: 2m 53s | Max: 10m 09s

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  94%/51  | Total:  2h 30m | Avg:  2m 56s | Max: 10m 09s
      🟩 arm64              Pass: 100%/4   | Total:  8m 46s | Avg:  2m 11s | Max:  2m 14s
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  93%/47  | Total:  2h 07m | Avg:  2m 42s | Max: 10m 09s
      🟩 Test               Pass: 100%/8   | Total: 31m 32s | Avg:  3m 56s | Max:  4m 27s
    🟨 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  4m 59s | Avg:  2m 29s | Max:  2m 33s
      🟩 Clang10            Pass: 100%/2   | Total:  4m 49s | Avg:  2m 24s | Max:  2m 28s
      🟩 Clang11            Pass: 100%/4   | Total:  9m 30s | Avg:  2m 22s | Max:  2m 25s
      🟩 Clang12            Pass: 100%/4   | Total:  9m 19s | Avg:  2m 19s | Max:  2m 24s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 40s | Avg:  2m 40s | Max:  3m 01s
      🟩 Clang14            Pass: 100%/6   | Total: 16m 58s | Avg:  2m 49s | Max:  3m 41s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 24s | Avg:  2m 42s | Max:  2m 50s
      🟩 Clang16            Pass: 100%/6   | Total: 17m 40s | Avg:  2m 56s | Max:  4m 27s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 28s | Avg:  2m 14s | Max:  2m 14s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 49s | Avg:  2m 27s | Max:  2m 39s
      🟩 GCC11              Pass: 100%/4   | Total:  9m 18s | Avg:  2m 19s | Max:  2m 32s
      🟩 GCC12              Pass: 100%/12  | Total: 34m 13s | Avg:  2m 51s | Max:  4m 27s
      🟥 Intel2023.2.0      Pass:   0%/1   | Total:  3m 10s | Avg:  3m 10s | Max:  3m 10s
      🟥 MSVC14.36          Pass:   0%/1   | Total:  8m 35s | Avg:  8m 35s | Max:  8m 35s
      🟥 MSVC14.39          Pass:   0%/1   | Total: 10m 09s | Avg: 10m 09s | Max: 10m 09s
    🟨 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 19m | Avg:  2m 38s | Max:  4m 27s
      🟩 GCC                Pass: 100%/22  | Total: 57m 48s | Avg:  2m 37s | Max:  4m 27s
      🟥 Intel              Pass:   0%/1   | Total:  3m 10s | Avg:  3m 10s | Max:  3m 10s
      🟥 MSVC               Pass:   0%/2   | Total: 18m 44s | Avg:  9m 22s | Max: 10m 09s
    🟨 cudacxx_family
      🟨 nvcc               Pass:  94%/55  | Total:  2h 39m | Avg:  2m 53s | Max: 10m 09s
    🟨 gpu
      🟨 v100               Pass:  94%/55  | Total:  2h 39m | Avg:  2m 53s | Max: 10m 09s
    🟨 ctk
      🟨 12.0               Pass:  95%/23  | Total:  1h 07m | Avg:  2m 55s | Max:  8m 35s
      🟨 12.5               Pass:  93%/32  | Total:  1h 31m | Avg:  2m 52s | Max: 10m 09s
    🟨 cudacxx
      🟨 nvcc12.0           Pass:  95%/23  | Total:  1h 07m | Avg:  2m 55s | Max:  8m 35s
      🟨 nvcc12.5           Pass:  93%/32  | Total:  1h 31m | Avg:  2m 52s | Max: 10m 09s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
      🟩 90a                Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
    🟨 std
      🟨 17                 Pass:  96%/31  | Total:  1h 20m | Avg:  2m 35s | Max:  4m 27s
      🟨 20                 Pass:  91%/24  | Total:  1h 18m | Avg:  3m 16s | Max: 10m 09s
    
  • 🟩 cub: Pass: 100%/131 | Total: 1d 04h | Avg: 13m 11s | Max: 1h 09m | Hits: 50%/4296

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total: 22h 24m | Avg: 10m 55s | Max:  1h 09m | Hits:  50%/4296  
      🟩 arm64              Pass: 100%/8   | Total:  6h 23m | Avg: 47m 56s | Max: 50m 11s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 44m | Avg:  6m 57s | Max: 52m 25s | Hits:  50%/716   
      🟩 11.8               Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 05s
      🟩 12.5               Pass: 100%/113 | Total:  1d 02h | Avg: 14m 14s | Max:  1h 09m | Hits:  51%/3580  
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  7m 32s | Avg:  3m 46s | Max:  3m 59s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 44m | Avg:  6m 57s | Max: 52m 25s | Hits:  50%/716   
      🟩 nvcc11.8           Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 05s
      🟩 nvcc12.5           Pass: 100%/111 | Total:  1d 02h | Avg: 14m 26s | Max:  1h 09m | Hits:  51%/3580  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  7m 32s | Avg:  3m 46s | Max:  3m 59s
      🟩 nvcc               Pass: 100%/129 | Total:  1d 04h | Avg: 13m 20s | Max:  1h 09m | Hits:  50%/4296  
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 27m 18s | Avg:  4m 33s | Max:  5m 38s
      🟩 Clang10            Pass: 100%/3   | Total: 15m 17s | Avg:  5m 05s | Max:  5m 09s
      🟩 Clang11            Pass: 100%/4   | Total: 18m 10s | Avg:  4m 32s | Max:  4m 53s
      🟩 Clang12            Pass: 100%/4   | Total: 17m 55s | Avg:  4m 28s | Max:  4m 40s
      🟩 Clang13            Pass: 100%/4   | Total: 18m 13s | Avg:  4m 33s | Max:  5m 01s
      🟩 Clang14            Pass: 100%/4   | Total: 18m 55s | Avg:  4m 43s | Max:  5m 09s
      🟩 Clang15            Pass: 100%/4   | Total: 18m 16s | Avg:  4m 34s | Max:  5m 04s
      🟩 Clang16            Pass: 100%/4   | Total: 18m 23s | Avg:  4m 35s | Max:  5m 05s
      🟩 Clang17            Pass: 100%/26  | Total:  8h 37m | Avg: 19m 54s | Max: 47m 55s
      🟩 GCC6               Pass: 100%/2   | Total:  6m 39s | Avg:  3m 19s | Max:  3m 22s
      🟩 GCC7               Pass: 100%/6   | Total: 23m 37s | Avg:  3m 56s | Max:  4m 29s
      🟩 GCC8               Pass: 100%/6   | Total: 24m 49s | Avg:  4m 08s | Max:  4m 53s
      🟩 GCC9               Pass: 100%/6   | Total: 24m 11s | Avg:  4m 01s | Max:  4m 35s
      🟩 GCC10              Pass: 100%/4   | Total: 17m 38s | Avg:  4m 24s | Max:  4m 42s
      🟩 GCC11              Pass: 100%/7   | Total: 31m 55s | Avg:  4m 33s | Max:  5m 05s
      🟩 GCC12              Pass: 100%/4   | Total: 18m 22s | Avg:  4m 35s | Max:  5m 08s
      🟩 GCC13              Pass: 100%/28  | Total:  8h 47m | Avg: 18m 49s | Max: 50m 11s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 59s | Avg:  5m 19s | Max:  5m 34s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 52m 25s | Avg: 52m 25s | Max: 52m 25s | Hits:  50%/716   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 06m | Hits:  54%/1432  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 10m | Avg:  1h 03m | Max:  1h 09m | Hits:  48%/2148  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total: 11h 10m | Avg: 11m 21s | Max: 47m 55s
      🟩 GCC                Pass: 100%/63  | Total: 11h 14m | Avg: 10m 42s | Max: 50m 11s
      🟩 Intel              Pass: 100%/3   | Total: 15m 59s | Avg:  5m 19s | Max:  5m 34s
      🟩 MSVC               Pass: 100%/6   | Total:  6h 07m | Avg:  1h 01m | Max:  1h 09m | Hits:  50%/4296  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total:  1d 04h | Avg: 13m 11s | Max:  1h 09m | Hits:  50%/4296  
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 18h 44m | Avg: 11m 21s | Max:  1h 09m | Hits:  50%/4296  
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  2h 25m | Avg: 18m 11s | Max: 20m 24s
      🟩 GraphCapture       Pass: 100%/8   | Total:  2h 00m | Avg: 15m 01s | Max: 17m 21s
      🟩 HostLaunch         Pass: 100%/8   | Total:  2h 22m | Avg: 17m 50s | Max: 19m 00s
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 14m | Avg: 24m 20s | Max: 28m 13s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 13m 47s | Avg:  4m 35s | Max:  5m 05s
      🟩 90a                Pass: 100%/4   | Total: 14m 46s | Avg:  3m 41s | Max:  3m 57s
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  5h 44m | Avg: 10m 07s | Max: 47m 21s
      🟩 14                 Pass: 100%/37  | Total:  8h 45m | Avg: 14m 11s | Max:  1h 00m | Hits:  42%/2148  
      🟩 17                 Pass: 100%/36  | Total:  8h 02m | Avg: 13m 24s | Max:  1h 06m | Hits:  57%/1432  
      🟩 20                 Pass: 100%/24  | Total:  6h 15m | Avg: 15m 38s | Max:  1h 09m | Hits:  63%/716   
    
  • 🟩 thrust: Pass: 100%/118 | Total: 19h 19m | Avg: 9m 49s | Max: 1h 06m | Hits: 51%/13077

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 16h 16m | Avg:  8m 52s | Max:  1h 06m | Hits:  51%/13077 
      🟩 arm64              Pass: 100%/8   | Total:  3h 03m | Avg: 22m 52s | Max: 26m 32s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 45m | Avg:  7m 02s | Max: 56m 14s | Hits:  43%/1453  
      🟩 11.8               Pass: 100%/3   | Total: 11m 33s | Avg:  3m 51s | Max:  4m 02s
      🟩 12.5               Pass: 100%/100 | Total: 17h 21m | Avg: 10m 25s | Max:  1h 06m | Hits:  52%/11624 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total:  8m 25s | Avg:  4m 12s | Max:  4m 21s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 45m | Avg:  7m 02s | Max: 56m 14s | Hits:  43%/1453  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 11m 33s | Avg:  3m 51s | Max:  4m 02s
      🟩 nvcc12.5           Pass: 100%/98  | Total: 17h 13m | Avg: 10m 32s | Max:  1h 06m | Hits:  52%/11624 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 25s | Avg:  4m 12s | Max:  4m 21s
      🟩 nvcc               Pass: 100%/116 | Total: 19h 10m | Avg:  9m 55s | Max:  1h 06m | Hits:  51%/13077 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 26m 18s | Avg:  4m 23s | Max:  4m 58s
      🟩 Clang10            Pass: 100%/3   | Total: 14m 31s | Avg:  4m 50s | Max:  4m 54s
      🟩 Clang11            Pass: 100%/4   | Total: 16m 04s | Avg:  4m 01s | Max:  4m 33s
      🟩 Clang12            Pass: 100%/4   | Total: 16m 33s | Avg:  4m 08s | Max:  4m 35s
      🟩 Clang13            Pass: 100%/4   | Total: 15m 35s | Avg:  3m 53s | Max:  4m 07s
      🟩 Clang14            Pass: 100%/4   | Total: 16m 05s | Avg:  4m 01s | Max:  4m 27s
      🟩 Clang15            Pass: 100%/4   | Total: 16m 53s | Avg:  4m 13s | Max:  4m 23s
      🟩 Clang16            Pass: 100%/4   | Total: 16m 44s | Avg:  4m 11s | Max:  4m 22s
      🟩 Clang17            Pass: 100%/18  | Total:  3h 12m | Avg: 10m 41s | Max: 26m 32s
      🟩 GCC6               Pass: 100%/2   | Total:  6m 55s | Avg:  3m 27s | Max:  3m 53s
      🟩 GCC7               Pass: 100%/6   | Total: 21m 18s | Avg:  3m 33s | Max:  4m 00s
      🟩 GCC8               Pass: 100%/6   | Total: 21m 47s | Avg:  3m 37s | Max:  3m 58s
      🟩 GCC9               Pass: 100%/6   | Total: 22m 51s | Avg:  3m 48s | Max:  4m 26s
      🟩 GCC10              Pass: 100%/4   | Total: 16m 48s | Avg:  4m 12s | Max:  4m 47s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 00m | Avg:  8m 41s | Max: 37m 39s
      🟩 GCC12              Pass: 100%/4   | Total: 17m 32s | Avg:  4m 23s | Max:  4m 36s
      🟩 GCC13              Pass: 100%/20  | Total:  3h 26m | Avg: 10m 19s | Max: 24m 46s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 15m 55s | Avg:  5m 18s | Max:  5m 50s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 56m 14s | Avg: 56m 14s | Max: 56m 14s | Hits:  43%/1453  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 06m | Hits:  15%/2906  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  4h 14m | Avg: 42m 20s | Max:  1h 05m | Hits:  64%/8718  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total:  5h 31m | Avg:  6m 29s | Max: 26m 32s
      🟩 GCC                Pass: 100%/55  | Total:  6h 14m | Avg:  6m 48s | Max: 37m 39s
      🟩 Intel              Pass: 100%/3   | Total: 15m 55s | Avg:  5m 18s | Max:  5m 50s
      🟩 MSVC               Pass: 100%/9   | Total:  7h 17m | Avg: 48m 37s | Max:  1h 06m | Hits:  51%/13077 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 19h 19m | Avg:  9m 49s | Max:  1h 06m | Hits:  51%/13077 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total: 15h 38m | Avg:  9m 29s | Max:  1h 06m | Hits:  27%/8718  
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 53m | Avg: 10m 19s | Max: 21m 03s | Hits:  99%/4359  
      🟩 TestGPU            Pass: 100%/8   | Total:  1h 46m | Avg: 13m 20s | Max: 15m 09s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 11m 33s | Avg:  3m 51s | Max:  4m 02s
      🟩 90a                Pass: 100%/4   | Total: 14m 00s | Avg:  3m 30s | Max:  3m 45s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  2h 48m | Avg:  5m 36s | Max: 18m 39s
      🟩 14                 Pass: 100%/34  | Total:  6h 28m | Avg: 11m 25s | Max:  1h 03m | Hits:  42%/5812  
      🟩 17                 Pass: 100%/33  | Total:  5h 41m | Avg: 10m 20s | Max:  1h 06m | Hits:  58%/4359  
      🟩 20                 Pass: 100%/21  | Total:  4h 21m | Avg: 12m 26s | Max:  1h 03m | Hits:  58%/2906  
    
  • 🟩 libcudacxx: Pass: 100%/112 | Total: 1d 12h | Avg: 19m 46s | Max: 1h 05m | Hits: 36%/16935

    🟩 cpu
      🟩 amd64              Pass: 100%/104 | Total:  1d 10h | Avg: 20m 04s | Max:  1h 05m | Hits:  36%/16935 
      🟩 arm64              Pass: 100%/8   | Total:  2h 07m | Avg: 15m 52s | Max: 19m 29s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  3h 59m | Avg: 15m 58s | Max: 40m 19s | Hits:  40%/2630  
      🟩 11.8               Pass: 100%/3   | Total: 59m 42s | Avg: 19m 54s | Max: 20m 46s
      🟩 12.5               Pass: 100%/94  | Total:  1d 07h | Avg: 20m 22s | Max:  1h 05m | Hits:  36%/14305 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 36m 04s | Avg: 18m 02s | Max: 19m 02s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  3h 59m | Avg: 15m 58s | Max: 40m 19s | Hits:  40%/2630  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 59m 42s | Avg: 19m 54s | Max: 20m 46s
      🟩 nvcc12.5           Pass: 100%/92  | Total:  1d 07h | Avg: 20m 25s | Max:  1h 05m | Hits:  36%/14305 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 36m 04s | Avg: 18m 02s | Max: 19m 02s
      🟩 nvcc               Pass: 100%/110 | Total:  1d 12h | Avg: 19m 48s | Max:  1h 05m | Hits:  36%/16935 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  1h 46m | Avg: 17m 46s | Max: 30m 14s
      🟩 Clang10            Pass: 100%/3   | Total:  1h 05m | Avg: 21m 48s | Max: 25m 11s
      🟩 Clang11            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 33s | Max: 21m 58s
      🟩 Clang12            Pass: 100%/4   | Total:  1h 03m | Avg: 15m 46s | Max: 20m 52s
      🟩 Clang13            Pass: 100%/4   | Total:  1h 00m | Avg: 15m 13s | Max: 19m 25s
      🟩 Clang14            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 30s | Max: 20m 39s
      🟩 Clang15            Pass: 100%/4   | Total:  1h 20m | Avg: 20m 11s | Max: 23m 15s
      🟩 Clang16            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 38s | Max: 21m 51s
      🟩 Clang17            Pass: 100%/14  | Total:  4h 40m | Avg: 20m 00s | Max: 46m 19s
      🟩 GCC6               Pass: 100%/2   | Total: 15m 55s | Avg:  7m 57s | Max: 13m 41s
      🟩 GCC7               Pass: 100%/6   | Total:  2h 05m | Avg: 20m 55s | Max: 40m 19s
      🟩 GCC8               Pass: 100%/6   | Total:  1h 27m | Avg: 14m 37s | Max: 20m 14s
      🟩 GCC9               Pass: 100%/6   | Total:  1h 31m | Avg: 15m 13s | Max: 21m 10s
      🟩 GCC10              Pass: 100%/4   | Total:  1h 18m | Avg: 19m 39s | Max: 21m 48s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 42m | Avg: 14m 39s | Max: 20m 46s
      🟩 GCC12              Pass: 100%/4   | Total:  1h 18m | Avg: 19m 33s | Max: 21m 53s
      🟩 GCC13              Pass: 100%/21  | Total:  8h 02m | Avg: 22m 58s | Max:  1h 05m
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  1h 04m | Avg: 21m 33s | Max: 23m 06s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 29m 57s | Avg: 29m 57s | Max: 29m 57s | Hits:  40%/2630  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 52s | Max: 33m 44s | Hits:  40%/5622  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 41m | Avg: 33m 54s | Max: 40m 20s | Hits:  33%/8683  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/47  | Total: 14h 51m | Avg: 18m 58s | Max: 46m 19s
      🟩 GCC                Pass: 100%/56  | Total: 17h 42m | Avg: 18m 58s | Max:  1h 05m
      🟩 Intel              Pass: 100%/3   | Total:  1h 04m | Avg: 21m 33s | Max: 23m 06s
      🟩 MSVC               Pass: 100%/6   | Total:  3h 15m | Avg: 32m 34s | Max: 40m 20s | Hits:  36%/16935 
    🟩 gpu
      🟩 v100               Pass: 100%/112 | Total:  1d 12h | Avg: 19m 46s | Max:  1h 05m | Hits:  36%/16935 
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 05h | Avg: 17m 52s | Max: 40m 20s | Hits:  36%/16935 
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 55m | Avg: 28m 51s | Max: 36m 11s
      🟩 Test               Pass: 100%/8   | Total:  5h 27m | Avg: 40m 58s | Max:  1h 05m
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 54s | Avg:  1m 54s | Max:  1m 54s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 59m 42s | Avg: 19m 54s | Max: 20m 46s
      🟩 90a                Pass: 100%/4   | Total: 14m 00s | Avg:  3m 30s | Max:  3m 55s
    🟩 std
      🟩 11                 Pass: 100%/29  | Total:  9h 13m | Avg: 19m 05s | Max: 40m 19s
      🟩 14                 Pass: 100%/32  | Total: 10h 15m | Avg: 19m 14s | Max: 46m 19s | Hits:  40%/8092  
      🟩 17                 Pass: 100%/31  | Total: 10h 22m | Avg: 20m 05s | Max: 58m 35s | Hits:  30%/5782  
      🟩 20                 Pass: 100%/19  | Total:  7h 00m | Avg: 22m 06s | Max:  1h 05m | Hits:  38%/3061  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 37s | Avg: 11m 37s | Max: 11m 37s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 417)

# Runner
305 linux-amd64-cpu16
61 linux-amd64-gpu-v100-latest-1
28 linux-arm64-cpu16
23 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch from c301037 to 853f325 Compare August 30, 2024 06:43
Contributor

🟩 CI finished in 1h 05m: Pass: 100%/55 | Total: 2h 57m | Avg: 3m 13s | Max: 12m 10s | Hits: 69%/126
  • 🟩 cudax: Pass: 100%/54 | Total: 2h 45m | Avg: 3m 03s | Max: 9m 48s | Hits: 69%/126

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  2h 36m | Avg:  3m 07s | Max:  9m 48s | Hits:  69%/126   
      🟩 arm64              Pass: 100%/4   | Total:  9m 22s | Avg:  2m 20s | Max:  2m 33s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 10m | Avg:  3m 04s | Max:  9m 24s | Hits:  68%/63    
      🟩 12.5               Pass: 100%/31  | Total:  1h 34m | Avg:  3m 03s | Max:  9m 48s | Hits:  71%/63    
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 10m | Avg:  3m 04s | Max:  9m 24s | Hits:  68%/63    
      🟩 nvcc12.5           Pass: 100%/31  | Total:  1h 34m | Avg:  3m 03s | Max:  9m 48s | Hits:  71%/63    
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  2h 45m | Avg:  3m 03s | Max:  9m 48s | Hits:  69%/126   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 03s | Avg:  2m 31s | Max:  2m 32s
      🟩 Clang10            Pass: 100%/2   | Total:  5m 03s | Avg:  2m 31s | Max:  2m 36s
      🟩 Clang11            Pass: 100%/4   | Total: 10m 23s | Avg:  2m 35s | Max:  2m 48s
      🟩 Clang12            Pass: 100%/4   | Total: 10m 00s | Avg:  2m 30s | Max:  2m 47s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 10s | Avg:  2m 32s | Max:  2m 49s
      🟩 Clang14            Pass: 100%/6   | Total: 18m 46s | Avg:  3m 07s | Max:  4m 28s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 13s | Avg:  2m 36s | Max:  2m 48s
      🟩 Clang16            Pass: 100%/6   | Total: 19m 50s | Avg:  3m 18s | Max:  5m 29s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 46s | Avg:  2m 23s | Max:  2m 34s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 31s | Avg:  2m 22s | Max:  2m 31s
      🟩 GCC11              Pass: 100%/4   | Total:  9m 32s | Avg:  2m 23s | Max:  2m 52s
      🟩 GCC12              Pass: 100%/12  | Total: 38m 06s | Avg:  3m 10s | Max:  5m 40s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 24s | Avg:  9m 24s | Max:  9m 24s | Hits:  68%/63    
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 48s | Avg:  9m 48s | Max:  9m 48s | Hits:  71%/63    
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 24m | Avg:  2m 48s | Max:  5m 29s
      🟩 GCC                Pass: 100%/22  | Total:  1h 01m | Avg:  2m 48s | Max:  5m 40s
      🟩 MSVC               Pass: 100%/2   | Total: 19m 12s | Avg:  9m 36s | Max:  9m 48s | Hits:  69%/126   
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  2h 45m | Avg:  3m 03s | Max:  9m 48s | Hits:  69%/126   
    🟩 jobs
      🟩 Build              Pass: 100%/46  | Total:  2h 08m | Avg:  2m 47s | Max:  9m 48s | Hits:  69%/126   
      🟩 Test               Pass: 100%/8   | Total: 37m 19s | Avg:  4m 39s | Max:  5m 40s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
      🟩 90a                Pass: 100%/1   | Total:  2m 21s | Avg:  2m 21s | Max:  2m 21s
    🟩 std
      🟩 17                 Pass: 100%/30  | Total:  1h 23m | Avg:  2m 47s | Max:  5m 40s
      🟩 20                 Pass: 100%/24  | Total:  1h 22m | Avg:  3m 25s | Max:  9m 48s | Hits:  69%/126   
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 55)

# Runner
40 linux-amd64-cpu16
9 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch from 853f325 to d4d8247 Compare August 30, 2024 09:28
Contributor

🟩 CI finished in 2h 49m: Pass: 100%/55 | Total: 3h 00m | Avg: 3m 16s | Max: 11m 49s | Hits: 71%/126
  • 🟩 cudax: Pass: 100%/54 | Total: 2h 48m | Avg: 3m 07s | Max: 10m 47s | Hits: 71%/126

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  2h 38m | Avg:  3m 10s | Max: 10m 47s | Hits:  71%/126   
      🟩 arm64              Pass: 100%/4   | Total:  9m 40s | Avg:  2m 25s | Max:  2m 42s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 13m | Avg:  3m 10s | Max:  9m 05s | Hits:  71%/63    
      🟩 12.5               Pass: 100%/31  | Total:  1h 35m | Avg:  3m 04s | Max: 10m 47s | Hits:  71%/63    
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 13m | Avg:  3m 10s | Max:  9m 05s | Hits:  71%/63    
      🟩 nvcc12.5           Pass: 100%/31  | Total:  1h 35m | Avg:  3m 04s | Max: 10m 47s | Hits:  71%/63    
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  2h 48m | Avg:  3m 07s | Max: 10m 47s | Hits:  71%/126   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 35s | Avg:  2m 47s | Max:  3m 05s
      🟩 Clang10            Pass: 100%/2   | Total:  5m 01s | Avg:  2m 30s | Max:  2m 35s
      🟩 Clang11            Pass: 100%/4   | Total: 10m 05s | Avg:  2m 31s | Max:  2m 58s
      🟩 Clang12            Pass: 100%/4   | Total:  9m 56s | Avg:  2m 29s | Max:  2m 44s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 06s | Avg:  2m 31s | Max:  2m 38s
      🟩 Clang14            Pass: 100%/6   | Total: 19m 01s | Avg:  3m 10s | Max:  5m 11s
      🟩 Clang15            Pass: 100%/2   | Total:  4m 57s | Avg:  2m 28s | Max:  2m 30s
      🟩 Clang16            Pass: 100%/6   | Total: 20m 04s | Avg:  3m 20s | Max:  5m 11s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 38s | Avg:  2m 19s | Max:  2m 25s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 38s | Avg:  2m 24s | Max:  2m 26s
      🟩 GCC11              Pass: 100%/4   | Total: 10m 52s | Avg:  2m 43s | Max:  3m 40s
      🟩 GCC12              Pass: 100%/12  | Total: 38m 41s | Avg:  3m 13s | Max:  5m 53s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 05s | Avg:  9m 05s | Max:  9m 05s | Hits:  71%/63    
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 47s | Avg: 10m 47s | Max: 10m 47s | Hits:  71%/63    
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 24m | Avg:  2m 49s | Max:  5m 11s
      🟩 GCC                Pass: 100%/22  | Total:  1h 03m | Avg:  2m 54s | Max:  5m 53s
      🟩 MSVC               Pass: 100%/2   | Total: 19m 52s | Avg:  9m 56s | Max: 10m 47s | Hits:  71%/126   
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  2h 48m | Avg:  3m 07s | Max: 10m 47s | Hits:  71%/126   
    🟩 jobs
      🟩 Build              Pass: 100%/46  | Total:  2h 09m | Avg:  2m 48s | Max: 10m 47s | Hits:  71%/126   
      🟩 Test               Pass: 100%/8   | Total: 39m 07s | Avg:  4m 53s | Max:  5m 53s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
      🟩 90a                Pass: 100%/1   | Total:  2m 21s | Avg:  2m 21s | Max:  2m 21s
    🟩 std
      🟩 17                 Pass: 100%/30  | Total:  1h 22m | Avg:  2m 45s | Max:  5m 53s
      🟩 20                 Pass: 100%/24  | Total:  1h 25m | Avg:  3m 34s | Max: 10m 47s | Hits:  71%/126   
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 49s | Avg: 11m 49s | Max: 11m 49s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 55)

# Runner
40 linux-amd64-cpu16
9 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch from d4d8247 to f5b852d Compare September 9, 2024 12:02
Contributor

github-actions bot commented Sep 9, 2024

🟩 CI finished in 1h 33m: Pass: 100%/54 | Total: 2h 39m | Avg: 2m 56s | Max: 9m 08s | Hits: 80%/206
  • 🟩 cudax: Pass: 100%/54 | Total: 2h 39m | Avg: 2m 56s | Max: 9m 08s | Hits: 80%/206

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  2h 29m | Avg:  2m 59s | Max:  9m 08s | Hits:  80%/206   
      🟩 arm64              Pass: 100%/4   | Total:  9m 13s | Avg:  2m 18s | Max:  2m 29s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 08m | Avg:  2m 57s | Max:  7m 29s | Hits:  80%/103   
      🟩 12.5               Pass: 100%/31  | Total:  1h 31m | Avg:  2m 56s | Max:  9m 08s | Hits:  80%/103   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 08m | Avg:  2m 57s | Max:  7m 29s | Hits:  80%/103   
      🟩 nvcc12.5           Pass: 100%/31  | Total:  1h 31m | Avg:  2m 56s | Max:  9m 08s | Hits:  80%/103   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  2h 39m | Avg:  2m 56s | Max:  9m 08s | Hits:  80%/206   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 13s | Avg:  2m 36s | Max:  2m 37s
      🟩 Clang10            Pass: 100%/2   | Total:  5m 04s | Avg:  2m 32s | Max:  2m 39s
      🟩 Clang11            Pass: 100%/4   | Total: 10m 10s | Avg:  2m 32s | Max:  2m 43s
      🟩 Clang12            Pass: 100%/4   | Total: 10m 19s | Avg:  2m 34s | Max:  2m 49s
      🟩 Clang13            Pass: 100%/4   | Total:  9m 41s | Avg:  2m 25s | Max:  2m 35s
      🟩 Clang14            Pass: 100%/6   | Total: 18m 08s | Avg:  3m 01s | Max:  4m 28s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 13s | Avg:  2m 36s | Max:  2m 44s
      🟩 Clang16            Pass: 100%/6   | Total: 18m 33s | Avg:  3m 05s | Max:  4m 19s
      🟩 GCC9               Pass: 100%/2   | Total:  5m 13s | Avg:  2m 36s | Max:  2m 45s
      🟩 GCC10              Pass: 100%/4   | Total: 10m 14s | Avg:  2m 33s | Max:  3m 01s
      🟩 GCC11              Pass: 100%/4   | Total:  9m 57s | Avg:  2m 29s | Max:  2m 42s
      🟩 GCC12              Pass: 100%/12  | Total: 34m 47s | Avg:  2m 53s | Max:  4m 32s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  7m 29s | Avg:  7m 29s | Max:  7m 29s | Hits:  80%/103   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 08s | Avg:  9m 08s | Max:  9m 08s | Hits:  80%/103   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 22m | Avg:  2m 44s | Max:  4m 28s
      🟩 GCC                Pass: 100%/22  | Total:  1h 00m | Avg:  2m 44s | Max:  4m 32s
      🟩 MSVC               Pass: 100%/2   | Total: 16m 37s | Avg:  8m 18s | Max:  9m 08s | Hits:  80%/206   
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  2h 39m | Avg:  2m 56s | Max:  9m 08s | Hits:  80%/206   
    🟩 jobs
      🟩 Build              Pass: 100%/46  | Total:  2h 06m | Avg:  2m 44s | Max:  9m 08s | Hits:  80%/206   
      🟩 Test               Pass: 100%/8   | Total: 33m 06s | Avg:  4m 08s | Max:  4m 32s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
      🟩 90a                Pass: 100%/1   | Total:  2m 04s | Avg:  2m 04s | Max:  2m 04s
    🟩 std
      🟩 17                 Pass: 100%/30  | Total:  1h 21m | Avg:  2m 42s | Max:  4m 28s
      🟩 20                 Pass: 100%/24  | Total:  1h 17m | Avg:  3m 14s | Max:  9m 08s | Hits:  80%/206   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 54)

# Runner
40 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco force-pushed the uninitialized_async_buffer branch from f5b852d to b6d93e8 Compare September 10, 2024 14:02
Contributor

🟩 CI finished in 6h 43m: Pass: 100%/58 | Total: 2h 56m | Avg: 3m 02s | Max: 10m 05s | Hits: 82%/206
  • 🟩 cudax: Pass: 100%/58 | Total: 2h 56m | Avg: 3m 02s | Max: 10m 05s | Hits: 82%/206

    🟩 cpu
      🟩 amd64              Pass: 100%/54  | Total:  2h 45m | Avg:  3m 04s | Max: 10m 05s | Hits:  82%/206   
      🟩 arm64              Pass: 100%/4   | Total: 10m 48s | Avg:  2m 42s | Max:  2m 50s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 13m | Avg:  3m 11s | Max: 10m 05s | Hits:  82%/103   
      🟩 12.6               Pass: 100%/35  | Total:  1h 43m | Avg:  2m 57s | Max:  7m 52s | Hits:  82%/103   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 13m | Avg:  3m 11s | Max: 10m 05s | Hits:  82%/103   
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1h 43m | Avg:  2m 57s | Max:  7m 52s | Hits:  82%/103   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/58  | Total:  2h 56m | Avg:  3m 02s | Max: 10m 05s | Hits:  82%/206   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 19s | Avg:  2m 39s | Max:  2m 44s
      🟩 Clang10            Pass: 100%/2   | Total:  4m 58s | Avg:  2m 29s | Max:  2m 30s
      🟩 Clang11            Pass: 100%/4   | Total: 10m 36s | Avg:  2m 39s | Max:  3m 20s
      🟩 Clang12            Pass: 100%/4   | Total:  9m 59s | Avg:  2m 29s | Max:  2m 36s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 57s | Avg:  2m 44s | Max:  3m 14s
      🟩 Clang14            Pass: 100%/6   | Total: 17m 09s | Avg:  2m 51s | Max:  3m 50s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 22s | Avg:  2m 41s | Max:  2m 53s
      🟩 Clang16            Pass: 100%/4   | Total: 10m 41s | Avg:  2m 40s | Max:  2m 50s
      🟩 Clang17            Pass: 100%/2   | Total:  5m 19s | Avg:  2m 39s | Max:  2m 47s
      🟩 Clang18            Pass: 100%/4   | Total: 13m 56s | Avg:  3m 29s | Max:  4m 30s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 53s | Avg:  2m 26s | Max:  2m 38s
      🟩 GCC10              Pass: 100%/4   | Total: 10m 35s | Avg:  2m 38s | Max:  2m 58s
      🟩 GCC11              Pass: 100%/4   | Total: 11m 24s | Avg:  2m 51s | Max:  3m 13s
      🟩 GCC12              Pass: 100%/9   | Total: 29m 44s | Avg:  3m 18s | Max:  4m 57s
      🟩 GCC13              Pass: 100%/3   | Total:  7m 45s | Avg:  2m 35s | Max:  2m 45s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 05s | Avg: 10m 05s | Max: 10m 05s | Hits:  82%/103   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  7m 52s | Avg:  7m 52s | Max:  7m 52s | Hits:  82%/103   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/34  | Total:  1h 34m | Avg:  2m 46s | Max:  4m 30s
      🟩 GCC                Pass: 100%/22  | Total:  1h 04m | Avg:  2m 55s | Max:  4m 57s
      🟩 MSVC               Pass: 100%/2   | Total: 17m 57s | Avg:  8m 58s | Max: 10m 05s | Hits:  82%/206   
    🟩 gpu
      🟩 v100               Pass: 100%/58  | Total:  2h 56m | Avg:  3m 02s | Max: 10m 05s | Hits:  82%/206   
    🟩 jobs
      🟩 Build              Pass: 100%/50  | Total:  2h 22m | Avg:  2m 51s | Max: 10m 05s | Hits:  82%/206   
      🟩 Test               Pass: 100%/8   | Total: 33m 42s | Avg:  4m 12s | Max:  4m 57s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 24s | Avg:  2m 24s | Max:  2m 24s
      🟩 90a                Pass: 100%/1   | Total:  2m 33s | Avg:  2m 33s | Max:  2m 33s
    🟩 std
      🟩 17                 Pass: 100%/32  | Total:  1h 28m | Avg:  2m 46s | Max:  4m 57s
      🟩 20                 Pass: 100%/26  | Total:  1h 27m | Avg:  3m 22s | Max: 10m 05s | Hits:  82%/206   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 58)

# Runner
44 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

Contributor

@harrism harrism left a comment

Looks good, a few more doc fixes.

//! @brief Causes the buffer to be treated as a span when passed to cudax::launch.
//! @pre The buffer must have the cuda::mr::device_accessible property.
_CCCL_NODISCARD_FRIEND _CUDA_VSTD::span<_Tp>
__cudax_launch_transform(::cuda::stream_ref, uninitialized_async_buffer& __self) noexcept
Collaborator Author

I am unsure: if the streams are different, do we want to synchronize here or in a central place?

Contributor

Seems like this could lead to unnecessary extra synchronization. You don't know if the stream the buffer was last allocated/written on needs to be synchronized. It may have been already.

E.g.

auto buf = buffer(size, stream_a);
launch(kernel, stream_a, buf); // initialize buf with computation in kernel (no sync)
stream_a.wait(); // sync stream_a

// Launch 4 instances of kernel to operate on 4 different buffers on 4 streams.
// All kernels read `buf` as an input.
// The suggested sync in `__cudax_launch_transform()` would synchronize all 4 streams
// before launching, even though no streams need to be synced in this loop.
for (int i = 0; i < 4; i++) {
  launch(kernel, streams[i], buffers[i], buf);
}
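
To put a number on the concern above, here is a minimal host-only sketch. It does not use the real cudax API: `mock_stream`, `mock_buffer`, `launch_with_implicit_sync`, `launch_without_sync` and `count_waits` are hypothetical names invented for illustration, and a "wait" is just a counter increment, not a real stream dependency.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CUDA stream: it only counts the waits enqueued on it.
struct mock_stream {
  int id;
  int waits;
};

// Hypothetical stand-in for the async buffer: it remembers the stream it is ordered on.
struct mock_buffer {
  const mock_stream* owner;
};

// Policy under discussion: the launch transform always makes the launch stream
// wait on the buffer's stream when the two differ.
inline void launch_with_implicit_sync(mock_stream& launch_stream, const mock_buffer& input) {
  if (&launch_stream != input.owner) {
    ++launch_stream.waits;
  }
  // A real implementation would enqueue the kernel here.
}

// Alternative: trust the caller, who has already synchronized the producer stream.
inline void launch_without_sync(mock_stream&, const mock_buffer&) {
  // A real implementation would enqueue the kernel here.
}

// Counts the waits enqueued when four consumer streams read one shared input.
inline int count_waits(bool implicit_sync) {
  mock_stream stream_a = {0, 0};
  mock_buffer buf = {&stream_a};
  mock_stream streams[4] = {{1, 0}, {2, 0}, {3, 0}, {4, 0}};

  int total = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    if (implicit_sync) {
      launch_with_implicit_sync(streams[i], buf);
    } else {
      launch_without_sync(streams[i], buf);
    }
    total += streams[i].waits;
  }
  return total;
}
```

In this model the implicit policy enqueues one wait per consumer stream, even though `stream_a` was already synchronized by the caller. Whether that cost matters in practice depends on how the real synchronization would be implemented.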

Collaborator Author

Yeah, but that is the same discussion as the one about lifetimes. We don't know whether a resource might go out of scope, so we need to do the safe thing.

Contributor

Do you have a good example where the user of the buffer is unable to synchronize themselves? If one chooses to use an async buffer, they should be aware that they may need to do some synchronization. If we assume that the user doesn't know what they are doing then we don't give them the ability to hit SOL.

Contributor

I think the protocol for __cudax_launch_transform is to return a wrapper here that can convert to a span. Then synchronize here and synchronize back in the destructor of that wrapper.

We also definitely need an opt-out of the synchronization, but I am not sure what it would look like. Perhaps something like cudax::skip_sync(buffer). We should try to come up with something generic for other similar cases.
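
The wrapper idea can be sketched roughly as follows. This is a host-only mock, not the cudax implementation: `mock_stream`, `mock_buffer`, `view`, `synced_view`, `launch_transform`, `skip_sync_t` and `run_example` are hypothetical names, and `skip_sync` only mirrors the name floated above. Waits are recorded in a log instead of touching a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stream stand-in: records cross-stream waits in a shared log.
struct mock_stream {
  std::string name;
  std::vector<std::string>* log;
  void wait_for(const mock_stream& other) const {
    log->push_back(name + " waits for " + other.name);
  }
};

// Minimal span-like view, used so the sketch does not require C++20 std::span.
template <class T>
struct view {
  T* data;
  std::size_t size;
};

// Hypothetical buffer stand-in: storage plus the stream it is ordered on.
template <class T>
struct mock_buffer {
  T* data;
  std::size_t size;
  const mock_stream* stream;
};

// Wrapper returned by the transform: syncs on construction, syncs back on destruction.
template <class T>
class synced_view {
public:
  synced_view(const mock_stream& launch_stream, mock_buffer<T>& buf)
      : launch_(&launch_stream), buf_(&buf) {
    assert(buf.stream != nullptr);
    launch_->wait_for(*buf_->stream); // launch stream waits for the buffer's stream
  }
  synced_view(synced_view&& other) noexcept : launch_(other.launch_), buf_(other.buf_) {
    other.buf_ = nullptr; // the moved-from wrapper must not sync back
  }
  synced_view(const synced_view&) = delete;
  synced_view& operator=(const synced_view&) = delete;
  ~synced_view() {
    if (buf_ != nullptr) {
      buf_->stream->wait_for(*launch_); // buffer's stream waits for the launch stream
    }
  }
  operator view<T>() const { return view<T>{buf_->data, buf_->size}; }

private:
  const mock_stream* launch_;
  mock_buffer<T>* buf_;
};

// Opt-out tag: wrapping a buffer in it requests no synchronization at all.
template <class T>
struct skip_sync_t {
  mock_buffer<T>* buf;
};

template <class T>
skip_sync_t<T> skip_sync(mock_buffer<T>& buf) {
  return skip_sync_t<T>{&buf};
}

template <class T>
synced_view<T> launch_transform(const mock_stream& s, mock_buffer<T>& buf) {
  return synced_view<T>(s, buf);
}

template <class T>
view<T> launch_transform(const mock_stream&, skip_sync_t<T> tag) {
  return view<T>{tag.buf->data, tag.buf->size};
}

// Runs one mock launch on stream "b" with a buffer ordered on stream "a".
inline std::vector<std::string> run_example(bool opt_out) {
  std::vector<std::string> log;
  mock_stream a{"a", &log};
  mock_stream b{"b", &log};
  int storage[4] = {0, 0, 0, 0};
  mock_buffer<int> buf{storage, 4, &a};
  if (opt_out) {
    view<int> v = launch_transform(b, skip_sync(buf));
    (void) v;
    log.push_back("launch on b");
  } else {
    synced_view<int> w = launch_transform(b, buf);
    view<int> v = w;
    (void) v;
    log.push_back("launch on b");
  } // w is destroyed here and syncs back
  return log;
}
```

With the default path the log contains a wait before the launch and a wait back after it. With the opt-out path it contains only the launch, which is the behavior the opt-out is meant to allow.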

Contributor

🟨 CI finished in 2h 59m: Pass: 96%/58 | Total: 2h 43m | Avg: 2m 49s | Max: 6m 56s
  • 🟨 cudax: Pass: 96%/58 | Total: 2h 43m | Avg: 2m 49s | Max: 6m 56s

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  96%/54  | Total:  2h 33m | Avg:  2m 50s | Max:  6m 56s
      🟩 arm64              Pass: 100%/4   | Total: 10m 31s | Avg:  2m 37s | Max:  3m 37s
    🚨 cxx_family: MSVC 🚨
      🟩 Clang              Pass: 100%/34  | Total:  1h 33m | Avg:  2m 44s | Max:  4m 23s
      🟩 GCC                Pass: 100%/22  | Total: 57m 06s | Avg:  2m 35s | Max:  4m 29s
      🔥 MSVC               Pass:   0%/2   | Total: 13m 37s | Avg:  6m 48s | Max:  6m 56s
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  96%/50  | Total:  2h 11m | Avg:  2m 37s | Max:  6m 56s
      🟩 Test               Pass: 100%/8   | Total: 32m 29s | Avg:  4m 03s | Max:  4m 29s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/32  | Total:  1h 22m | Avg:  2m 34s | Max:  3m 57s
      🔍 20                 Pass:  92%/26  | Total:  1h 21m | Avg:  3m 07s | Max:  6m 56s
    🟨 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 11s | Avg:  2m 35s | Max:  2m 39s
      🟩 Clang10            Pass: 100%/2   | Total:  4m 49s | Avg:  2m 24s | Max:  2m 30s
      🟩 Clang11            Pass: 100%/4   | Total:  9m 51s | Avg:  2m 27s | Max:  2m 52s
      🟩 Clang12            Pass: 100%/4   | Total:  9m 39s | Avg:  2m 24s | Max:  2m 35s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 27s | Avg:  2m 36s | Max:  2m 47s
      🟩 Clang14            Pass: 100%/6   | Total: 18m 27s | Avg:  3m 04s | Max:  4m 23s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 04s | Avg:  2m 32s | Max:  2m 33s
      🟩 Clang16            Pass: 100%/4   | Total: 10m 54s | Avg:  2m 43s | Max:  3m 37s
      🟩 Clang17            Pass: 100%/2   | Total:  5m 08s | Avg:  2m 34s | Max:  2m 39s
      🟩 Clang18            Pass: 100%/4   | Total: 13m 33s | Avg:  3m 23s | Max:  4m 20s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 25s | Avg:  2m 12s | Max:  2m 16s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 13s | Avg:  2m 18s | Max:  2m 33s
      🟩 GCC11              Pass: 100%/4   | Total:  9m 19s | Avg:  2m 19s | Max:  2m 27s
      🟩 GCC12              Pass: 100%/9   | Total: 27m 18s | Avg:  3m 02s | Max:  4m 29s
      🟩 GCC13              Pass: 100%/3   | Total:  6m 51s | Avg:  2m 17s | Max:  2m 21s
      🟥 MSVC14.36          Pass:   0%/1   | Total:  6m 41s | Avg:  6m 41s | Max:  6m 41s
      🟥 MSVC14.39          Pass:   0%/1   | Total:  6m 56s | Avg:  6m 56s | Max:  6m 56s
    🟨 cudacxx_family
      🟨 nvcc               Pass:  96%/58  | Total:  2h 43m | Avg:  2m 49s | Max:  6m 56s
    🟨 gpu
      🟨 v100               Pass:  96%/58  | Total:  2h 43m | Avg:  2m 49s | Max:  6m 56s
    🟨 ctk
      🟨 12.0               Pass:  95%/23  | Total:  1h 05m | Avg:  2m 50s | Max:  6m 41s
      🟨 12.6               Pass:  97%/35  | Total:  1h 38m | Avg:  2m 48s | Max:  6m 56s
    🟨 cudacxx
      🟨 nvcc12.0           Pass:  95%/23  | Total:  1h 05m | Avg:  2m 50s | Max:  6m 41s
      🟨 nvcc12.6           Pass:  97%/35  | Total:  1h 38m | Avg:  2m 48s | Max:  6m 56s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  1m 58s | Avg:  1m 58s | Max:  1m 58s
      🟩 90a                Pass: 100%/1   | Total:  2m 17s | Avg:  2m 17s | Max:  2m 17s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 58)

# Runner
44 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

Contributor

🟩 CI finished in 52m 05s: Pass: 100%/58 | Total: 2h 52m | Avg: 2m 57s | Max: 8m 19s | Hits: 80%/208
  • 🟩 cudax: Pass: 100%/58 | Total: 2h 52m | Avg: 2m 57s | Max: 8m 19s | Hits: 80%/208

    🟩 cpu
      🟩 amd64              Pass: 100%/54  | Total:  2h 42m | Avg:  3m 00s | Max:  8m 19s | Hits:  80%/208   
      🟩 arm64              Pass: 100%/4   | Total:  9m 34s | Avg:  2m 23s | Max:  2m 25s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 07m | Avg:  2m 57s | Max:  7m 28s | Hits:  80%/104   
      🟩 12.6               Pass: 100%/35  | Total:  1h 44m | Avg:  2m 58s | Max:  8m 19s | Hits:  80%/104   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 07m | Avg:  2m 57s | Max:  7m 28s | Hits:  80%/104   
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1h 44m | Avg:  2m 58s | Max:  8m 19s | Hits:  80%/104   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/58  | Total:  2h 52m | Avg:  2m 57s | Max:  8m 19s | Hits:  80%/208   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 00s | Avg:  2m 30s | Max:  2m 31s
      🟩 Clang10            Pass: 100%/2   | Total:  5m 20s | Avg:  2m 40s | Max:  2m 42s
      🟩 Clang11            Pass: 100%/4   | Total: 10m 09s | Avg:  2m 32s | Max:  2m 44s
      🟩 Clang12            Pass: 100%/4   | Total: 10m 15s | Avg:  2m 33s | Max:  2m 55s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 06s | Avg:  2m 31s | Max:  2m 48s
      🟩 Clang14            Pass: 100%/6   | Total: 18m 18s | Avg:  3m 03s | Max:  4m 00s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 10s | Avg:  2m 35s | Max:  2m 38s
      🟩 Clang16            Pass: 100%/4   | Total: 10m 22s | Avg:  2m 35s | Max:  3m 06s
      🟩 Clang17            Pass: 100%/2   | Total:  5m 37s | Avg:  2m 48s | Max:  3m 08s
      🟩 Clang18            Pass: 100%/4   | Total: 13m 44s | Avg:  3m 26s | Max:  4m 25s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 51s | Avg:  2m 25s | Max:  2m 32s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 46s | Avg:  2m 26s | Max:  2m 52s
      🟩 GCC11              Pass: 100%/4   | Total: 10m 45s | Avg:  2m 41s | Max:  3m 26s
      🟩 GCC12              Pass: 100%/9   | Total: 29m 52s | Avg:  3m 19s | Max:  6m 40s
      🟩 GCC13              Pass: 100%/3   | Total:  7m 00s | Avg:  2m 20s | Max:  2m 23s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  7m 28s | Avg:  7m 28s | Max:  7m 28s | Hits:  80%/104   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  8m 19s | Avg:  8m 19s | Max:  8m 19s | Hits:  80%/104   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/34  | Total:  1h 34m | Avg:  2m 45s | Max:  4m 25s
      🟩 GCC                Pass: 100%/22  | Total:  1h 02m | Avg:  2m 49s | Max:  6m 40s
      🟩 MSVC               Pass: 100%/2   | Total: 15m 47s | Avg:  7m 53s | Max:  8m 19s | Hits:  80%/208   
    🟩 gpu
      🟩 v100               Pass: 100%/58  | Total:  2h 52m | Avg:  2m 57s | Max:  8m 19s | Hits:  80%/208   
    🟩 jobs
      🟩 Build              Pass: 100%/50  | Total:  2h 17m | Avg:  2m 45s | Max:  8m 19s | Hits:  80%/208   
      🟩 Test               Pass: 100%/8   | Total: 34m 27s | Avg:  4m 18s | Max:  6m 40s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 04s | Avg:  2m 04s | Max:  2m 04s
      🟩 90a                Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
    🟩 std
      🟩 17                 Pass: 100%/32  | Total:  1h 27m | Avg:  2m 44s | Max:  4m 25s
      🟩 20                 Pass: 100%/26  | Total:  1h 24m | Avg:  3m 14s | Max:  8m 19s | Hits:  80%/208   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 58)

# Runner
44 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

miscco and others added 7 commits September 17, 2024 09:14
This uninitialized buffer provides a stream ordered allocation of N elements of type T utilizing a cuda::mr::async_resource to allocate the storage.

The buffer takes care of alignment and deallocation of the storage. The user is required to ensure that the lifetime of the memory resource exceeds the lifetime of the buffer.
Co-authored-by: Mark Harris <[email protected]>
@miscco miscco force-pushed the uninitialized_async_buffer branch from a92af5d to 8779ce6 Compare September 17, 2024 07:22

🟩 CI finished in 4h 29m: Pass: 100%/58 | Total: 2h 38m | Avg: 2m 43s | Max: 13m 04s | Hits: 84%/208
  • 🟩 cudax: Pass: 100%/58 | Total: 2h 38m | Avg: 2m 43s | Max: 13m 04s | Hits: 84%/208

    🟩 cpu
      🟩 amd64              Pass: 100%/54  | Total:  2h 30m | Avg:  2m 47s | Max: 13m 04s | Hits:  84%/208   
      🟩 arm64              Pass: 100%/4   | Total:  7m 20s | Avg:  1m 50s | Max:  1m 56s
    🟩 ctk
      🟩 12.0               Pass: 100%/23  | Total:  1h 02m | Avg:  2m 44s | Max: 10m 50s | Hits:  84%/104   
      🟩 12.6               Pass: 100%/35  | Total:  1h 35m | Avg:  2m 43s | Max: 13m 04s | Hits:  84%/104   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/23  | Total:  1h 02m | Avg:  2m 44s | Max: 10m 50s | Hits:  84%/104   
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1h 35m | Avg:  2m 43s | Max: 13m 04s | Hits:  84%/104   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/58  | Total:  2h 38m | Avg:  2m 43s | Max: 13m 04s | Hits:  84%/208   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  4m 39s | Avg:  2m 19s | Max:  2m 30s
      🟩 Clang10            Pass: 100%/2   | Total:  4m 21s | Avg:  2m 10s | Max:  2m 13s
      🟩 Clang11            Pass: 100%/4   | Total:  9m 05s | Avg:  2m 16s | Max:  2m 23s
      🟩 Clang12            Pass: 100%/4   | Total:  8m 31s | Avg:  2m 07s | Max:  2m 20s
      🟩 Clang13            Pass: 100%/4   | Total:  8m 44s | Avg:  2m 11s | Max:  2m 18s
      🟩 Clang14            Pass: 100%/6   | Total: 16m 22s | Avg:  2m 43s | Max:  4m 06s
      🟩 Clang15            Pass: 100%/2   | Total:  4m 28s | Avg:  2m 14s | Max:  2m 18s
      🟩 Clang16            Pass: 100%/4   | Total:  8m 22s | Avg:  2m 05s | Max:  2m 25s
      🟩 Clang17            Pass: 100%/2   | Total:  4m 26s | Avg:  2m 13s | Max:  2m 15s
      🟩 Clang18            Pass: 100%/4   | Total: 13m 08s | Avg:  3m 17s | Max:  4m 25s
      🟩 GCC9               Pass: 100%/2   | Total:  3m 54s | Avg:  1m 57s | Max:  2m 08s
      🟩 GCC10              Pass: 100%/4   | Total:  9m 09s | Avg:  2m 17s | Max:  3m 13s
      🟩 GCC11              Pass: 100%/4   | Total:  7m 40s | Avg:  1m 55s | Max:  2m 08s
      🟩 GCC12              Pass: 100%/9   | Total: 25m 54s | Avg:  2m 52s | Max:  4m 53s
      🟩 GCC13              Pass: 100%/3   | Total:  5m 32s | Avg:  1m 50s | Max:  1m 53s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 50s | Avg: 10m 50s | Max: 10m 50s | Hits:  84%/104   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 13m 04s | Avg: 13m 04s | Max: 13m 04s | Hits:  84%/104   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/34  | Total:  1h 22m | Avg:  2m 24s | Max:  4m 25s
      🟩 GCC                Pass: 100%/22  | Total: 52m 09s | Avg:  2m 22s | Max:  4m 53s
      🟩 MSVC               Pass: 100%/2   | Total: 23m 54s | Avg: 11m 57s | Max: 13m 04s | Hits:  84%/208   
    🟩 gpu
      🟩 v100               Pass: 100%/58  | Total:  2h 38m | Avg:  2m 43s | Max: 13m 04s | Hits:  84%/208   
    🟩 jobs
      🟩 Build              Pass: 100%/50  | Total:  2h 06m | Avg:  2m 31s | Max: 13m 04s | Hits:  84%/208   
      🟩 Test               Pass: 100%/8   | Total: 32m 03s | Avg:  4m 00s | Max:  4m 53s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  1m 55s | Avg:  1m 55s | Max:  1m 55s
      🟩 90a                Pass: 100%/1   | Total:  1m 53s | Avg:  1m 53s | Max:  1m 53s
    🟩 std
      🟩 17                 Pass: 100%/32  | Total:  1h 15m | Avg:  2m 22s | Max:  4m 53s
      🟩 20                 Pass: 100%/26  | Total:  1h 22m | Avg:  3m 10s | Max: 13m 04s | Hits:  84%/208   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 58)

# Runner
44 linux-amd64-cpu16
8 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco merged commit e3c2e2b into NVIDIA:main Sep 17, 2024
68 of 72 checks passed
@miscco miscco deleted the uninitialized_async_buffer branch September 17, 2024 17:43
Labels
CUDA Next: Feature intended for the Cuda Next experimental library
feature request: New feature or request.
Projects
Archived in project

5 participants