You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
ncclIbGdrSupport reads out of bounds if none of the files in memory_peers_paths exist because it relies on a null entry at the end which is missing. This can cause a confusing crash on systems on a mainline Linux kernel.
/build/source/build/hipify/src/transport/net_ib.cc:592:12: runtime error: index 3 out of bounds for type 'const char *[3]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /build/source/build/hipify/src/transport/net_ib.cc:592:12
=================================================================
==4070438==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ff82df5b038 at pc 0x7fff7c15a3b5 bp 0x7ff8312e28f0 sp 0x7ff8312e28e8
READ of size 8 at 0x7ff82df5b038 thread T6
#0 0x7fff7c15a3b4 in ncclIbGdrSupport() /build/source/build/hipify/src/transport/net_ib.cc:592:12
#1 0x7fff7c15adfe in ncclIbGetProperties(int, ncclNetProperties_v8_t*) /build/source/build/hipify/src/transport/net_ib.cc:688:7
#2 0x7fff7bf38198 in ncclNetCheckDeviceVersion(ncclComm*, ncclNet_v8_t*, int) /build/source/build/hipify/src/net.cc:501:3
#3 0x7fff7bf388d5 in ncclNetInit(ncclComm*) /build/source/build/hipify/src/net.cc:562:24
#4 0x7fff7bf0b0ad in commAlloc(ncclComm*, ncclComm*, int, int) /build/source/build/hipify/src/init.cc:533:3
#5 0x7fff7bf00395 in ncclCommInitRankFunc(ncclAsyncJob*) /build/source/build/hipify/src/init.cc:2002:5
#6 0x7fff7bee824e in ncclAsyncJobMain(void*) /build/source/build/hipify/src/group.cc:67:17
#7 0x7ffff749f0d4 in asan_thread_start(void*) (/nix/store/7r6z6nb443psc1ghiyjlqmhwkll7wiia-clr-6.3.0/llvm/lib/linux/libclang_rt.asan-x86_64.so+0x9f0d4)
#8 0x7ffff69b0d01 in start_thread (/nix/store/pacbfvpzqz2mksby36awvbcn051zcji3-glibc-2.40-36/lib/libc.so.6+0x90d01) (BuildId: 2de6548b3bd2f2857c3c1d5f85e5e817ce2c4a7e)
#9 0x7ffff6a303ab in __GI___clone3 (/nix/store/pacbfvpzqz2mksby36awvbcn051zcji3-glibc-2.40-36/lib/libc.so.6+0x1103ab) (BuildId: 2de6548b3bd2f2857c3c1d5f85e5e817ce2c4a7e)
Address 0x7ff82df5b038 is located in stack of thread T6 at offset 56 in frame
#0 0x7fff7c159ae7 in ncclIbGdrSupport() /build/source/build/hipify/src/transport/net_ib.cc:568
This frame has 3 object(s):
[32, 56) 'memory_peers_paths' (line 587) <== Memory access at offset 56 overflows this variable
[96, 351) 'strValue' (line 603)
[416, 672) 'buf' (line 613)
LunNova
changed the title
[UB] memory_peers_paths is missing null terminator causing out of bound reads
[Issue]: [UB] memory_peers_paths is missing null terminator causing out of bound reads
Dec 18, 2024
ncclIbGdrSupport
reads out of bounds if none of the files inmemory_peers_paths
exist because it relies on a null entry at the end which is missing. This can cause a confusing crash on systems on a mainline Linux kernel.rccl/src/transport/net_ib.cc
Lines 587 to 601 in 648a58d
Related to #1454
The text was updated successfully, but these errors were encountered: