Commit
[Fix][Relax] Fix top-p/top-k sampling kernel
This PR fixes a typo in the sampling kernel of the top-p/top-k sampling op. Prior to this PR, the kernel had an out-of-bounds global memory access due to an oversight when `sample_indices` was introduced in apache#16675.

The issue was not revealed by the correctness test, whether run directly or through pytest. However, compute-sanitizer from NVIDIA does report the illegal memory access:

```
> compute-sanitizer --tool memcheck --print-limit=5 --launch-timeout 3600 python tests/python/relax/test_frontend_nn_op.py
========= COMPUTE-SANITIZER
========= Invalid __global__ read of size 8 bytes
=========     at 0x4e90 in get_index_from_sorted_kernel
=========     by thread (7,0,0) in block (0,0,0)
=========     Address 0x7fe35ac00238 is out of bounds
=========     and is 9 bytes after the nearest allocation at 0x7fe35ac00200 of size 48 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
...
```
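As a rough illustration of this class of bug (not the actual kernel code, which this message does not show), the Python sketch below assumes a hypothetical layout: a per-row buffer with one entry per batch row, more samples than rows, and `sample_indices` mapping each sample to its row. Indexing the per-row buffer by the sample id instead of `sample_indices[sample_id]` runs past the end of the buffer. Python raises `IndexError` here, whereas an out-of-bounds global read on a GPU can return adjacent memory without faulting, which may be why the tests still passed. The function and variable names are invented for this example.

```python
def gather_row_values(row_values, sample_indices, use_sample_indices=True):
    """Return one value per sample from a per-row buffer.

    Hypothetical names and layout, for illustration only.
    """
    out = []
    for sample_id in range(len(sample_indices)):
        # Correct: map the sample to its batch row first.
        # Buggy variant: treat the sample id itself as the row index.
        row = sample_indices[sample_id] if use_sample_indices else sample_id
        out.append(row_values[row])
    return out


row_values = [0.1, 0.2, 0.3]      # one entry per batch row (3 rows)
sample_indices = [0, 0, 1, 2, 2]  # five samples mapped onto three rows

correct = gather_row_values(row_values, sample_indices)

try:
    gather_row_values(row_values, sample_indices, use_sample_indices=False)
    out_of_bounds = False
except IndexError:
    # Sample ids 3 and 4 have no matching entry in the 3-element buffer.
    out_of_bounds = True
```

The buggy variant only fails once the sample id exceeds the number of rows, so a test with as many rows as samples would not exercise it at all.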