This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
cudaBindTexture returns cudaErrorInvalidValue if a memory block from a pool allocator is passed. #162
Labels
type: bug: functional
Does not work as intended.
In cub/iterator/tex_ref_input_iterator.cuh, cudaBindTexture returns cudaErrorInvalidValue if d_in points to a memory block from a pool allocator (the allocated block is a sub-region of a much larger block owned by the pool allocator).
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gfaa25560127f9feb99cb5dd6bc4ce2dc
The high-level cudaBindTexture overload declares size_t size = UINT_MAX as a defaulted parameter. BindTexture does not supply a size when calling cudaBindTexture, so the default of UINT_MAX is used. This works when devPtr comes from cudaMalloc, but fails when devPtr comes from a pool allocator, which allocates a much larger chunk up front and hands out sub-regions of it to individual allocation requests. The CUDA documentation does not explicitly state what happens when devPtr + size crosses the boundary of the underlying memory block, but based on experiments, I am guessing that cudaBindTexture truncates size if devPtr + size passes the block boundary.
size / sizeof(T) cannot exceed cudaDeviceProp::maxTexture1DLinear[0], and based on https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities (Table 14), size / sizeof(T) cannot exceed 2^27. UINT_MAX is 2^32 - 1, which is larger than 2^27. So I am assuming that, when size is omitted from the call, cudaBindTexture relies on correctly identifying the end of the memory block pointed to by devPtr, and this does not work for sub-regions handed out by a pool allocator.
TexRefInputIterator's BindTexture has size_t bytes = size_t(-1) as an input parameter but never uses it. I think this size needs to be passed down to TexId's BindTexture (quoted above) and, finally, to cudaBindTexture so that the iterator also works correctly with a pool allocator.
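A minimal sketch of the proposed fix, assuming the legacy texture-reference API. The names below are illustrative placeholders rather than the exact ones in tex_ref_input_iterator.cuh:

```cuda
#include <cuda_runtime.h>

// Illustrative placeholder for the texture reference held by TexId.
texture<float, cudaTextureType1D, cudaReadModeElementType> ref;

// Sketch: forward the caller-supplied byte count to cudaBindTexture instead
// of letting its size parameter default to UINT_MAX.
cudaError_t BindTexture(const void *d_in, size_t bytes)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    // With an explicit size, the runtime no longer has to infer the extent of
    // d_in, so a sub-region handed out by a pool allocator binds correctly.
    return cudaBindTexture(NULL, ref, d_in, desc, bytes);
}
```

The bytes value that TexRefInputIterator's BindTexture already accepts (currently ignored) would be threaded through to this call.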