cudaBindTexture returns cudaErrorInvalidValue if a memory block from a pool allocator is passed. #162

seunghwak · 2019-05-09T03:50:40Z

in cub/iterator/tex_ref_input_iterator.cuh

    /// And by unique ID
    template <int UNIQUE_ID>
    struct TexId
    {
...
        /// Bind texture
        static cudaError_t BindTexture(void *d_in, size_t &offset)
        {
            if (d_in)
            {
                cudaChannelFormatDesc tex_desc = cudaCreateChannelDesc<TextureWord>();
                ref.channelDesc = tex_desc;
                return (CubDebug(cudaBindTexture(&offset, ref, d_in)));
            }

            return cudaSuccess;
        }
...
    };

cudaBindTexture returns cudaErrorInvalidValue if d_in points to a memory block from a pool allocator (that is, the allocated block is a sub-region of a much larger block owned by the pool allocator).

template < class T, int dim, enum cudaTextureReadMode readMode >
__host__ cudaError_t cudaBindTexture ( size_t* offset, const texture < T, dim, readMode > & tex, const void* devPtr, size_t size = UINT_MAX )

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gfaa25560127f9feb99cb5dd6bc4ce2dc

cudaBindTexture takes size_t size = UINT_MAX as an input parameter. BindTexture does not pass a size when calling cudaBindTexture, so the default value of UINT_MAX is used. This works when devPtr comes from cudaMalloc but fails when devPtr comes from a pool allocator, which allocates a much bigger chunk first and assigns sub-regions of it to individual allocation calls. The CUDA documentation does not explicitly state what happens when devPtr + size extends past the end of the memory block, but based on experiments, I am guessing that this function truncates size in that case.

size / sizeof(T) cannot exceed cudaDeviceProp::maxTexture1DLinear[0], and based on https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities (Table 14), size / sizeof(T) cannot exceed 2^27. UINT_MAX is 2^32-1, so with the default size, size / sizeof(T) is larger than 2^27 for any element size up to 16 bytes. So, I am assuming that cudaBindTexture relies on properly identifying the end of the memory block pointed to by devPtr when size is omitted in the function call. But this does not work properly for a pool allocator, where the end of the underlying block is far beyond the end of the sub-region actually handed out.
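The reasoning above can be sketched as a small host-only model. This models the hypothesis stated in this issue, not documented driver behavior: it assumes an omitted size (UINT_MAX) is clamped to the bytes remaining in the underlying allocation, and that the bind fails when the resulting element count exceeds the linear-texture limit. The names EffectiveBytes and FitsTexelLimit are made up for illustration.

```cpp
#include <algorithm>
#include <climits>
#include <cstdint>

// Hypothetical model, not driver code: assume the requested size is clamped
// to the number of bytes between devPtr and the end of the underlying block.
std::uint64_t EffectiveBytes(std::uint64_t requested_bytes,
                             std::uint64_t bytes_to_block_end)
{
    return std::min(requested_bytes, bytes_to_block_end);
}

// True if a range of `bytes` holding elements of `elem_size` bytes stays
// within `max_texels` elements (2^27 in the table cited above).
bool FitsTexelLimit(std::uint64_t bytes, std::uint64_t elem_size,
                    std::uint64_t max_texels)
{
    return bytes / elem_size <= max_texels;
}
```

Under this model, a 1 MiB float buffer from cudaMalloc clamps UINT_MAX down to 2^20 bytes (2^18 elements) and fits. The same 1 MiB buffer carved from the start of a 1 GiB pool block clamps only to 2^30 bytes (2^28 elements), which exceeds 2^27. Passing the real size of 2^20 bytes makes the pool case fit again.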

TexRefInputIterator's BindTexture has size_t bytes=size_t(-1) as an input parameter but never uses it. I think bytes needs to be passed to TexId's BindTexture (quoted above) and then on to cudaBindTexture so that this also works with a pool allocator.
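A minimal sketch of that forwarding, written so it compiles without the CUDA toolkit. SketchError stands in for cudaError_t, and the Binder callable is a hypothetical stand-in for a call such as cudaBindTexture(&offset, ref, d_in, bytes). This is an illustration of the idea only and is not necessarily how the eventual fix is written.

```cpp
#include <cstddef>

// Stand-in for cudaError_t so the sketch builds without CUDA headers.
enum SketchError { kSketchSuccess = 0, kSketchInvalidValue = 1 };

// Size-aware variant of TexId::BindTexture. `bind` is a hypothetical callable
// with the shape (size_t *offset, const void *devPtr, size_t size).
template <typename Binder>
SketchError BindTextureWithSize(void *d_in, std::size_t bytes,
                                std::size_t &offset, Binder bind)
{
    if (d_in)
    {
        // Forward the caller's byte count instead of relying on the
        // UINT_MAX default of the underlying bind call.
        return bind(&offset, d_in, bytes);
    }

    // As in the quoted code, a null pointer is not treated as an error.
    return kSketchSuccess;
}
```

The point of the design is that the only party who knows the true extent of a pool sub-allocation is the caller, so the size has to travel down the whole call chain.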

seunghwak · 2019-05-13T17:28:51Z

Also, because cudaBindTexture works only up to 2^27 elements on the currently available architectures, cub::DeviceSpmv::CsrMV returns cudaErrorInvalidValue if ValueT * d_vector_x has more than 2^27 elements, while users may expect it to work up to 2^31 - 1 elements (INT_MAX).
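Until that is addressed, a caller could check the vector length up front. The following is a hedged sketch: VectorFitsLinearTexture and kAssumedMaxTexels are hypothetical names, and the 2^27 default is simply the figure quoted in this issue, so a real program should take the limit from the device properties instead.

```cpp
#include <cstdint>

// Assumed limit taken from the issue text (2^27 elements); on a real device
// this should come from cudaDeviceProp rather than a hard-coded constant.
constexpr std::int64_t kAssumedMaxTexels = std::int64_t{1} << 27;

// Returns true if a dense vector of `num_elements` entries is within the
// element limit for a 1D linear texture binding.
bool VectorFitsLinearTexture(std::int64_t num_elements,
                             std::int64_t max_texels = kAssumedMaxTexels)
{
    return num_elements >= 0 && num_elements <= max_texels;
}
```

A caller can use such a check to report a clear error, or to pick a non-texture code path, before invoking CsrMV on a vector that is within INT_MAX but beyond the texture limit.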

alliepiper · 2020-10-21T17:18:36Z

Related, cudaBindTexture has been deprecated (#191) and needs to be replaced.

seunghwak added a commit to seunghwak/cub that referenced this issue May 9, 2019

fix issue NVIDIA#162 (cudaBindTexture returns cudaErrorInvalidValue if a memory block from a pool allocator is passed)

3749499

seunghwak referenced this issue in seunghwak/cugraph May 10, 2019

incorporate a temporary solution till https://github.com/NVlabs/cub/issues/162 is fixed.

e3399a2

seunghwak referenced this issue in seunghwak/cugraph May 10, 2019

applied a fix for https://github.com/NVlabs/cub/issues/162 to cub_semiring

abb3dc8

seunghwak mentioned this issue May 10, 2019

Fixes Issues #161 and #162 #163

Merged

alliepiper added the "type: bug: functional" label (Does not work as intended.) Oct 21, 2020

alliepiper pushed a commit to seunghwak/cub that referenced this issue Jul 30, 2021

fix issue NVIDIA#162 (cudaBindTexture returns cudaErrorInvalidValue if a memory block from a pool allocator is passed)

01980f9

alliepiper mentioned this issue Jul 30, 2021

SPMV Invalid Configuration Argument #280

Closed

alliepiper added this to the 1.14.0 milestone Jul 30, 2021

alliepiper closed this as completed in #163 Jul 30, 2021

alliepiper added a commit that referenced this issue Jul 30, 2021

Merge pull request #163 from seunghwak/bug_ext_memory_pool

9fffdf3

Fixes cub::DeviceSpmv issues #161 and #162
