Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

cudaBindTexture returns cudaErrorInvalidValue if a memory block from a pool allocator is passed. #162

Closed
seunghwak opened this issue May 9, 2019 · 2 comments · Fixed by #163
Labels
type: bug: functional Does not work as intended.
Milestone

Comments

@seunghwak
Copy link
Contributor

in cub/iterator/tex_ref_input_iterator.cuh

 73     /// And by unique ID
 74     template <int UNIQUE_ID>
 75     struct TexId
 76     {
...
 93         /// Bind texture
 94         static cudaError_t BindTexture(void *d_in, size_t &offset)
 95         {
 96             if (d_in)
 97             {
 98                 cudaChannelFormatDesc tex_desc = cudaCreateChannelDesc<TextureWord>();
 99                 ref.channelDesc = tex_desc;
100                 return (CubDebug(cudaBindTexture(&offset, ref, d_in)));
101             }
102 
103             return cudaSuccess;
104         }
...
127     };

cudaBindTexture returns cudaErrorInvalidValue if d_in points a memory block from a pool allocator (the allocated memory block is a sub-region of the much larger block in the pool allocator).

template < class T, int dim, enum cudaTextureReadMode readMode >
__host__ ​cudaError_t cudaBindTexture ( size_t* offset, const texture < T, dim, readMode > & tex, const void* devPtr, size_t size = UINT_MAX )

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__HIGHLEVEL.html#group__CUDART__HIGHLEVEL_1gfaa25560127f9feb99cb5dd6bc4ce2dc

cudaBindTexture has size_t size = UINT_MAX as an input parameter. BindTexture does not provide the size value when calling cudaBindTexture so the default value of UINT_MAX is used. This works when devPtr is from cudaMalloc but fails when devPtr is from a pool allocator which allocates a much bigger chunk first and assigns sub-regions to memory allocation calls. The CUDA documentation does not explicitly state what happens when devPtr + size passes the boundary of the memory block, but based on experiments, I am guessing that this function truncates size if devPtr + size passes the memory block boundary.

size / sizeof(T) cannot exceed cudaDeviceProp::maxTexture1DLinear[0], and based on https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities (Table 14), size / sizeof(T) cannot exceed 2^27. UINT_MAX is 2^32-1 which is larger than 2^27. So, I am assuming that cudaBindTexture relies on properly identifying the end of the memory block pointed by devPtr when size is omitted in function call. But this does not work properly for a pool allocator.

TexRefInputIterator's BindTexture has size_t bytes=size_t(-1) as an input parameter but does not use this input parameter. I think size needs to be passed to TexId's BindTexture (quoted above) and finally to cudaBindTexture to work properly with a pool allocator as well.

seunghwak added a commit to seunghwak/cub that referenced this issue May 9, 2019
…f a memory block from a pool allocator is passed)
@seunghwak
Copy link
Contributor Author

seunghwak commented May 13, 2019

And due to the fact that cudaBindTexture works only up to 2^27 elements in the currently available architectures, cub::DeviceSpmv::CsrMV returns cudaErrorInvalidValue if ValueT * d_vector_x has more than 2^27 elements while users may expect this to work till 2^31 -1 elements (INT_MAX).

@alliepiper
Copy link
Collaborator

Related, cudaBindTexture has been deprecated (#191) and needs to be replaced.

@alliepiper alliepiper added the type: bug: functional Does not work as intended. label Oct 21, 2020
alliepiper pushed a commit to seunghwak/cub that referenced this issue Jul 30, 2021
…f a memory block from a pool allocator is passed)
@alliepiper alliepiper added this to the 1.14.0 milestone Jul 30, 2021
alliepiper added a commit that referenced this issue Jul 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: bug: functional Does not work as intended.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants