clBLAS-client --cpu gives CL_INVALID_COMMAND_QUEUE error on OS X #187

Open
GOFAI opened this issue Nov 15, 2015 · 6 comments

Comments

@GOFAI

GOFAI commented Nov 15, 2015

I built clBLAS 2.8 on my 13" MacBook Pro (Retina, late 2014, OS X 10.10.5) using a Homebrew formula I tweaked from the one in the homebrew-science tap. While everything seemed to install fine, I get the following error when I try to check it with clBLAS-client:

JOHNNIAC:clBlas Walrus$ clBLAS-client --cpu
OpenCL error -36 on line 350 of /tmp/clblas20151115-2237-z21tyj/clBLAS-2.8/src/library/blas/xgemm.cc
Assertion failed: (false), function clblasGemm, file /tmp/clblas20151115-2237-z21tyj/clBLAS-2.8/src/library/blas/xgemm.cc, line 350.
Abort trap: 6

-36 is apparently CL_INVALID_COMMAND_QUEUE. Any idea what's going awry here?

@GOFAI
Author

GOFAI commented Nov 15, 2015

clBLAS-client --gpu gives the exact same message, fwiw.

@hughperkins
Contributor

(Just in case it's useful: CL_INVALID_COMMAND_QUEUE usually means that a kernel read or wrote outside the bounds of an array.)

@GOFAI
Author

GOFAI commented Nov 16, 2015

Line 350 of /tmp/clblas20151115-2237-z21tyj/clBLAS-2.8/src/library/blas/xgemm.cc:

err = clGetCommandQueueInfo( commandQueues[0], CL_QUEUE_DEVICE, sizeof(clDevice), &clDevice, NULL);
  CL_CHECK(err)

I suppose commandQueues[0] doesn't point to a valid command-queue. It also turns out that functions other than gemm get further along before also ending up with a CL_INVALID_COMMAND_QUEUE:

JOHNNIAC:glasstone Walrus$ clblas-client -f gemv
    StatisticalTimer:: Pruning 1 samples from clfunc
    StatisticalTimer:: Pruning 0 samples from clGemv
BLAS kernel execution time < ns >: 38210.5
BLAS kernel execution Gflops < 2.0*M*N/time >: 0.0857565
OPENCL_V_THROWERROR< CL_INVALID_COMMAND_QUEUE > (274): releasing command queue
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: OPENCL_V_THROWERROR< CL_INVALID_COMMAND_QUEUE > (274): releasing command queue
Abort trap: 6

@hughperkins
Contributor

It's normally because you wrote off the end of an array. You can have a look at #108 for an example of how to debug this.

Edit: but basically the concept is:

  • put a return; right at the start of the kernel, and verify that the command queue no longer crashes
  • move the return; forwards and backwards, until you find the first line that triggers the crash
  • then play around with commenting stuff out, or modifying things, until you can figure out exactly which lines are causing the crash :-)

Edit 2: I recommend turning off OpenCL optimizations; a paragraph in the linked issue gives an example of how to do this.
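
The "move the return; forwards and backwards" step in the bullets above is effectively a bisection, and could be sketched as follows. This is an illustration under assumptions, not something from clBLAS: `first_crashing_line` and the predicate `crashes_with_return_after` are hypothetical names, and the predicate stands in for the manual work of editing the kernel, rebuilding, and running it. It also assumes the crash is deterministic and that a `return;` at the very start of the kernel has already been checked not to crash.

```cpp
#include <functional>

// Hypothetical helper: find the first kernel line whose execution triggers
// the crash. crashes_with_return_after(k) models "place `return;` after
// line k, rebuild, run, and report whether the command queue crashed".
// Assumption: the predicate is false for every k before the culprit line
// and true from the culprit onwards (one deterministic culprit).
// Returns a 1-based line number, or 0 if the full kernel does not crash.
int first_crashing_line(int num_lines,
                        const std::function<bool(int)>& crashes_with_return_after) {
    if (num_lines <= 0 || !crashes_with_return_after(num_lines)) {
        return 0;  // nothing to bisect: the unmodified kernel runs cleanly
    }
    int lo = 1;
    int hi = num_lines;  // invariant: the culprit line lies in [lo, hi]
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (crashes_with_return_after(mid)) {
            hi = mid;       // culprit is at or before mid
        } else {
            lo = mid + 1;   // culprit is after mid
        }
    }
    return lo;
}
```

For a kernel of N lines this needs roughly log2(N) rebuild-and-run cycles instead of N. If the crash depends on several lines together, the monotonicity assumption breaks and the commenting-out approach from the last bullet is still needed.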

@GOFAI
Author

GOFAI commented Nov 16, 2015

Thanks! I'm kind of hoping from the problems I'm having that it's not any single kernel, but rather a single problem that's affecting everything.

My impression from scrutinizing clfunc_common.hpp is that the "releasing command queue" message will only appear if the error occurred when the destructor was called on clblasfunc. (I get a variant of that message for all functions BUT gemm).

If that's the case, does it make sense to try to go through all the kernels inserting return;?
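
To illustrate why an error raised during cleanup would only show up after the timing output, here is a toy model. Everything in it is hypothetical: `FakeQueueOwner`, `fake_release`, and the log are stand-ins written for this comment and are not taken from clfunc_common.hpp. It only demonstrates the ordering: the work runs and reports first, and the failed release is noticed afterwards, when the destructor runs.

```cpp
#include <string>
#include <vector>

// Stand-in for a release call that reports CL_INVALID_COMMAND_QUEUE (-36).
int fake_release() { return -36; }

// Hypothetical RAII owner: the release status is only checked in the destructor.
class FakeQueueOwner {
public:
    explicit FakeQueueOwner(std::vector<std::string>& log) : log_(log) {}

    // Models the benchmark body that prints timings before any cleanup.
    void run() { log_.push_back("kernel timed"); }

    ~FakeQueueOwner() {
        // Record the failure instead of throwing: by default, an exception
        // escaping a destructor ends the program via std::terminate.
        if (fake_release() != 0) {
            log_.push_back("error: releasing command queue");
        }
    }

private:
    std::vector<std::string>& log_;
};

// Runs the toy benchmark once and returns the messages in the order produced.
std::vector<std::string> run_once() {
    std::vector<std::string> log;
    {
        FakeQueueOwner owner(log);
        owner.run();
    }  // destructor runs here, after run() has already reported
    return log;
}
```

In this model the log always contains the timing entry before the release error, which is the same ordering as in the gemv output above.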

@hughperkins
Contributor

> My impression from scrutinizing clfunc_common.hpp is that the "releasing command queue" message will only appear if the error occurred when the destructor was called on clblasfunc. (I get a variant of that message for all functions BUT gemm).

Ah, interesting. Fair enough :-)
