segfault on 2.2.0 in dgetrf on ubuntu x86_64 #2
also reproduces in version 2.0.0
fwiw, i get a segfault for any dimension >= 18, but not before
Also get a failure in sgetrf for dim >= 10
Hi @dlwh, let me try to reproduce that locally, it's the first time I've seen it.
I can reproduce with OpenBLAS, but not with Intel MKL. I also can only reproduce if
Here is what I'm observing. When calling When accessing the current thread's stack size and stack base, we can clearly see that this is indeed a stack overflow:
( Now, onto figuring out why Also, when setting [1]
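@dlwh this issue is a repeat of a previously encountered issue with Breeze and netlib-java (so prior to my change). I opened an issue on OpenBLAS.
In the meantime, the workarounds are the following:
- Increase the size of the stack of Java threads with -Xss10M (sets the Java threads' stack size to 10 MB)
- Make sure OpenBLAS doesn't use the parallel implementation by defining the environment variable OPENBLAS_NUM_THREADS=1
- Compile a custom version of OpenBLAS that unconditionally defines USE_ALLOC_HEAP at https://github.com/xianyi/OpenBLAS/blob/develop/lapack/getrf/getrf_parallel.c#L49
I'm exploring the licensing implications of packaging a custom OpenBLAS in the library to avoid having to install it locally, similarly to numpy. That might also be a longer-term solution for this specific issue.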
Huh ok. Thanks! netlib-java stopped working on ubuntu 20.04 since they
stopped shipping gfortran3 and I didn't think to try
…On Thu, May 13, 2021 at 2:52 PM Ludovic Henry ***@***.***> wrote:
@dlwh <https://github.com/dlwh> this issue is a repeat of a previously
encountered issue with Breeze and netlib-java (so prior to my change). I
opened an issue on OpenBLAS.
In the meantime, the workarounds are the following:
- Increase the size of the stack of Java threads with -Xss10M (set the
Java threads' stack size to 10 Mbytes)
- Make sure OpenBLAS doesn't use the parallel implementation by
defining the environment variable OPENBLAS_NUM_THREADS=1
- Compile a custom version of OpenBLAS that unconditionally defines
USE_ALLOC_HEAP at
https://github.com/xianyi/OpenBLAS/blob/develop/lapack/getrf/getrf_parallel.c#L49
I'm exploring the licensing implications of packaging a custom OpenBLAS in
the library to avoid having to install it locally, similarly to numpy. That
might also be a longer-term solution for this specific issue.
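For anyone who can't change the JVM flags or the shell environment globally, the first two workarounds can also be approximated from Java code. The following is only a rough sketch, not something taken from the library: the class name `OpenBlasWorkarounds` and both method names are made up for illustration. It uses the standard `Thread(ThreadGroup, Runnable, String, long)` constructor, whose stack size argument is a hint that the JVM is free to round or ignore, and `ProcessBuilder.environment()` to set OPENBLAS_NUM_THREADS for a child process.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

// Hypothetical helper class, for illustration only.
public final class OpenBlasWorkarounds {
    private OpenBlasWorkarounds() {}

    /**
     * Runs the task on a fresh thread that requests the given stack size,
     * roughly what -Xss does for all threads. The size is only a hint to
     * the JVM; some platforms may round it or ignore it.
     */
    public static <T> T callWithStack(Callable<T> task, long stackBytes) throws Exception {
        FutureTask<T> future = new FutureTask<>(task);
        Thread worker = new Thread(null, future, "big-stack-lapack", stackBytes);
        worker.start();
        // Blocks until the task finishes; a failure in the task surfaces
        // here as an ExecutionException.
        return future.get();
    }

    /**
     * Prepares a child process (for example a second JVM) with
     * OPENBLAS_NUM_THREADS=1 in its environment. The variable is set
     * before the child starts; nothing is launched until start() is called.
     */
    public static ProcessBuilder singleThreadedOpenBlas(List<String> command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.environment().put("OPENBLAS_NUM_THREADS", "1");
        return builder.inheritIO();
    }
}
```

A call such as `OpenBlasWorkarounds.callWithStack(() -> factorize(a), 10L * 1024 * 1024)` would then run a (hypothetical) `factorize` method that ends up in dgetrf on a thread asking for a 10 MB stack. Whether that is enough depends on the matrix size and on how OpenBLAS was built, so treat the number as a starting point.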
reproduces in OpenJDK 64-Bit Server VM, Java 1.8.0_292 and OpenJDK 64-Bit Server VM, Java 16.0.1
There aren't any debug symbols and I'm no expert on assembly, but this is what I'm getting. The first instruction is the segfault.