OpenBLAS related segfaults with Julia 1.7.0 on Linux #892

Closed
carstenbauer opened this issue Dec 2, 2021 · 2 comments

Comments

@carstenbauer
Member

carstenbauer commented Dec 2, 2021

I can reproduce the following segfault with fresh Julia 1.7.0 installs on two different HPC clusters (Red Hat Enterprise Linux) and a regular desktop machine (Ubuntu). Note that on one machine it already segfaults for N=5, while on another I had to set N>20, so try varying N if you can't reproduce it.

(Julia started with multiple threads, e.g. julia -t 8)

julia> n = 1000;

julia> N = 20;

julia> Threads.@threads for i in 1:N
           A = randn(Float64, n, n); inv(A);
       end

signal (11): Segmentation fault
in expression starting at REPL[2]:1
dgetrf_parallel at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
dgetrf_parallel at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
dgetrf_parallel at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
dgetrf_parallel at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
dgetrf_parallel at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
dgetrf_64_ at /cm/shared/apps/pc2/EB-SW/software/Julia/1.7.0-linux-x86_64/bin/../lib/julia/libopenblas64_.so (unknown line)
getrf! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lapack.jl:575
#lu!#146 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:81 [inlined]
lu!##kw at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:81 [inlined]
#lu#153 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:279 [inlined]
lu at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:278 [inlined]
lu at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:278 [inlined]
inv at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/LinearAlgebra/src/dense.jl:876
macro expansion at ./REPL[2]:2 [inlined]
#40#threadsfor_fun at ./threadingconstructs.jl:85
#40#threadsfor_fun at ./threadingconstructs.jl:52
unknown function (ip: 0x1554f0112d5f)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:877
Allocations: 4321396 (Pool: 4319432; Big: 1964); GC: 5
Segmentation fault (core dumped)

The segfault disappears when one sets BLAS.set_num_threads(1).
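
For reference, the workaround can be sketched as follows. This is only a minimal sketch of the reproducer above with OpenBLAS limited to a single thread. The smaller n and the `ok` bookkeeping vector are additions for illustration and are not part of the original report.

```julia
using LinearAlgebra

# Workaround from the report: restrict OpenBLAS to a single thread, so that
# only Julia's own threads run in parallel.
BLAS.set_num_threads(1)

n = 200   # assumption: smaller than the n = 1000 above, to keep the sketch quick
N = 20

# Hypothetical bookkeeping: one slot per iteration, so no two tasks
# ever write to the same element.
ok = fill(false, N)

Threads.@threads for i in 1:N
    A = randn(Float64, n, n)
    Ainv = inv(A)
    # Record that this iteration completed and returned a matrix of the expected size.
    ok[i] = size(Ainv) == (n, n)
end

println("BLAS threads: ", BLAS.get_num_threads())
println("all iterations finished: ", all(ok))
```

As in the report, Julia should be started with multiple threads (e.g. julia -t 8) for the loop to actually run in parallel. Whether single-threaded BLAS is acceptable performance-wise depends on the workload.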

Likely related to https://github.com/JuliaLang/julia/issues/43301 and #886 (perhaps caused by the same underlying issue?). However, this is not really a duplicate, because those issues are all about macOS, and Float64 seems to be fine there.

(Let me note that this is the result of my attempt to create an MWE. Originally I encountered this segfault, as well as StackOverflowErrors (as in the issues linked above), while CI-testing a private package with 1.6.4 and 1.7.0. See https://discourse.julialang.org/t/inv-causes-stack-overflow-on-julia-1-7-0-and-mac-os/72411/10. When switching back to 1.6.3 or 1.7.0-rc1, the issues went away.)

@gbaraldi
Member

gbaraldi commented Dec 2, 2021

Likely fixed by JuliaLang/julia#43300, since these are all similar issues. I get a StackOverflowError, but the new version will probably fix these issues.

@ViralBShah
Member

Closing as a duplicate of the other issues discussed above. We can reopen if the fixes for those don't resolve this one.
