-
-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stegr! call segfault #873
Comments
Hmm, my local reproducer turned out to have been OpenBLAS 0.3.13, not 0.3.17, where I'm having trouble reproducing. The rr traces we have are with 0.3.17 though, so it may not be as simple as just this call on AMD. |
Ok, looks like OpenBLAS is doing an uninitialized read from the work array and using that as an array index bound, which explains, why the reproduction here is sporadic (depends on what's in the work array). For better reproduction, fill the work array with deadbeef first.
|
This was originally reported as https://github.com/JuliaLang/julia/issues/42415. I've tracked this down to an our of bounds read on the following line: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L423 In the crashing example, `M` is `0`, causing `slarrv` to read uninitialized memory from the work array. I believe the `0` for `M` is correct and indeed, the documentation above supports that `M` may be zero: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L113-L116 I believe it may be sufficient to early-out this function as suggested in this PR. However, I have limited context for the full routine here, so I would appreciate a sanity check.
I have a proposed fix in Reference-LAPACK/lapack#625. @andreasnoack if you have more context for what this code does perhaps you could review upstream. |
This was originally reported as https://github.com/JuliaLang/julia/issues/42415. I've tracked this down to an our of bounds read on the following line: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L423 In the crashing example, `M` is `0`, causing `slarrv` to read uninitialized memory from the work array. I believe the `0` for `M` is correct and indeed, the documentation above supports that `M` may be zero: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L113-L116 I believe it may be sufficient to early-out this function as suggested in this PR. However, I have limited context for the full routine here, so I would appreciate a sanity check.
This was originally reported as https://github.com/JuliaLang/julia/issues/42415. I've tracked this down to an our of bounds read on the following line: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L423 In the crashing example, `M` is `0`, causing `slarrv` to read uninitialized memory from the work array. I believe the `0` for `M` is correct and indeed, the documentation above supports that `M` may be zero: https://github.com/Reference-LAPACK/lapack/blob/44ecb6a5ff821b1cbb39f8cc2166cb098e060b4d/SRC/slarrv.f#L113-L116 I believe it may be sufficient to early-out this function as suggested in this PR. However, I have limited context for the full routine here, so I would appreciate a sanity check.
This is fixed upstream now, if somebody wants to integrate the patch in our OpenBLAS build :) |
@Keno Do we know how long this bug has been in the upstream? Once we've integrated the patch into our build, do we need to backport this to 1.6 and/or 1.7? |
The particular utility function has had this bug since it's introduction, but I do not know whether changes upstream or to the tests would have caused this to show up more frequently. It should certainly be safe to backport. |
Okay, here is my attempt to add the patch to Ygg: JuliaPackaging/Yggdrasil#3708 |
…ang/julia#42415, ref Reference-LAPACK/lapack#625)
…ang/julia#42415, ref Reference-LAPACK/lapack#625)
…ang/julia#42415, ref Reference-LAPACK/lapack#625)
…ang/julia#42415, ref Reference-LAPACK/lapack#625)
…ang/julia#42415, ref Reference-LAPACK/lapack#625)
|
Please consider adding the security label, for CVE-2021-4048. |
the link is broken. |
It's a GitHub auto-generated link. |
@DilumAluthge alerted us to segfaults that were occurring on some machines during CI builds while running the LinearAlgebra/tridiag test. Using rr, we were able to trace these segfaults back to a call to
stegr!
in lapack.jl https://github.com/JuliaLang/julia/blob/44d484222005580432433b7889c4a56d25c0ea67/stdlib/LinearAlgebra/src/lapack.jl#L3827While we ran out of time to fully debug and fix this, we have a MWE that replicates the problem on AMD machines.
Here's the MWE:
CC: @Keno
The text was updated successfully, but these errors were encountered: