-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix LAPACK non-orthogonality #842
Conversation
Nice. Thanks for fixing @antoine-levitt ! I wonder if we should avoid the re-orthogonalisation for Float64. A rough check with my recent timings suggests this can increase the cost of the RR by 10% in some cases (where the RR takes a few % of the total runtime). |
If you've got the timings ready, can you check the impact of replacing |
I'd really like to keep doing the "right thing" by default as far as possible. I also have (moderate) hope that this fix will reduce the annoying "hey, LOBPCG has decided we need 100 iterations to converge this now". If it's only a couple of percents of a couple of percents, let's not bother... We can also have a "degraded" faster setting that skips some orthogonalizations, but I'm not very comfortable with the tolerance fiddling... |
There's also zheevd which do not have an interface in LAPACK.jl. When I tested, it was the fastest for matrix size N > 100. I haven't tested the orthogonality, but it should be okay according to https://netlib.org/lapack/lawnspdf/lawn183.pdf
|
Thanks for the input, both. I'll look into the different solvers wrt. timings and stability a bit when I get the time. Regarding the current reorthogonalisation: My tests suggest it's indeed not something we need to worry (we're talking 100ms for calculations that take a few minutes). |
OK, so I tested all three : syev, syevr, syevd (thanks @jaemolihm for sending me the syevd wrapper!). syevd is actually the most orthogonal, always keeping the vectors orthogonal to 1e-5, and often much less than that. Syevr is atrocious, very often producing 1e-4, and sometimes 1e-3! Syev is in the middle, staying within 1e-5 but hitting it more often than syevd. On my test case, without additional orthogonalizations, syevr made the assert crash; syev made the LOBPCG report non-convergence; syevd was completely fine. The rayleigh-ritz step is negligible for small systems and only becomes significant for a large number of bands, where according to @jaemolihm syevd is the fastest. @jaemolihm said he's going to make a PR to julia with the syevd wrapper. So let's keep the ortho! in there, and when the PR to julia is in, use it when VERSION is adequate. In my tests a few years back (almost 10 years now!), ELPA (https://elpa.mpcdf.mpg.de/) was the fastest direct eigensolver around; I wonder if that's still true. No idea about accuracy though. |
Sounds good. @jaemolihm feel free to keep us posted :).
By the way the test failure could be an issue in Julia's generic linear algebra library, but I have not investigated any further. |
Will not break! Will not break! It broke. (https://www.youtube.com/watch?v=qioOho4_ar0)
This fixes the sanity check failing for Float32 that @mfherbst has been seeing.