Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange SymEigsShiftSolver behavior when true eigenvalue is close to zero #126

Open
LTLA opened this issue Sep 6, 2021 · 10 comments
Open

Comments

@LTLA
Copy link

LTLA commented Sep 6, 2021

I have an input matrix where one of the eigenvalues is very close to zero - in fact, I suspect its true value is actually zero - and I am trying to use the recommended shift-and-invert paradigm to identify the smallest eigenvalues. Unfortunately, it seems that this causes some problems with SymEigsShiftSolver. To illustrate, I'll repurpose the example from the link:

#include "Eigen/Core"
#include "Spectra/SymEigsShiftSolver.h"
// <Spectra/MatOp/DenseSymShiftSolve.h> is implicitly included
#include <iostream>
 
using namespace Spectra;
 
int main()
{
    // Create a diagonal matrix with a single off-diagonal close to 1; one eigenvalue is very close to zero.
    Eigen::MatrixXd M = Eigen::MatrixXd::Zero(10, 10);
    for (int i = 0; i < M.rows(); i++) {
        M(i, i) = 1;
    }
    M(1, 0) = 0.999999999999;
 
    // Construct matrix operation object using the wrapper class
    DenseSymShiftSolve<double> op(M);
 
    // Construct eigen solver object with shift 0
    // This will find eigenvalues that are closest to 0
    SymEigsShiftSolver<DenseSymShiftSolve<double>> eigs(op, 3, 6, 0.0);
 
    eigs.init();
    eigs.compute(SortRule::LargestMagn);
    if (eigs.info() == CompInfo::Successful)
    {
        Eigen::VectorXd evalues = eigs.eigenvalues();
        // Will get (3.0, 2.0, 1.0)
        std::cout << "Eigenvalues found:\n" << evalues << std::endl;
    }
 
    return 0;
}

Compiling this with Spectra 1.0.0 and running the subsequent executable gives me:

$ g++ -std=c++17 -I. blah.cpp  # Exact C++ standard doesn't really matter
$ ./a.out 
Eigenvalues found:
          1
9.99978e-13
-0.00120755

I would have expected to get 1, 1 and some small value. In fact, this behavior gets worse as the off-diagonal element gets closer to 1. If I add a couple more 9's onto the end, e.g., M(1, 0) = 0.99999999999999;, I get:

Eigenvalues found:
-1.88223e-16
-6.69222e-16
-7.10329e-16

Conversely, if I take a few 9's off, e.g., M(1, 0) = 0.999999;, I get something more reasonable:

Eigenvalues found:
    1
    1
1e-06

What's the right behavior here? I guess that the shifted-and-inverted value is undefined when sigma and the eigenvalue is zero, so perhaps there is no solution; but I would have hoped for an explicit failure instead of this. Indeed, I get an actual error (factorization failed with the given shift) if I replace the off-diagonal element with 1.

Interestingly enough, RSpectra 0.16-0 (which I assume is running an older version of Spectra) does not have any such problems:

library(RSpectra)
x <- diag(10)
x[2,1] <- 0.999999999999
eigs_sym(x, k=3, which="LM", sigma=0)$values
## [1] 1.000000e+00 1.000000e+00 9.999779e-13

x[2,1] <- 0.99999999999999
eigs_sym(x, k=3, which="LM", sigma=0)$values # A bit wonkier, but good enough
## [1] 1.000000e+00 9.987130e-01 9.992007e-15

Using the regular SymEigsSolver with SortRule::SmallestMagn also seems to work fine.

@yixuan
Copy link
Owner

yixuan commented Sep 12, 2021

I haven't tested the example in details, but theoretically the shift cannot be any of the true eigenvalues. If you know all the eigenvalues are nonnegative and you want to find the smallest one, then you can set the shift to be a small negative number, for example -0.001.

That means you can change the line

SymEigsShiftSolver<DenseSymShiftSolve<double>> eigs(op, 3, 6, 0.0);

to

SymEigsShiftSolver<DenseSymShiftSolve<double>> eigs(op, 3, 6, -0.001);

@LTLA
Copy link
Author

LTLA commented Sep 12, 2021

Thanks @yixuan, using a negative value seems to do the trick. I think my input matrix is always positive semidefinite so it should be safe. It might be worth mentioning that caveat about the shift more prominently in the shift-and-invert documentation, its implications were not obvious. The negative sigma approach would also be useful to describe in the docs; I'd bet that most people looking for the smallest eigenvalues have positive semidefinite matrices as inputs.

@yixuan
Copy link
Owner

yixuan commented Sep 15, 2021

I see. I will improve the documentation in the future. Thanks for the feedback!

@LTLA
Copy link
Author

LTLA commented Sep 30, 2021

Circling back to this. I just tried it out on a much larger matrix, and unfortunately, it didn't work out well. After some effort, I created a reproducible example at https://github.com/LTLA/DebuggingSpectra - the relevant part is in debug1.cpp, executed by running debug1.R (I use R here just to avoid the boilerplate of reading files from disk, the problem itself is in the C++ code).

In debug1.cpp, I try the standard solver with sort-by-smallest-magnitude, which works pretty well (2-3 seconds). Then I try the shift-and-invert, at which point it stalls during the construction of the Spectra::SymEigsShiftSolver. Output looks like this:

Setting up the solver... # this part is for the standard solver
... still setting it up...
Initializing...
Computing...
Done!
 0.00965459
 0.00733508
 0.00564922
3.36169e-17
Setting up the solver... # now trying the shift and invert
... still setting it up... # at this point it's been running for 10 minutes, steadily growing to 4-5 GB of RAM

I should add that it doesn't matter whether I use a negative sigma or a sigma of zero, I still get this stall.

(If you want some - mostly irrelevant - context about where this test data comes from, it's a graph Laplacian constructed from a symmetrized nearest-neighbors graph of the MNIST digits dataset. Some of the code to generate the matrix is provided in debug1.R but requires some effort to manually extract the relevant object within uwot::normalized_laplacian_init.)

@keevindoherty
Copy link

Just wanted to comment that I've been following this thread with interest, as I also struggled initially with setting the shift for eigenvalues close to zero (not realizing that the shift needed to be less than the desired target). Like @LTLA I was also studying graph Laplacians and qualitatively similar matrices. In my case, though, the poor conditioning of the matrices of interest seemed to be the root cause of slow convergence.

While it doesn't address issues with using SymEigsShiftSolver specifically, I thought it would be worth noting that I have had quite a bit of success using Spectra to compute the smallest eigenvalues of PSD matrices using the method outlined here. The idea is to compute the largest magnitude eigenvalue (which is the largest eigenvalue for a PSD matrix), then shift the matrix by twice that value times the identity, i.e. Mshifted = M - 2*lambda_lm*I. The largest magnitude eigenvalues of the shifted matrix are the smallest eigenvalues of M.

@yixuan
Copy link
Owner

yixuan commented Oct 21, 2021

@LTLA I've checked the code and can confirm that the program stalls at the matrix factorization part, which calls the Eigen::SparseLU solver in Eigen. In general the sparse matrix factorization is a pretty hard topic, so it may be helpful to use some external solvers such as SuiteSparse, but that's beyond the scope of Spectra.

In this case I also recommend the approach suggested by @keevindoherty, which works quite well for the matrix in debug1.R. It can be easily done in R:

library(Matrix)
test <- readRDS("debug1.rds")

library(RSpectra)
emax <- eigs_sym(test, 1, opts = list(retvec = FALSE))
test_shift <- test - 2 * emax$values * Diagonal(nrow(test))
e <- eigs_sym(test_shift, 5)
e$values + 2 * emax$values

Output:

[1]  9.897154e-03  9.654586e-03  7.335084e-03  5.649224e-03 -1.065814e-14

@LTLA
Copy link
Author

LTLA commented Oct 25, 2021

Thanks both. It seems that the trick also reports the eigenvectors correctly; I'll try it out with some real data later this week.

@LTLA
Copy link
Author

LTLA commented Oct 28, 2021

After implementing this, and then rereading the comments above, I just realized I managed to misunderstand the math. Instead of subtracting twice the max eigenvalue from the PSD and looking for the largest magnitude eigenvalues, I subtracted the PSD from the max eigenvalue and looked for the largest (signed) eigenvalues. Whoops.

But in my defense, it still works:

library(Matrix)
test <- readRDS("debug1.rds")

library(RSpectra)
emax <- eigs_sym(test, 1, opts = list(retvec = FALSE))
test_shift <- emax$values * Diagonal(nrow(test)) - test
e <- eigs_sym(test_shift, 5)
rev(emax$values - e$values)
## [1]  9.897154e-03  9.654586e-03  7.335084e-03  5.649224e-03 -8.215650e-15

I dunno if there are any implications from doing it this way (e.g., numerical stability, speed). But it is kind of interesting as it opens the door to using alternative algorithms that give the largest eigenvalues but not the largest magnitude eigenvalues.

Regardless of the exact method, if the matrix shifting approach is the way to go, it would be good to throw that in the documentation somewhere, given that both @keevindoherty and I have struggled with the recommended shift-invert approach.

@yixuan
Copy link
Owner

yixuan commented Oct 28, 2021

@LTLA Yes, all these methods are valid in mathematics, but the stability and accuracy may depend on the distribution of all eigenvalues. I agree that it deserves to write a separate document to introduce these tricks.

@jlmelville
Copy link

Sorry to bring this issue back from the dead but I have been in a similar situation to @LTLA (in fact looking at graph laplacians and often using the same MNIST digits as a test case as it reliably causes RSpectra to hang).

I am basically using the shifted approach suggested by @keevindoherty to rule out as many other ways the eigendecomposition can fail, so I am dealing with a graph Laplacian which has been shifted by I * 2 * lambda_max. At this point we know that the condition number of the shifted matrix should be 2, and because it's a graph Laplacian the largest magnitude shifted eigenvalue is close to -2 * lambda_max and the associated eigenvector consists of all 1s. Because we usually want the next-largest eigenvectors and up it seems like those facts about the an eigenvector very close to the desired ones should be useful somehow? Two questions:

  1. Somewhere in the docs (sorry I can't find it so I don't know if it was Spectra itself or Rspectra) I read about how sigma can be used to find eigenvectors associated with specific eigenvalues not just small values. After shifting a matrix by 2 * lambda_max we know that the largest magnitude eigenvector will be close to that value (because there should be a 0 eigenvalue for the unshifted graph laplacian). So would setting sigma = 0.0001 + 2 * lambda_max help even when which = "LM", or is it superfluous? I got the feeling it might be helping but I may be fooling myself.

  2. There is also the opts$initvec vector of which the docs says:

    It may speed up the convergence if initvec is close to an eigenvector of A.

    So I thought I would be clever and use initvec = rep(1, nrow(A)), which is exactly an eigenvector and very close to the eigenvectors I am interested in. This does result in faster convergence but it returns the wrong eigenvectors in several sparse matrices. For a dense10 x 10 shifted laplacian I just tried it on it causes the error TridiagEigen: eigen decomposition failed. I wasn't expecting a vector with identical values would be a good idea but is there any way to use the knowledge of the eigenvector for use with initvec ? Is this another case where we want to be "close" but not "exact" to a solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants