Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EOF tests are failing in CI for Windows #463

Closed
kafitzgerald opened this issue Sep 12, 2023 · 9 comments · Fixed by #516
Closed

EOF tests are failing in CI for Windows #463

kafitzgerald opened this issue Sep 12, 2023 · 9 comments · Fixed by #516
Assignees
Labels
bug Something isn't working

Comments

@kafitzgerald
Copy link
Contributor

Describe the bug
Several of the EOF tests are failing in CI for Windows. You can see this in PR #460 where Windows was added to the testing matrix.

Specifically, there appears to be a sign error in the resulting arrays for the EOF functions.

Expected behavior
Tests should pass.

OS:
Windows

Additional notes:
I only looked into this briefly and didn't see anything offhand.

It does look like the eofs package we depend upon supports Windows.

We probably need to sort this out before merging PR #460.

@cyschneck
Copy link
Contributor

EOFS Documentation

eofs works on Python 2 or 3 on Linux, Windows or OSX

@kafitzgerald
Copy link
Contributor Author

Here's a direct link to where the GeoCAT-comp eofs tests are failing for Windows in CI: https://github.com/NCAR/geocat-comp/actions/runs/6131275834/job/16642482498?pr=460

@cyschneck
Copy link
Contributor

Windows Test Failures: 10 tests

FAILED test/test_stats.py::Test_eof::test_eof_00
FAILED test/test_stats.py::Test_eof::test_eof_deprecated
FAILED test/test_stats.py::Test_eof::test_eof_01
FAILED test/test_stats.py::Test_eof::test_eof_02
FAILED test/test_stats.py::Test_eof::test_eof_14
FAILED test/test_stats.py::Test_eof::test_eof_15
FAILED test/test_stats.py::Test_eof::test_eof_16
FAILED test/test_stats.py::Test_eof::test_eof_n_01
FAILED test/test_stats.py::Test_eof::test_eof_n_03
FAILED test/test_stats.py::Test_eof::test_eof_n_03_1

AssertionError: Arrays are not almost equal to 5 decimals shared among all test failures where the arrays are equal but with sign inverted

E       AssertionError:
E       Arrays are not almost equal to 5 decimals
E
E       Mismatched elements: 16 / 16 (100%)
E       Max absolute difference: 0.5
E       Max relative difference: 2.
E        x: array([[[0.25, 0.25, 0.25, 0.25],
E               [0.25, 0.25, 0.25, 0.25],
E               [0.25, 0.25, 0.25, 0.25],
E               [0.25, 0.25, 0.25, 0.25]]])
E        y: array([[[-0.25, -0.25, -0.25, -0.25],
E               [-0.25, -0.25, -0.25, -0.25],
E               [-0.25, -0.25, -0.25, -0.25],
E               [-0.25, -0.25, -0.25, -0.25]]])

@cyschneck
Copy link
Contributor

stats.py: 244

eofs = solver.eofs(neofs=neofs, eofscaling=eofscaling)

solver defined stats.py:232

data, solver = _generate_eofs_solver(data,
                                         time_dim=time_dim,
                                         weights=weights,
                                         center=center,
                                         ddof=ddof)

_generate_eofs_solver defined stats.py:72

@cyschneck
Copy link
Contributor

cyschneck commented Nov 13, 2023

Potential source of issue: numpy (v. 1.23.5)

Running np.linalg.svd on Windows:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.array([[2, 2], [1, 1]])
>>> u, s, Vh = np.linalg.svd(a, full_matrices=False)
>>> u
array([[-0.89442719, -0.4472136 ],
       [-0.4472136 ,  0.89442719]])
>>> s
array([3.16227766e+00, 1.10062118e-17])
>>> Vh
array([[-0.70710678, -0.70710678],
       [ 0.70710678, -0.70710678]])

Running np.linalg.svd on Linux:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.array([[2, 2], [1,1]])
>>> u, s, Vh = np.linalg.svd(a, full_matrices=False)
>>> u
array([[-0.89442719, -0.4472136 ],
       [-0.4472136 ,  0.89442719]])
>>> s
array([3.16227766, 0.        ])
>>> Vh
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

Across platforms, the output of S (Vectors with singular vectors) is different. On Windows S=array([3.16227766e+00, 1.10062118e-17]) and on Linux S=array([3.16227766, 0. ])

@philipc2
Copy link
Member

Potential source of issue: numpy (v. 1.23.5)

Running np.linalg.svd on Windows:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.array([[2, 2], [1, 1]])
>>> u, s, Vh = np.linalg.svd(a, full_matrices=False)
>>> u
array([[-0.89442719, -0.4472136 ],
       [-0.4472136 ,  0.89442719]])
>>> s
array([3.16227766e+00, 1.10062118e-17])
>>> Vh
array([[-0.70710678, -0.70710678],
       [ 0.70710678, -0.70710678]])

Running np.linalg.svd on Linux:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.array([[2, 2], [1,1]])
>>> u, s, Vh = np.linalg.svd(a, full_matrices=False)
>>> u
array([[-0.89442719, -0.4472136 ],
       [-0.4472136 ,  0.89442719]])
>>> s
array([3.16227766, 0.        ])
>>> Vh
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

Across platforms, the output of S (Vectors with singular vectors) is different. On Windows S=array([3.16227766e+00, 1.10062118e-17]) and on Linux S=array([3.16227766, 0. ])

For the resulting S, are you sure the numbers are different? It looks like on Windows its printing all the significant figures, while on Linux it truncates it when printing.

@cyschneck
Copy link
Contributor

cyschneck commented Nov 13, 2023

Using the same data array on both Linux and Windows: [[2, 2], [1,1]]

Windows:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> d = np.array([[2, 2], [1, 1]])
>>> u, s, vH = np.linalg.svd(d, full_matrices=False)
>>> vH
array([[-0.70710678, -0.70710678],
       [ 0.70710678, -0.70710678]])

Linux:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> d = np.array([[2, 2], [1, 1]])
>>> u, s, vH = np.linalg.svd(d, full_matrices=False)
>>> vH
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

The second values are flipped in Linux from Windows from [ 0.70710678, -0.70710678] to [-0.70710678, 0.70710678]

@cyschneck
Copy link
Contributor

cyschneck commented Nov 14, 2023

Seems like this is a result of when having duplicate singular values the SVD is not unique

When you have duplicate singular values, as you do here, the SVD is not unique. The vectors associated with the duplicate singular values can be rotated freely. Different versions of the underlying linear algebra library may take different paths and return different choices in such cases. Both versions of the returned matrices are correct.

This can be also seen when using scipy.linalg.svd that gives the same output as np.linalg.svd

Windows:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:29:11) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import scipy
>>> d = np.array([[2, 2], [1, 1]])
>>> u, s, vH = scipy.linalg.svd(d, full_matrices=False)
>>> vH
array([[-0.70710678, -0.70710678],
       [ 0.70710678, -0.70710678]])

Linux:

Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import scipy
>>> d = np.array([[2, 2], [1, 1]])
>>> u, s, vH = scipy.linalg.svd(d, full_matrices=False)
>>> vH
array([[-0.70710678, -0.70710678],
       [-0.70710678,  0.70710678]])

The magnitude of values is correct, but the signs can differ

In order to check that the SVD is correct you need to check that the matrices u and v are indeed unitary and that x = u @ np.diag(s) @ vh. If both conditions hold, than the SVD is correct, otherwise it isn't.

SVD of a non-square matrix is not unique in U and V. Even if you have a square matrix with non-zero, non-degenerate singular values, singular vectors in U and V are only unique up to a sign factor

@cyschneck
Copy link
Contributor

cyschneck commented Nov 15, 2023

From the NCL Graphics: EOFS

Note on signs of EOF analysis (contributed by Andrew Dawson, U. East Anglia)

EOFs are eigenvectors of the covariance matrix formed from the input data. Since an eigenvector can be multiplied by any scalar and still remain an eigenvector, the sign is arbitrary. In a mathematical sense the sign of an eigenvector is rather unimportant. This is why the EOF analysis may yield different signed EOFs for slightly different inputs. Sign only becomes an issue when you wish to interpret the physical meaning (if any) of an eigenvector.

You should approach the interpretation of EOFs by looking at both the EOF pattern and the associated time series together. For example, consider an EOF of sea surface temperature. If your EOF has a positive centre and the associated time series is increasing, then you will interpret this centre as a warming signal. If your EOF had come out the other sign (ie. a negative centre), then the associated time series would also be the opposite sign and you would still interpret the centre as a warming signal.

In essence, the sign flip does not change the physical interpretation of the result. Hence, it is up to you to choose which sign to associate with your EOF patterns for visualisation (remembering that any sign change to an EOF must be applied to the associated time series also). Usually you would simply adjust the sign so that all your EOF patterns with the same physical interpretation also look the same

It is possible that the slightly different way that Windows rounds and handles numbers on the back end is resulting in slightly different inputs that are yielding different signed EOFs. But it appears that the sign of the array might be irrelevant to the stats.py output based on how EOFs work so the tests might currently be too strict

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants