
Fix scipy.special.logsumexp type promotion, test against NumPy and SciPy nightlies #643

Merged
merged 7 commits into from
Dec 2, 2024

Conversation

@agriyakhetarpal (Collaborator) commented Sep 25, 2024

This PR enables testing against NumPy and SciPy nightlies, which were added in #632 and later disabled so they could be split out here. I've also fixed a test in which shapes were cast incorrectly (scalars were not being treated as arrays) within the combo_check function, so all tests should now pass on both nightly and stable versions.

Related to #630
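
For context, the gist of the fix is normalising scalar arguments before shapes are compared. A minimal sketch of the idea, with an illustrative helper name (not the actual diff):

```python
import numpy as np

def _as_array(x):
    # Hypothetical helper (not the actual change): np.asarray leaves
    # ndarrays untouched and promotes scalars to 0-d arrays, so the
    # downstream shape checks see a consistent type.
    return np.asarray(x)

# Inside combo_check, each candidate argument would be normalised
# before the numerical comparison:
args = [1.5, np.array([1.0, 2.0])]
args = [_as_array(a) for a in args]
assert all(isinstance(a, np.ndarray) for a in args)
print([a.shape for a in args])  # [(), (2,)]
```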

@agriyakhetarpal (Collaborator, Author)

The great thing here is that the previously failing test for PyPy 3.9 on Windows no longer fails, because the NumPy development versions available on https://anaconda.org/scientific-python-nightly-wheels/numpy/files no longer support PyPy 3.9 (so installation fails before the tests run). The PyPy 3.10 tests on Windows pass, as expected.

I think what's left to do here is to add some logic to determine when to run the nightly tests and to skip them if a wheel is not available for the desired Python implementation + version combination. This might require a bit of hacking and setting up a dynamic GHA matrix with some JSON (sketched below) – I should be able to figure that out.
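
One common pattern is to have a setup job emit the matrix as JSON and feed it to later jobs via fromJSON. A rough Python sketch with a placeholder availability rule (the real check would query the nightly wheel index):

```python
import json

# Placeholder availability rule – real logic would query the nightly
# wheel index for each implementation/version pair.
candidates = ["3.11", "3.12", "3.13", "pypy-3.10"]
available = [v for v in candidates if not v.startswith("pypy")]

# A setup step would append this to $GITHUB_OUTPUT, e.g.
#   echo "matrix=$(python gen_matrix.py)" >> "$GITHUB_OUTPUT"
# and downstream jobs would consume it via fromJSON(...).
print(json.dumps({"python-version": available}))
```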

@agriyakhetarpal marked this pull request as draft on September 25, 2024, 19:30
@fjosw (Collaborator) commented Oct 13, 2024

Hey @agriyakhetarpal, sorry, I got a bit lost and am trying to catch up. I had a look at the failing Windows PyPy test and it looks quite odd to me. It is still not clear to me to what degree we need to explicitly ensure that the code also works properly with PyPy, considering the additional maintenance cost that comes with it. From my perspective, it would probably be more valuable to focus on potential issues with future CPython and NumPy versions, so I appreciate your work towards nightly tests. Is there any explicit input you need for this PR?

@agriyakhetarpal (Collaborator, Author)

Thanks for offering to see this through, @fjosw! As discussed on 14/10/2024, I'll rework this PR to make it simpler. Putting the plan out in public here:

  • Since autograd is a pure Python package, we'll test the NumPy nightlies against just the latest Python version for now (3.13 at the time of writing) on all platforms (see the install sketch after this list), instead of trying to devise a convoluted solution that tests every Python version available for the nightlies. If that strains our CI (I don't think it will, since our CI is quite fast now), we'll test just Linux and Python 3.13. We'll avoid testing against PyPy for the nightlies, since we are not sure we'd have enough time to fix and escalate bugs that can arise on a less conventional Python implementation.
  • We'll test against free-threaded NumPy on Python 3.13t locally first (I have yet to do this on my M-series macOS device). If there's a bug, I'll file a bug report with NumPy upstream, and we can set up a CI job with free-threaded wheels for the three platforms that have them. https://py-free-threading.github.io/ is a very helpful resource about free-threading and offers advice on how to set up related testing infrastructure.
  • Since we are testing against the nightlies anyway, we should be fine with a slightly more relaxed upper bound on NumPy than the recommended N+3 minor version (2.4). We can leave it as <3 or ~2, which helps us since we are in maintenance mode right now.
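
For the first point, a sketch of the nightly installation step; the index URL is the pip-facing side of the scientific-python-nightly-wheels channel linked earlier, and it's wrapped in Python only so it could live in a helper script:

```python
import subprocess
import sys

# Install NumPy and SciPy nightlies from the Scientific Python
# nightly wheel index; --pre allows development (pre-release) versions.
NIGHTLY_INDEX = "https://pypi.anaconda.org/scientific-python-nightly-wheels/simple"
subprocess.check_call([
    sys.executable, "-m", "pip", "install", "--pre", "--upgrade",
    "--extra-index-url", NIGHTLY_INDEX,
    "numpy", "scipy",
])
```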

I'll split out the second and third points into their own, separate PRs. We should be good to go with this PR once I address the first point!

@agriyakhetarpal (Collaborator, Author)

The test_logsumexp1 failure looks quite cryptic to me...
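
For context, NumPy 2 adopted NEP 50 promotion rules, which change result dtypes in mixed scalar/array expressions. An illustrative example of that class of change (not necessarily the failure here):

```python
import numpy as np

arr = np.ones(3, dtype=np.float32)
scalar = np.float64(2.0)
# Legacy (NumPy 1.x) value-based promotion kept this result float32;
# under NEP 50 (NumPy 2.x) the NumPy scalar's dtype participates
# fully in promotion, so the result becomes float64.
print((arr + scalar).dtype)
```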

@agriyakhetarpal changed the title from "Add testing against NumPy nightlies" to "Fix scipy.special.logsumexp type promotion, test against NumPy and SciPy nightlies" on Dec 1, 2024
@agriyakhetarpal (Collaborator, Author)

I fixed the tests, and all of them are passing now! Two things I observed:

  • The combo_check function predates many improvements made to the testing infrastructure over recent years and is quite dated now. I propose that we refactor the test suite to make better use of pytest, paving the way for a nicer experience for maintainers and new contributors.
  • When running the tests, the tracebacks do not go back into NumPy/SciPy APIs, but stop at our code itself (I haven't looked into the details as to why).

The combination of these two things can sometimes hinder debugging (at least that was my experience when fixing the failing test) – I had to check multiple angles (fwd vs. rev modes, randn, keepdims set to True vs. False, combo_check) before arriving at a conclusion. If we were to deprecate the combo_check function and switch to a few simple pytest parametrization decorators, we would know exactly which sub-test fails within a test, and where – i.e., the failure would be isolated for us instead of us having to do it ourselves. A rough sketch follows below.
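
A hedged sketch of what a parametrized replacement could look like, assuming autograd's existing check_grads test helper (the parameter values are illustrative):

```python
import numpy as np
import pytest
from autograd.scipy.special import logsumexp
from autograd.test_util import check_grads

# Each (shape, keepdims) pair becomes its own test item, so a failure
# report names the exact combination instead of a monolithic test.
@pytest.mark.parametrize("keepdims", [True, False])
@pytest.mark.parametrize("shape", [(4,), (3, 4)])
def test_logsumexp_grads(shape, keepdims):
    x = np.random.randn(*shape)
    check_grads(lambda a: logsumexp(a, keepdims=keepdims), modes=["fwd", "rev"])(x)
```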

Would something like this be a reasonable proposition?

@agriyakhetarpal marked this pull request as ready for review on December 1, 2024, 14:31
@j-towns (Collaborator) commented Dec 2, 2024

The combo_check function predates many improvements made to the testing infrastructure over recent years and is quite dated now. I propose that we refactor the test suite to make better use of pytest, paving the way for a nicer experience for maintainers and new contributors.

I agree with this idea. Presumably we should be able to get roughly equivalent tests using pytest parameterized tests. If you do implement this it might be worth checking that it doesn't slow the tests down (I like the fact that our tests are really fast).

When running the tests, the tracebacks do not go back into NumPy/SciPy APIs, but stop at our code itself (I haven't looked into the details as to why).

This might be from an error occurring during the backward pass of reverse-mode autodiff. When an error occurs during the backward pass you can't get a traceback to the corresponding point in user code. It might be something else though - feel free to paste an example if you hit this again and I can look into it.
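
To illustrate, a hedged sketch using autograd's public extension API, with a contrived primitive whose VJP deliberately fails; the error only fires during the backward pass, after user code has returned:

```python
from autograd import grad
from autograd.extend import primitive, defvjp

@primitive
def double(x):
    return 2.0 * x

def make_bad_vjp(ans, x):
    def vjp(g):
        # This raises during the backward pass, so the traceback ends
        # in autograd's backward-pass machinery rather than at the
        # line of user code that built the computation.
        raise RuntimeError("failure inside the VJP")
    return vjp

defvjp(double, make_bad_vjp)

try:
    grad(double)(3.0)
except RuntimeError as err:
    print("raised during the backward pass:", err)
```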

@agriyakhetarpal (Collaborator, Author)

I agree with this idea. Presumably we should be able to get roughly equivalent tests using pytest parameterized tests. If you do implement this it might be worth checking that it doesn't slow the tests down (I like the fact that our tests are really fast).

Yes, I don't think it will slow the tests down – on the contrary, it should make them slightly faster. Right now, the combo_check function has a single-threaded implementation, so even though we run tests in parallel with pytest-xdist, a test that uses combo_check is locked to a single worker. If we replace such a test case with parametrized arguments, pytest should be able to parallelise them better, and more workers can complete shorter tests and free themselves earlier. Locally, I have also noticed that gathering enough workers takes longer than running the tests themselves, due to Python's abstractions over system calls and the like (-n 0 is sometimes much faster for me than -n 1!)

This might be from an error occurring during the backward pass of reverse-mode autodiff. When an error occurs during the backward pass you can't get a traceback to the corresponding point in user code. It might be something else though - feel free to paste an example if you hit this again and I can look into it.

Will do! I shall open a separate issue for this right now, though one example is the reverse-mode case for logsumexp that was failing here prior to the change I made.

Thank you for the review!

@agriyakhetarpal merged commit b204c2f into master on Dec 2, 2024
34 checks passed
@agriyakhetarpal deleted the feat/nightly-numpy-tests branch on December 2, 2024, 13:39
agriyakhetarpal added a commit that referenced this pull request Dec 2, 2024
This change is as discussed between Jamie, Fabian, and me, and was subsequently posted publicly in #643 (comment). Pinning to NumPy <= 2.3 would also have been an ideal option, but we have opted for the more relaxed NumPy <3 pin, owing to the fact that we have limited availability to add new features and hence fewer chances of anything major breaking.

This change is being merged after #643, and has been split from it
because it is not directly related to the proposition of adding tests
against nightly dependencies.