
Fix scipy.special.logsumexp type promotion, test against NumPy and SciPy nightlies #643

Merged
merged 7 commits into from
Dec 2, 2024

Conversation

@agriyakhetarpal (Collaborator) commented Sep 25, 2024

This PR enables testing against NumPy and SciPy nightlies, which were added in #632 and later disabled so they could be split out here. I've also fixed a test in which shapes were cast incorrectly (scalars were not being treated as arrays) within the combo_check function, so all tests should now pass on both nightly and stable versions.

Related to #630
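
For context, the gist of the fix is normalising scalar arguments before shapes are compared. A minimal sketch of the idea, with an illustrative helper name (not the actual diff):

```python
import numpy as np

def _as_array(x):
    # Hypothetical helper (not the actual change): np.asarray leaves
    # ndarrays untouched and promotes scalars to 0-d arrays, so the
    # downstream shape checks see a consistent type.
    return np.asarray(x)

# Inside combo_check, each candidate argument would be normalised
# before the numerical comparison:
args = [1.5, np.array([1.0, 2.0])]
args = [_as_array(a) for a in args]
assert all(isinstance(a, np.ndarray) for a in args)
print([a.shape for a in args])  # [(), (2,)]
```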

@agriyakhetarpal (Collaborator, Author)

The great thing here is that the previously failing test for PyPy 3.9 on Windows no longer fails, because the NumPy development versions available on https://anaconda.org/scientific-python-nightly-wheels/numpy/files no longer support PyPy 3.9 (so installation fails before the tests run). The PyPy 3.10 tests on Windows pass, as expected.

I think what's left to do here is to add some logic to determine when to run the nightly tests and to skip them if a wheel is not available for the desired Python implementation + version combination. This might require a bit of hacking and setting up a dynamic GHA matrix with some JSON (sketched below) – I should be able to figure that out.
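
One common pattern is to have a setup job emit the matrix as JSON and feed it to later jobs via fromJSON. A rough Python sketch with a placeholder availability rule (the real check would query the nightly wheel index):

```python
import json

# Placeholder availability rule – real logic would query the nightly
# wheel index for each implementation/version pair.
candidates = ["3.11", "3.12", "3.13", "pypy-3.10"]
available = [v for v in candidates if not v.startswith("pypy")]

# A setup step would append this to $GITHUB_OUTPUT, e.g.
#   echo "matrix=$(python gen_matrix.py)" >> "$GITHUB_OUTPUT"
# and downstream jobs would consume it via fromJSON(...).
print(json.dumps({"python-version": available}))
```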

@agriyakhetarpal marked this pull request as draft on September 25, 2024, 19:30
@fjosw (Collaborator) commented Oct 13, 2024

Hey @agriyakhetarpal, sorry, I got a bit lost and am trying to catch up. I had a look at the failing Windows PyPy test and it looks quite odd to me. It is still not clear to me to what degree we need to explicitly ensure that the code also works properly with PyPy, considering the additional maintenance cost that comes with it. From my perspective, it would probably be more valuable to focus on potential issues with future CPython and NumPy versions, so I appreciate your work towards nightly tests. Is there any explicit input you need for this PR?

@agriyakhetarpal (Collaborator, Author)

Thanks for offering to see this through, @fjosw! As discussed on 14/10/2024, I'll rework this PR to make it simpler. Putting the plan out in public here:

  • Since autograd is a pure Python package, we'll test the NumPy nightlies against just the latest Python version for now (3.13 at the time of writing) on all platforms (see the install sketch after this list), instead of trying to devise a convoluted solution that tests every Python version available for the nightlies. If that strains our CI (I don't think it will, since our CI is quite fast now), we'll test just Linux and Python 3.13. We'll avoid testing against PyPy for the nightlies, since we are not sure we'd have enough time to fix and escalate bugs that can arise on a less conventional Python implementation.
  • We'll test against free-threaded NumPy on Python 3.13t locally first (I have yet to do this on my M-series macOS device). If there's a bug, I'll file a bug report with NumPy upstream, and we can set up a CI job with free-threaded wheels for the three platforms that have them. https://py-free-threading.github.io/ is a very helpful resource about free-threading and offers advice on how to set up related testing infrastructure.
  • Since we are testing against the nightlies anyway, we should be fine with a slightly more relaxed upper bound on NumPy than the recommended N+3 minor version (2.4). We can leave it as <3 or ~2, which helps us since we are in maintenance mode right now.
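
For the first point, a sketch of the nightly installation step; the index URL is the pip-facing side of the scientific-python-nightly-wheels channel linked earlier, and it's wrapped in Python only so it could live in a helper script:

```python
import subprocess
import sys

# Install NumPy and SciPy nightlies from the Scientific Python
# nightly wheel index; --pre allows development (pre-release) versions.
NIGHTLY_INDEX = "https://pypi.anaconda.org/scientific-python-nightly-wheels/simple"
subprocess.check_call([
    sys.executable, "-m", "pip", "install", "--pre", "--upgrade",
    "--extra-index-url", NIGHTLY_INDEX,
    "numpy", "scipy",
])
```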

I'll split out the second and third points into their own, separate PRs. We should be good to go with this PR once I address the first point!

@agriyakhetarpal (Collaborator, Author)

The test_logsumexp1 failure looks quite cryptic to me...
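
For context, NumPy 2 adopted NEP 50 promotion rules, which change result dtypes in mixed scalar/array expressions. An illustrative example of that class of change (not necessarily the failure here):

```python
import numpy as np

arr = np.ones(3, dtype=np.float32)
scalar = np.float64(2.0)
# Legacy (NumPy 1.x) value-based promotion kept this result float32;
# under NEP 50 (NumPy 2.x) the NumPy scalar's dtype participates
# fully in promotion, so the result becomes float64.
print((arr + scalar).dtype)
```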

@agriyakhetarpal changed the title from "Add testing against NumPy nightlies" to "Fix scipy.special.logsumexp type promotion, test against NumPy and SciPy nightlies" on Dec 1, 2024
@agriyakhetarpal (Collaborator, Author)

I fixed the tests, and all of them are passing now! Two things I observed:

  • The combo_check function predates many improvements made to the testing infrastructure over recent years and is quite dated now. I propose that we refactor the test suite to make better use of pytest, paving the way for a nicer experience for maintainers and new contributors.
  • When running the tests, the tracebacks do not go back into NumPy/SciPy APIs, but stop at our code itself (I haven't looked into the details as to why).

The combination of these two things can sometimes hinder debugging (at least that was my experience when fixing the failing test) – I had to check multiple angles (fwd vs. rev modes, randn, keepdims set to True vs. False, combo_check) before arriving at a conclusion. If we were to deprecate the combo_check function and switch to a few simple pytest parametrization decorators, we would know exactly which sub-test fails within a test, and where – i.e., the failure would be isolated for us instead of us having to do it ourselves. A rough sketch follows below.
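
A hedged sketch of what a parametrized replacement could look like, assuming autograd's existing check_grads test helper (the parameter values are illustrative):

```python
import numpy as np
import pytest
from autograd.scipy.special import logsumexp
from autograd.test_util import check_grads

# Each (shape, keepdims) pair becomes its own test item, so a failure
# report names the exact combination instead of a monolithic test.
@pytest.mark.parametrize("keepdims", [True, False])
@pytest.mark.parametrize("shape", [(4,), (3, 4)])
def test_logsumexp_grads(shape, keepdims):
    x = np.random.randn(*shape)
    check_grads(lambda a: logsumexp(a, keepdims=keepdims), modes=["fwd", "rev"])(x)
```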

Would something like this be a reasonable proposition?

@agriyakhetarpal marked this pull request as ready for review on December 1, 2024, 14:31
@j-towns (Collaborator) commented Dec 2, 2024

The combo_check function predates many improvements made to the testing infrastructure over recent years and is quite dated now. I propose that we refactor the test suite to make better use of pytest, paving the way for a nicer experience for maintainers and new contributors.

I agree with this idea. Presumably we should be able to get roughly equivalent tests using pytest parameterized tests. If you do implement this it might be worth checking that it doesn't slow the tests down (I like the fact that our tests are really fast).

When running the tests, the tracebacks do not go back into NumPy/SciPy APIs, but stop at our code itself (I haven't looked into the details as to why).

This might be from an error occurring during the backward pass of reverse-mode autodiff. When an error occurs during the backward pass you can't get a traceback to the corresponding point in user code. It might be something else though - feel free to paste an example if you hit this again and I can look into it.
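
To illustrate, a hedged sketch using autograd's public extension API, with a contrived primitive whose VJP deliberately fails; the error only fires during the backward pass, after user code has returned:

```python
from autograd import grad
from autograd.extend import primitive, defvjp

@primitive
def double(x):
    return 2.0 * x

def make_bad_vjp(ans, x):
    def vjp(g):
        # This raises during the backward pass, so the traceback ends
        # in autograd's backward-pass machinery rather than at the
        # line of user code that built the computation.
        raise RuntimeError("failure inside the VJP")
    return vjp

defvjp(double, make_bad_vjp)

try:
    grad(double)(3.0)
except RuntimeError as err:
    print("raised during the backward pass:", err)
```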

@agriyakhetarpal (Collaborator, Author)

I agree with this idea. Presumably we should be able to get roughly equivalent tests using pytest parameterized tests. If you do implement this it might be worth checking that it doesn't slow the tests down (I like the fact that our tests are really fast).

Yes, I don't think it will slow the tests down – on the contrary, it should make them slightly faster. Right now, the combo_check function has a single-threaded implementation, so even though we run tests in parallel with pytest-xdist, a test that uses combo_check is locked to a single worker. If we replace such a test case with parametrized arguments, pytest should be able to parallelise them better, and more workers can complete shorter tests and free themselves earlier. Locally, I have also noticed that gathering enough workers takes longer than running the tests themselves, due to Python's abstractions over system calls and the like (-n 0 is sometimes much faster for me than -n 1!)

This might be from an error occurring during the backward pass of reverse-mode autodiff. When an error occurs during the backward pass you can't get a traceback to the corresponding point in user code. It might be something else though - feel free to paste an example if you hit this again and I can look into it.

Will do! I shall open a separate issue for this right now, though one example is the reverse-mode case for logsumexp that was failing here prior to the change I made.

Thank you for the review!

@agriyakhetarpal merged commit b204c2f into master on Dec 2, 2024
34 checks passed
@agriyakhetarpal deleted the feat/nightly-numpy-tests branch on December 2, 2024, 13:39
agriyakhetarpal added a commit that referenced this pull request Dec 2, 2024
This change is as discussed between Jamie, Fabian, and me, and was subsequently posted publicly in #643 (comment). Pinning to NumPy <= 2.3 would also have been an ideal option, but we have opted for the more relaxed NumPy <3 pin, owing to the fact that we have limited availability to add new features and hence fewer chances of anything major breaking.

This change is being merged after #643, and has been split from it
because it is not directly related to the proposition of adding tests
against nightly dependencies.