Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run prototype datasets test also on Windows and macOS? #5550

Closed
pmeier opened this issue Mar 4, 2022 · 3 comments · Fixed by #5914
Closed

Run prototype datasets test also on Windows and macOS? #5550

pmeier opened this issue Mar 4, 2022 · 3 comments · Fixed by #5914

Comments

@pmeier
Copy link
Collaborator

pmeier commented Mar 4, 2022

There are now three separate occasions of bugs in the prototype datasets that were not picked up by our CI.

We didn't notice them, because they happen on Windows on macOS, but we only run the prototype tests on Linux. Given that the prototypes work with files and the path handling is different on all platforms, this can be problematic.

The torchdata team brought the failures to our attention, since they run our dataset tests on the full matrix. I propose we also run the prototype datasets tests on Windows and macOS to avoid these post mortems. To limit the extra needed CI resources I would only run the tests only on Python3.7.

That means we are looking at 2 extra CI runs per PR with roughly 10 minutes runtime.

cc @pmeier @seemethere @bjuncek

@pmeier
Copy link
Collaborator Author

pmeier commented Mar 25, 2022

After #5618, the total runtime decreased to about 5 minutes, in which the largest chunk is taken up by building rather than testing.

@pmeier
Copy link
Collaborator Author

pmeier commented Apr 28, 2022

Another error not picked up by our CI: #5801 (comment).

@datumbox
Copy link
Contributor

datumbox commented Apr 28, 2022

As discussed offline with @pmeier, this is the 4th time that platform specific issues creep in on datasets and need to be tested on each platform. Given that the prototype execution time is only 5 minutes, adding 2 jobs for Windows and macOS sounds like a reasonable tradeoff.

@seemethere @bigfootjon Just highlighting this case as another example of why often we need to run tests across different setups. This is something that was debated at #5479, so I thought to provide a bit more evidence. This of course doesn't mean that we shouldn't reduce the costs. I think the solution is to restructure the CI jobs to separate tests that can break on different platforms from those that don't. This way we can reduce significantly the costs by running across all configs only those that must be run everywhere. We are in a bit of a crunch resource-wise but we hope we could pick this up within Q2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants