Timeout due to excessive shrinking in Lin Bytes test with Domain on Cygwin #520

Open
jmid opened this issue Jan 16, 2025 · 4 comments · Fixed by #521
Labels
test suite reliability Issue concerns tests that should behave more predictably

Comments

@jmid
Collaborator

jmid commented Jan 16, 2025

On the merge of #517 to main we saw a timeout on Cygwin trunk due to excessive shrinking in the Lin Bytes test with Domain:
https://github.com/ocaml-multicore/multicoretests/actions/runs/12796162913/job/35675203217

random seed: 306961601
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin Bytes test with Domain
[ ]    0    0    0    0 / 1000     0.0s Lin Bytes test with Domain (generating)
[ ]   85    0    0   85 / 1000    62.4s Lin Bytes test with Domain (shrinking:    3.0014)
[ ]   85    0    0   85 / 1000   125.7s Lin Bytes test with Domain (shrinking:    5.0022)
[ ]   85    0    0   85 / 1000   186.3s Lin Bytes test with Domain (shrinking:    6.0025)
[ ]   85    0    0   85 / 1000   250.2s Lin Bytes test with Domain (shrinking:    7.0002)
[ ]   85    0    0   85 / 1000   312.7s Lin Bytes test with Domain (shrinking:    9.0003)
[ ]   85    0    0   85 / 1000   374.2s Lin Bytes test with Domain (shrinking:    9.0030)
[ ]   85    0    0   85 / 1000   434.2s Lin Bytes test with Domain (shrinking:   12.0004)
[ ]   85    0    0   85 / 1000   494.3s Lin Bytes test with Domain (shrinking:   19.0051)
[ ]   85    0    0   85 / 1000   554.5s Lin Bytes test with Domain (shrinking:   31.0010)
[ ]   85    0    0   85 / 1000   614.6s Lin Bytes test with Domain (shrinking:   41.0003)
[ ]   85    0    0   85 / 1000   678.1s Lin Bytes test with Domain (shrinking:   44)
[ ]   85    0    0   85 / 1000   780.8s Lin Bytes test with Domain (shrinking:   46)
[ ]   85    0    0   85 / 1000   881.0s Lin Bytes test with Domain (shrinking:   47)
[ ]   85    0    0   85 / 1000   969.1s Lin Bytes test with Domain (shrinking:   48)
[ ]   85    0    0   85 / 1000  1045.8s Lin Bytes test with Domain (shrinking:   49)
[ ]   85    0    0   85 / 1000  1114.6s Lin Bytes test with Domain (shrinking:   53)
[ ]   85    0    0   85 / 1000  1174.7s Lin Bytes test with Domain (shrinking:   56.0014)
[ ]   85    0    0   85 / 1000  1244.6s Lin Bytes test with Domain (shrinking:   60)
[ ]   85    0    0   85 / 1000  1320.1s Lin Bytes test with Domain (shrinking:   64)
[ ]   85    0    0   85 / 1000  1380.2s Lin Bytes test with Domain (shrinking:   69.0004)
[ ]   85    0    0   85 / 1000  1440.2s Lin Bytes test with Domain (shrinking:   73.0066)
[ ]   85    0    0   85 / 1000  1500.3s Lin Bytes test with Domain (shrinking:   84.0007)
[ ]   85    0    0   85 / 1000  1560.4s Lin Bytes test with Domain (shrinking:  112.0005)
[ ]   85    0    0   85 / 1000  1620.6s Lin Bytes test with Domain (shrinking:  134.0023)
[ ]   85    0    0   85 / 1000  1680.6s Lin Bytes test with Domain (shrinking:  151.0016)
[ ]   85    0    0   85 / 1000  4331.4s Lin Bytes test with Domain (shrinking:  165.0010)
[ ]   85    0    0   85 / 1000  4396.8s Lin Bytes test with Domain (shrinking:  165.0012)
[ ]   85    0    0   85 / 1000  4456.9s Lin Bytes test with Domain (shrinking:  177.0024)
[ ]   85    0    0   85 / 1000  4517.2s Lin Bytes test with Domain (shrinking:  190.0008)
[✓]   86    0    1   85 / 1000  4552.0s Lin Bytes test with Domain

[ ]    0    0    0    0 /  250     0.0s Lin Bytes test with Thread
Error: The operation was canceled.

That's about 74min, or 1h 14min ((4552.0 -. 62.4) /. 60.), spent shrinking a counterexample.
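
The estimate above can be reproduced with a few lines of OCaml. This is only a sketch: the two timestamps are copied from the log, and the names first_shrink_report, test_finished, and shrinking_minutes are made up for illustration. Since the first progress line already reports shrinking step 3, the result is a lower bound on the time spent shrinking.

```ocaml
(* Timestamps (in seconds) copied from the log above. *)
let first_shrink_report = 62.4   (* first line that mentions shrinking *)
let test_finished = 4552.0       (* final line of the Domain test *)

(* Elapsed shrinking time in minutes: a lower bound, since shrinking
   had already started before the first progress report. *)
let shrinking_minutes = (test_finished -. first_shrink_report) /. 60.

let () =
  let whole_minutes = int_of_float shrinking_minutes in
  Printf.printf "%.1f min (%dh %02dmin)\n"
    shrinking_minutes (whole_minutes / 60) (whole_minutes mod 60)
(* prints "74.8 min (1h 14min)" *)
```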

It is again on Cygwin - and very similar to #514, which may indicate a Cygwin issue rather than a Lin shrinking issue...

@jmid
Collaborator Author

jmid commented Jan 16, 2025

There may be an issue with pthreads in Cygwin.

On the merge of #519 to main, the Cygwin 5.3 workflow spent an excessive amount of time in Lin Bytes test with Thread:
https://github.com/ocaml-multicore/multicoretests/actions/runs/12808801661/job/35712309410

random seed: 366364260
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 2500     0.0s Lin Bytes test with Domain
[ ]    0    0    0    0 / 2500     0.0s Lin Bytes test with Domain (generating)
[ ]   37    0    0   37 / 2500    60.2s Lin Bytes test with Domain (shrinking:   15.0003)
[ ]   37    0    0   37 / 2500   122.0s Lin Bytes test with Domain (shrinking:   18.0006)
[ ]   37    0    0   37 / 2500   182.5s Lin Bytes test with Domain (shrinking:   20.0008)
[ ]   37    0    0   37 / 2500   245.1s Lin Bytes test with Domain (shrinking:   23.0005)
[ ]   37    0    0   37 / 2500   308.1s Lin Bytes test with Domain (shrinking:   25.0002)
[ ]   37    0    0   37 / 2500   371.9s Lin Bytes test with Domain (shrinking:   28.0010)
[ ]   37    0    0   37 / 2500   433.4s Lin Bytes test with Domain (shrinking:   30.0013)
[ ]   37    0    0   37 / 2500   493.5s Lin Bytes test with Domain (shrinking:   32.0012)
[ ]   37    0    0   37 / 2500   556.0s Lin Bytes test with Domain (shrinking:   34.0002)
[ ]   37    0    0   37 / 2500   617.4s Lin Bytes test with Domain (shrinking:   35.0012)
[ ]   37    0    0   37 / 2500   677.9s Lin Bytes test with Domain (shrinking:   36.0015)
[ ]   37    0    0   37 / 2500   741.5s Lin Bytes test with Domain (shrinking:   37.0013)
[ ]   37    0    0   37 / 2500   806.4s Lin Bytes test with Domain (shrinking:   39.0004)
[ ]   37    0    0   37 / 2500   866.8s Lin Bytes test with Domain (shrinking:   40)
[ ]   37    0    0   37 / 2500   927.3s Lin Bytes test with Domain (shrinking:   41.0003)
[✓]   38    0    1   37 / 2500   984.5s Lin Bytes test with Domain

[ ]    0    0    0    0 /  250     0.0s Lin Bytes test with Thread
[ ]    3    0    0    3 /  250  1481.8s Lin Bytes test with Thread
[ ]    4    0    0    4 /  250  8943.0s Lin Bytes test with Thread (collecting)
Error: The operation was canceled.

That's 149min ~ 2h 29min to test 4 random inputs on the Thread variant? 🤔
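
For scale, the same progress line also gives an average cost per input. A rough sketch, assuming the time column restarts at 0.0s for each test (as the log suggests) and following the comment above in counting 4 inputs, even though the fourth was still in progress; elapsed_s and inputs are illustrative names:

```ocaml
let elapsed_s = 8943.0   (* time column of the last Thread progress line *)
let inputs = 4           (* "generated" column of the same line *)

let total_minutes = elapsed_s /. 60.
let minutes_per_input = total_minutes /. float_of_int inputs

let () =
  Printf.printf "total: %.0f min, per input: %.1f min\n"
    total_minutes minutes_per_input
(* prints "total: 149 min, per input: 37.3 min" *)
```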

@jmid
Collaborator Author

jmid commented Jan 17, 2025

This happened again in #521 on Cygwin trunk:
https://github.com/ocaml-multicore/multicoretests/actions/runs/12810674241/job/35718270726?pr=521

random seed: 389881736
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 2500     0.0s Lin Bytes test with Domain
[ ]    0    0    0    0 / 2500     0.0s Lin Bytes test with Domain (generating)
[ ]  151    0    0  151 / 2500    60.0s Lin Bytes test with Domain (shrinking:   36.0011)
[ ]  151    0    0  151 / 2500   120.3s Lin Bytes test with Domain (shrinking:   40.0010)
[ ]  151    0    0  151 / 2500   184.0s Lin Bytes test with Domain (shrinking:   42.0015)

[...]

[ ]  151    0    0  151 / 2500 10444.5s Lin Bytes test with Domain (shrinking:  158.0007)
[ ]  151    0    0  151 / 2500 10540.1s Lin Bytes test with Domain (shrinking:  158.0008)
[ ]  151    0    0  151 / 2500 10633.5s Lin Bytes test with Domain (shrinking:  158.0009)
Error: The operation was canceled.

Timing out after 10633.5s!

@jmid
Collaborator Author

jmid commented Jan 17, 2025

I opened #526 to track occurrences of Lin Bytes test with Thread timeouts separately.

jmid closed this as completed in #521 on Jan 19, 2025
jmid reopened this on Jan 22, 2025
@jmid
Collaborator Author

jmid commented Jan 22, 2025

On 9621d6d pushed to main in connection with the 0.7 release I observed this again on Cygwin trunk:
https://github.com/ocaml-multicore/multicoretests/actions/runs/12891457981/job/35943437614

random seed: 237443094
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 5000     0.0s Lin Bytes test with Domain
[ ]    0    0    0    0 / 5000     0.0s Lin Bytes test with Domain (generating)
[ ]   35    0    0   35 / 5000    60.6s Lin Bytes test with Domain (shrinking:   22.0013)
[ ]   35    0    0   35 / 5000   122.6s Lin Bytes test with Domain (shrinking:   26.0005)
[ ]   35    0    0   35 / 5000   183.5s Lin Bytes test with Domain (shrinking:   27.0010)

[...]

[ ]   35    0    0   35 / 5000  6985.6s Lin Bytes test with Domain (shrinking:   57.0054)
[ ]   35    0    0   35 / 5000  7071.8s Lin Bytes test with Domain (shrinking:   57.0060)
[ ]   35    0    0   35 / 5000  7143.7s Lin Bytes test with Domain (shrinking:   58.0004)
[ ]   35    0    0   35 / 5000  7216.1s Lin Bytes test with Domain (shrinking:   58.0006)
[ ]   35    0    0   35 / 5000  7288.9s Lin Bytes test with Domain (shrinking:   58.0014)
Error: The operation was canceled.
