Erratic test failure in master #7249
I think this one might just be slow; it passes in airplane mode on my MacBook Pro but times out with or without a network connection on my N2920-based netbook. [EDIT] After a cursory glance at the spec, it also doesn't look like it's requesting anything. |
I just had a build fail on Travis, I believe for a similar reason as above:
Restarting the build fixed it. |
Master just failed again https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/474883290?utm_source=github_status&utm_medium=notification We need to nip this in the bud. @OmarShehata since you've been doing some test clean-up/optimization, do you think you can take a look at this? We worked hard to get CI consistently passing and now all of a sudden it's failing every dozen builds. |
@mramato yeah I'll take a look at this today or tomorrow. |
I wonder/hope there's some underlying issue here. It seems a lot less stable than it was a week ago. This just failed on Travis:
But it wasn't a timeout (expected 0 to 1). |
@OmarShehata any update here? |
Sorry, this got pushed down my to do list with all the recent shuffles. I'm running some Travis builds now to see how consistent the failing tests are. If there are any recent failed builds like that whose logs still exist please post them here. |
I'm able to replicate a subset of these failures by running them on a slower machine (and they pass fine when re-running the tests). I don't see an easy way around this other than refactoring each one of those tests to make it faster. Although it's still unclear why:
Cranking the default timeout up to 20 seconds is a band-aid that will keep these timing spikes from breaking the build until we refactor them. I can open a PR here and re-run the build a bunch of times to test that new timeout. Running it locally I haven't seen anything above ~10 seconds. |
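For reference, the band-aid amounts to raising Jasmine's per-spec async timeout. A minimal sketch, assuming the suite runs on Jasmine; the 20-second figure is just the number discussed above, not a recommendation:

```js
// Band-aid sketch: raise Jasmine's async timeout for slow machines.
// jasmine.DEFAULT_TIMEOUT_INTERVAL is Jasmine's standard knob; 20000 ms
// mirrors the value discussed in this thread.
var originalTimeout;

beforeAll(function () {
  originalTimeout = jasmine.DEFAULT_TIMEOUT_INTERVAL;
  jasmine.DEFAULT_TIMEOUT_INTERVAL = 20000; // 20 seconds
});

afterAll(function () {
  jasmine.DEFAULT_TIMEOUT_INTERVAL = originalTimeout; // restore the default
});
```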
Well, for VideoSynchronizer, network latency may be at play (those tests should probably use a local, in-repository video instead of an externally hosted one). Please write up an issue. That doesn't explain the other tests if they are not using external resources. Do the other tests ever vary wildly when run in isolation, or only when all run together? |
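For the VideoSynchronizer point, a hedged sketch of what "use a local, in-repository video" might look like in a spec; the `Data/Videos/test.webm` path is hypothetical and the clock wiring is illustrative only:

```js
// Sketch only: point the test video at a file served from the repository
// instead of an externally hosted URL. The path below is hypothetical.
var element = document.createElement('video');
element.src = 'Data/Videos/test.webm'; // assumed to be served by the Karma file server
element.muted = true;

var clock = new Cesium.Clock();
var synchronizer = new Cesium.VideoSynchronizer({
  clock: clock,
  element: element,
});

// ...exercise the synchronizer as the existing specs do, then clean up:
synchronizer.destroy();
```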
I'm running more tests to answer that, but just documenting a peculiar finding: GroundPrimitive's "does not render when show is false" spec just failed at the first expect:
The value was 2 instead of 1. This test does not seem to be relying on any external resources, and it was not a timeout, so just increasing the timeout wouldn't fix this intermittent failure. It failed when running all the tests together; so far it hasn't failed when run in isolation. |
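For anyone trying to reproduce that isolation difference, a generic Jasmine sketch (not project-specific tooling) for focusing just the suspect spec:

```js
// Focus only the suspect suite/spec; Jasmine skips everything else.
// Remove the "f" prefixes before committing.
fdescribe('GroundPrimitive', function () {
  fit('does not render when show is false', function () {
    // ...original spec body unchanged...
  });
});
```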
The most common failure seems to be this set of tests #7249 (comment), and what's common there is that they all use My best guess now is that something goes wrong with the
If |
It took 2 re-builds for this to pass on Travis: #7476. The first time, this failed: #7249 (comment), and the second time, this did: #7249 (comment). Both of those are non-timeout tests. The only thing in common among all these failures is that they rely on something happening in Each Spec file creates a scene, which creates a new canvas, but it does get cleaned up, and it should be running even on failure. |
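For context, the per-file lifecycle described above typically looks like the sketch below; `createScene` and `destroyForSpecs` are assumed to be the repo's spec helpers, and `afterAll` runs regardless of spec failures:

```js
// Sketch of the scene setup/teardown pattern described above.
var scene;

beforeAll(function () {
  scene = createScene(); // creates a fresh canvas-backed Scene for this spec file
});

afterAll(function () {
  scene.destroyForSpecs(); // releases the scene/canvas even if specs failed
});
```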
We were already using FirefoxHeadless for built tests before this happened, so I doubt that's it. |
@OmarShehata I think you are on to something and it would make sense that the delay is really the test waiting for something that never happens. Any idea why this is happening? CC @lilleyse |
I've been seeing this one fail a lot:
Which is very interesting because the two camera events tests, right before and right after it, are very stable. @lilleyse suggested refactoring this test (for example, it calls render twice, perhaps because the event doesn't always get triggered with a single render, which seems quite unreliable). That's something I'll try next; hopefully it will make a good chunk of these failures disappear. |
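One possible shape for that refactor, sketched only (it assumes a `pollToPromise`-style spec helper, `scene.renderForSpecs`, and the camera's `moveEnd` event, and is not the actual fix): render until the event has actually fired instead of assuming a fixed number of render calls is enough.

```js
// Sketch: keep rendering until the listener fires instead of calling render
// exactly twice. pollToPromise and renderForSpecs are assumed spec helpers.
it('raises the camera moveEnd event', function () {
  var moveEndFired = false;
  scene.camera.moveEnd.addEventListener(function () {
    moveEndFired = true;
  });

  scene.camera.zoomIn(10.0); // any camera change that should raise the event

  return pollToPromise(function () {
    scene.renderForSpecs();
    return moveEndFired;
  }).then(function () {
    expect(moveEndFired).toBe(true);
  });
});
```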
Pretty sure #7482 fixed most of the stability issues except for the cameraMoveEvent one that @OmarShehata mentions above. @OmarShehata, have you seen crashes since #7482 was merged? (I forced a few rebuilds of master and didn't.) I have seen the cameraMoveEvent one though, so it might be good to fix or ignore that one ASAP. (If we ignore it, write up a new clean issue so we don't forget about it.) |
Just saw a failure today with this same sequence: #7249 (comment) |
This seems to be getting worse. I'm seeing Travis fail with random tests more often than I see it pass. |
I think we need to experiment with bumping the test timeout to a large value (like 30 seconds) when running on Travis and see if that helps improve things. Travis's infrastructure seems to have gotten worse and their servers are getting slower and slower. |
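A hedged sketch of that experiment, assuming a karma.conf.js, Travis's `TRAVIS` environment variable, and a karma-jasmine version that supports `client.jasmine.timeoutInterval`; the values and the conditional are placeholders, not a proposal for exact numbers:

```js
// karma.conf.js (sketch): loosen timeouts only on CI, where machines are slower.
const isCI = Boolean(process.env.TRAVIS);

module.exports = function (config) {
  config.set({
    frameworks: ['jasmine'],
    // Give slow CI browsers more time before Karma gives up on them.
    browserNoActivityTimeout: isCI ? 60000 : 30000,
    captureTimeout: isCI ? 120000 : 60000,
    client: {
      jasmine: {
        // Per-spec async timeout: 30 s on Travis, Jasmine's 5 s default locally.
        timeoutInterval: isCI ? 30000 : 5000,
      },
    },
  });
};
```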
There's a new erratic failure discovered in #7713
The exact failures are:
|
I'm starting to get the failures above locally (but only when running all the tests), which is good news; I might be able to look into this and get a fix now. |
I've been seeing a new error lately: Once in #8242 and again in #8205. Restarting the build usually makes CI pass again. Build details: https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/592619357?utm_source=github_status&utm_medium=notification |
After some CI fixes and the ES6 migration, the only consistent CI failure I'm seeing is with Vector3DTileContent. These tests look overly complicated and have a lot of setup, so I'm leaning towards the specs being the problem: https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/602749111 |
Ignored Vector3DTileContent specs for now in #8317 until we have time to look at them. |
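For reference, ignoring a suite in Jasmine is just the `x` prefix; a sketch only, since what #8317 actually changed may differ in detail:

```js
// Prefixing describe with "x" marks the whole suite as pending so it is
// reported but not run.
xdescribe('Scene/Vector3DTileContent', function () {
  // ...existing specs, temporarily skipped...
});
```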
As of fd66f69, there's a bunch of test failures in ; the CI seems to have passed, though. Moreover, it seems to me that each spec is trying to do too much. We're testing for |
Do you see the same when running with
This sounds like the best approach to me. Specs should only test one thing, and re-writing the specs seems to be the consensus in this thread anyway. |
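A generic sketch of that direction (the names and expectations are illustrative, not taken from the real specs): replace one spec that asserts several unrelated things with several small specs that each assert one.

```js
// Before (sketch): one spec covering several behaviors, so any of several
// unrelated regressions fails under the same name.
it('loads, renders, and picks the tileset', function () { /* ... */ });

// After (sketch): one behavior per spec, so a failure points at one thing.
describe('Scene/Vector3DTileContent', function () {
  it('resolves readyPromise after loading', function () { /* ... */ });

  it('renders the expected features', function () { /* ... */ });

  it('picks a feature at the expected position', function () { /* ... */ });
});
```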
The last master build failed with the below error. If I had to guess, this test is using an external server. We should probably bring whatever data it's using in-house to make sure CI doesn't randomly fail.
CC @likangning93 @OmarShehata
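Until the data is brought in-house, one generic way to keep a spec off the network, sketched under the assumption that the spec loads its data through Cesium's `Resource.fetchJson` (which may not be what this particular test does), is to stub the request in the spec itself:

```js
// Sketch: stub the network call so the spec never reaches the external server.
// Resource.fetchJson is an assumption about how the data is loaded, and the
// resolved payload is a placeholder, not the real response.
beforeEach(function () {
  spyOn(Cesium.Resource, 'fetchJson').and.returnValue(
    Promise.resolve({ placeholder: true })
  );
});
```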