Ember Sweep has occasional Time Limit failures #274

Closed
jpvandy opened this issue May 2, 2016 · 21 comments

Comments

@jpvandy
Contributor

jpvandy commented May 2, 2016

This issue is about Time Limit failures.
Issue #147 focuses on a low-probability failure that results from an ASSERT about a buffer length.
Issue #108 is about a file naming issue that affects the Scheduler Detailed Network test.

EmberSweep tests also very occasionally hit the time limit (900 sec.) on a test expected to run for a few seconds. This morning's Nightly had two such events: test #112 hit the time limit on Yosemite, and test #57 hit the time limit on COE-RedHat-7. EmberSweep contains 180 tests. Time Limit failures provide no supplemental information on what went wrong.

@jpvandy
Contributor Author

jpvandy commented May 9, 2016

Adding a time limit on test #45 on Yosemite in the May 8th AM nightly (memH_no_openMP_Yosemite).

@jpvandy
Contributor Author

jpvandy commented May 10, 2016

Adding a time limit on test #15 on RHEL-6, multi-thread=2, in the May 10th AM nightly.

@jpvandy
Contributor Author

jpvandy commented May 16, 2016

A time limit also occurred on test #45 on May 15th on the CentOS 6.6 VM.

@jpvandy
Contributor Author

jpvandy commented Jun 10, 2016

June 10th nightly: ES went time limit on test #82 (900 seconds) on Ubuntu 14.04 VM

@jpvandy
Contributor Author

jpvandy commented Jun 20, 2016

June 19th nightly: ES went time limit on test 104 (900 seconds) on RHEL-6 multi-thread=2

@jpvandy
Contributor Author

jpvandy commented Jun 24, 2016

June 23rd: ES went time limit on test 104 on the Ubuntu Ariel tester, which runs ES at multi-thread=2.

jpvandy changed the title from "Ember Sweep has a second failure mode for occasional failures" to "Ember Sweep has occasional Time Limit failures" on Jun 24, 2016
@jpvandy
Contributor Author

jpvandy commented Jun 27, 2016

June 26th, ES had a time limit on test 59. This was on sst-test mainline, hence NOT multi

@jpvandy
Contributor Author

jpvandy commented Jun 29, 2016

June 29th ES had a time limit fault on test #29. This was on sst-test, multithread = 4

@jpvandy
Contributor Author

jpvandy commented Jun 30, 2016

June 30th: ES had a time limit fault on test #69. This was on Yosemite without multi.

@jpvandy
Contributor Author

jpvandy commented Jul 13, 2016

July 13, ES had a time limit fault on test #6. This was on Yosemite without multi on the test from Master.

@jpvandy
Contributor Author

jpvandy commented Jul 18, 2016

On the 6.0.0_Pre branch on July 14th, ES had time limit faults on tests 32 and 68 on sst-test, multiThread = 4
On the regular nightly, ES had a time limit fault on test 28 on sst-test, multiRank=4.

@jpvandy
Contributor Author

jpvandy commented Jul 18, 2016

To emphasize the "occasional" nature of this failure:
On July 15th on sst-test multiThread=4, the nightly had time limits on three EmberSweep tests (#1, #87 and #118), and on the 6.0.0_Pre branch there were two (tests 88 and 100).
There were no such failures on the 16th and the 17th.
On the nightly on the 18th there were two such time limits, on tests 76 and 140. This is all on sst-test and multiThread=4.

allevin modified the milestones: SST v6.1.0, SST v6.0.0 (Jul 18, 2016)
@jpvandy
Contributor Author

jpvandy commented Aug 1, 2016

There was a gap in recording these. July 31, multiThread: test #98 hit the time limit.

@jpvandy
Contributor Author

jpvandy commented Aug 2, 2016

August 2nd: Not multi thread! Test #163 on Master Yosemite and test #42 on mainline Yosemite.

@jpvandy
Contributor Author

jpvandy commented Aug 8, 2016

August 8th Multi-thread=2 Test #109 on sst-test Ariel tester

@jpvandy
Contributor Author

jpvandy commented Aug 13, 2016

August 13th, Multi-thread=2, test #164 on Ubuntu Ariel tester.
So much for the hope that the recent ember changes to make Valgrind happy would cure this occasional problem.

@jpvandy
Contributor Author

jpvandy commented Nov 14, 2016

I stopped logging ES time limits at some point.
There have been several recently:
November 5, Multi-thread=4, tests 88 and 121
November 12, MT = 4, test 47
November 14, MT=4, test 30.
November 14, COE-RHEL-7, test 6

@nmhamster
Contributor

@jpvandy - can we get an update on this bug please for v6.1 release?

@jpvandy
Contributor Author

jpvandy commented Jan 9, 2017

I haven't noticed any in a while; however,
a time limit occurred January 9th on sst-test, multithread=4, EmberSweep test 78.
@nmhamster

@jpvandy
Contributor Author

jpvandy commented Jan 9, 2017

January 9th: also a time limit on 6.1_beta, El Capitan, Xcode 8, EmberSweep test 18.

NOTE: this is not Multi-Thread or Multi-Rank!
@nmhamster

@jpvandy
Contributor Author

jpvandy commented May 12, 2017

I observe that the changes that went into Release 7.0 have significantly reduced the probability of an Ember Sweep time limit. The nightly report has caught two this month. I am going to close this issue so we can ignore this long history, but I will open a new issue to keep track of the much less frequent failures. EmberSweep consists of 180 tests. The suite is run more than a dozen times in the nightly. Two events seen in half a month is a relatively low probability.
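
For a rough sense of scale, the figures in the comment above can be turned into a per-execution rate. This is only a back-of-the-envelope sketch: the 180 tests and the two events come from the comment, while treating "more than a dozen" as exactly 12 suite runs per night and "half a month" as 15 nights are assumptions, and `timeout_rate` is a hypothetical helper name.

```python
# Rough per-execution time limit rate from the figures quoted in the comment.
TESTS_PER_SUITE = 180       # from the comment: EmberSweep consists of 180 tests
SUITE_RUNS_PER_NIGHT = 12   # assumption: "more than a dozen" taken as exactly 12
NIGHTS = 15                 # assumption: "half a month" taken as 15 nights
TIMEOUTS_SEEN = 2           # from the comment: two events seen this month


def timeout_rate(timeouts, tests, runs_per_night, nights):
    """Return (total test executions, observed time limits per execution)."""
    executions = tests * runs_per_night * nights
    return executions, timeouts / executions


executions, rate = timeout_rate(
    TIMEOUTS_SEEN, TESTS_PER_SUITE, SUITE_RUNS_PER_NIGHT, NIGHTS
)
print(f"{executions} test executions, about {rate:.1e} time limits per execution")
```

Under these assumptions that is 32,400 test executions, or roughly one time limit per 16,200 executions. Since the suite actually runs more than twelve times a night, the true per-execution rate would be somewhat lower than this estimate.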
