Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing test flakiness #2242

Merged
merged 2 commits into from
Apr 13, 2021
Merged

Fixing test flakiness #2242

merged 2 commits into from
Apr 13, 2021

Conversation

sleepy-owl
Copy link
Contributor

The test test_update_envs_env_update is flaky. It sometimes fails more than twice. The assertion on line 78 assert np.var(mean_rewards) > 0 fails. This PR addresses this issue.

To find a solution, I collected samples of np.var(mean_rewards) from several test executions and computed the tail distribution just to check how often can the value be 0.

Based on the collected samples, it seems there is ~12% chance that the test will fail. I suggest to run this test 3 times, then the probability of failure will be <1%.

I think refactoring this test using the statistical evaluation might be a good way to reduce the flakiness of this test.

Do you guys think this makes sense? Please let me know if this looks good or if you have any other suggestions. Also, here I assume there are no bugs in the code under test.

Also, I am curious to know why do you check if var > 0? Statistically, zero variance can happen sometimes. Is it better to have something like assert np.mean(mean_rewards) > 0?

@sleepy-owl sleepy-owl requested a review from a team as a code owner February 24, 2021 03:57
@sleepy-owl sleepy-owl requested review from ryanjulian and removed request for a team February 24, 2021 03:57
@mergify mergify bot requested review from a team, yeukfu and ziyiwu9494 and removed request for a team February 24, 2021 03:58
@ryanjulian
Copy link
Member

/ok-to-test

@rlworkgroupbot
Copy link
Collaborator

Command run output for 35fd206

1 similar comment
@rlworkgroupbot
Copy link
Collaborator

Command run output for 35fd206

@sleepy-owl
Copy link
Contributor Author

hi, it seems the checks failed due to some config issue?

@ryanjulian
Copy link
Member

@ziyiwu9494 can you PTAL?

@krzentner
Copy link
Contributor

/ok-to-test

@rlworkgroupbot
Copy link
Collaborator

Command run output for 35fd206

@sleepy-owl
Copy link
Contributor Author

Hi all, thanks for approving the PR. Is there something blocking for merge?

@mergify mergify bot requested a review from a team March 15, 2021 07:23
@mergify mergify bot requested a review from a team March 15, 2021 07:24
@avnishn
Copy link
Member

avnishn commented Mar 18, 2021

/ok-to-test

@rlworkgroupbot
Copy link
Collaborator

Command run output for e0f1501

@mergify mergify bot requested a review from a team March 19, 2021 19:04
@krzentner krzentner merged commit 90b6090 into rlworkgroup:master Apr 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants