Better visibility into test failures over time #11217

Closed
andrross opened this issue Nov 15, 2023 · 16 comments
Labels
discuss: Issues intended to help drive brainstorming and decision making
enhancement: Enhancement or improvement to existing feature or request
:test: Adding or fixing a test

Comments

@andrross
Member

andrross commented Nov 15, 2023

I've created a script that crawls the OpenSearch Jenkins builds to find test failures, but only for the Gradle checks that run on code after it is pushed to the main branch. This filters out failures that are due to unmerged code in work-in-progress PRs.

I've included below the output after crawling 2000 recent builds (approx. Oct 16 - Nov 14). This data is very hard to follow, but one thing in particular stands out: SearchQueryIT.testCommonTermsQuery is a frequently failing test, but only since build 29184 (Oct 28). There are no failures before that, which strongly suggests something changed around Oct 28 that introduced the flakiness. I haven't started to look, but I suspect we'll be able to find the cause pretty quickly given that there is a point in time to start looking at.

Update Nov 16: the root cause was an unrelated change for concurrent search that randomly increased the number of deleted documents and exposed some underlying brittleness in this test: #11233. Diagnosing the root cause was a bit tricky and required diving into the specifics of how the common terms query works, but it was indeed much simpler once the flakiness was correlated to a small date range and then a specific commit.

Surely there are better tools for visualizing test reports over time, perhaps already built into Jenkins? Also, we don't push that many commits so the sample size on builds after pushes to main isn't that large. Something like a nightly job to run the test suite 10 or 50 or 100 times and create a report on failures would help to quickly surface newly introduced flakiness.
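To put rough numbers on the "10 or 50 or 100 times" idea, here is a minimal sketch of how likely a repeated run is to surface a flaky test at least once. It assumes each run fails independently with a fixed probability, which real flaky tests may not satisfy; the 2% failure rate is an illustrative value, not a measurement from the builds above.

```python
import math


def detection_probability(failure_rate: float, runs: int) -> float:
    """Chance that a test with the given per-run failure rate fails at least once in `runs` runs."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return 1.0 - (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of runs that surfaces the flake with at least `confidence` probability."""
    if not 0.0 < failure_rate < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("failure_rate and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Illustrative only: a hypothetical test that fails 2% of the time.
for n in (10, 50, 100):
    print(n, round(detection_probability(0.02, n), 3))
```

Under these assumptions, rarely failing tests need many more repetitions than frequently failing ones, which is one argument for running such a job nightly rather than relying only on pushes to main.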

$ ruby ~/flaky-test-finder-push-trigger-main.rb -s 27990 -e 29990

24 org.opensearch.indices.replication.SegmentReplicationIT.testSendCorruptBytesToReplica (28239,28239,28239,28239,28645,28645,28645,28645,28702,28702,28702,28702,28875,28875,28875,28875,28894,28894,28894,28894,28897,28897,28897,28897)
17 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering} (28276,28276,28276,28276,28278,28278,28278,28278,28765,28962,28962,28962,28962,28989,28989,28989,28989)
16 org.opensearch.repositories.s3.S3BlobStoreRepositoryTests.testRequestStats (28259,28259,28259,28259,28276,28276,28276,28276,28316,28316,28316,28316,28368,28368,28368,28368)
12 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"true"}} (28051,28184,28251,28481,28502,28576,28727,28765,28766,28797,28841,28894)
9 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock (28051,28576,28702,28713,28875,28897,29428,29666,29846)
9 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Test cat nodes output} (28276,28276,28276,28276,28278,28278,28278,28278,28765)
9 org.opensearch.index.shard.RemoteIndexShardTests.classMethod (28716,28716,28897,28897,28966,28966,29666,29666,29666)
8 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"false"}} (28051,28481,28576,28765,28766,28797,28841,28894)
7 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Additional disk information} (28276,28276,28276,28276,28278,28278,28765)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"true"}} (29184,29324,29343,29378,29506,29846,29954)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"false"}} (29184,29324,29343,29378,29506,29846,29954)
6 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.classMethod (28797,28797,28797,28841,28841,28841)
6 org.opensearch.cluster.service.MasterServiceTests.testClusterStateBatchedUpdates (28899,28905,28966,28989,28994,29003)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all} (28765,28989,28989,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats filtered using both includes and excludes filters} (28278,28278,28278,28278,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/30_discovery/Discovery stats} (28765,28962,28966,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.allocation/10_basic/Node ID} (28276,28276,28276,28276,28278)
4 org.opensearch.cluster.MinimumClusterManagerNodesIT.classMethod (28897,28897,28897,28897)
4 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation (28765,28766,29432,29508)
3 org.opensearch.index.shard.RemoteIndexShardTests.testSegRepSucceedsOnPreviousCopiedFiles (28716,28897,28966)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileFetchingMetadata (29070,29132,29274)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.classMethod (29070,29132,29378)
3 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - blank} (28278,28765,28962)
3 org.opensearch.remotestore.RemoteIndexRecoveryIT.testSnapshotRecovery (28481,29432,29655)
3 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled (28502,29561,29666)
3 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testFullRestartDuringReplication (28671,28716,29561)
3 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all} (28702,28875,29132)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testNestedDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testSimpleDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
2 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterRestoreGlobalMetadata (29595,29655)
2 org.opensearch.index.shard.RemoteIndexShardTests.testRepicaCleansUpOldCommitsWhenReceivingNew (28239,29293)
2 org.opensearch.indices.replication.SegmentReplicationSuiteIT.classMethod (28716,29561)
2 org.opensearch.search.nested.SimpleNestedIT.testSimpleNestedSortingWithNestedFilterMissing {p0={"search.concurrent_segment_search.enabled":"true"}} (28682,29508)
1 org.opensearch.search.profile.query.QueryProfilerTests.testBasic {p0=5} (29044)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadBlobWithRetries (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs (29132)
1 org.opensearch.index.reindex.ReindexBasicTests.testMultipleSources (29177)
1 org.opensearch.index.reindex.ReindexBasicTests.testFiltering (29177)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadNonexistentBlobThrowsNoSuchFileException (29184)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.testCreateShrinkIndex (29279)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.classMethod (29279)
1 org.opensearch.discovery.ClusterDisruptionIT.classMethod (29293)
1 org.opensearch.search.SearchWeightedRoutingIT.testSearchAggregationWithNetworkDisruption_FailOpenEnabled (29293)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadRangeBlobWithRetries (29324)
1 org.opensearch.monitor.fs.FsHealthServiceTests.testFailsHealthOnHungIOBeyondHealthyTimeout (29324)
1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileSyncingSegments (29378)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@521ba38f} (29417)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndexToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitFromOneToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitIndexPrimaryTerm (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod (29536)
1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled (29595)
1 org.opensearch.index.shard.RemoteIndexShardTests.testSegmentReplication_With_EngineClosedConcurrently (29666)
1 org.opensearch.index.shard.IndexShardTests.testCommitLevelRestoreShardFromRemoteStore (29729)
1 org.opensearch.index.translog.RemoteFsTranslogTests.testMetadataFileDeletion (28027)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@1d1c37d5} (29821)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testWriteLargeBlob (28051)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@c83ed77} (28521)
1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}} (28576)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimarySingleReplica (28671)
1 org.opensearch.remotestore.multipart.RemoteStoreMultipartIT.testRestoreSnapshotToIndexWithSameNameDifferentUUID (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.testRandomDirectoryIOExceptions {p0={"search.concurrent_segment_search.enabled":"true"}} (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.classMethod (28706)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testBasicReplication (28716)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testDeleteIndexWhileReplicating (28716)
1 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterStateRestore (28727)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indexing doc_status} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - recovery} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats level} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure memory limit} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - multi} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indices _all} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - one} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/40_store_stats/Store stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.fielddata/10_basic/Test cat fielddata output} (28765)
1 org.opensearch.test.rest.ClientYamlTestSuiteIT.test {p0=search.aggregation/20_terms/string profiler via global ordinals} (28765)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDeleteIndexWhileIndexing (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithWriteIndexAndRouting (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDocIdTooLong (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkIndexCreatesMapping (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithGlobalDefaults (28797)
1 org.opensearch.search.functionscore.DecayFunctionScoreIT.classMethod (28813)
1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue (28841)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback (28875)
1 org.opensearch.client.PitIT.testDeleteAllAndListAllPits (28899)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testContainerCreationAndDeletion (29044)
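The kind of correlation described above (a test that only starts failing from a certain build) can be pulled out of this output mechanically. The following is a sketch, not the crawler script itself: it assumes every line has the shape `<count> <test name> (<build>,<build>,...)` as printed above, and the names `FailureSummary`, `parse_line` and `first_seen_at_or_after` are made up for illustration. The two sample lines are copied from the output above.

```python
import re
from typing import List, NamedTuple, Optional


class FailureSummary(NamedTuple):
    test: str
    failures: int         # leading count on the line
    distinct_builds: int  # unique build numbers inside the parentheses
    first_build: int
    last_build: int


# Assumed line shape: "<count> <test name> (<build>,<build>,...)"
LINE_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s+\((\d+(?:,\d+)*)\)\s*$")


def parse_line(line: str) -> Optional[FailureSummary]:
    """Parse one report line; return None for lines that do not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    builds = [int(b) for b in match.group(3).split(",")]
    return FailureSummary(
        test=match.group(2),
        failures=int(match.group(1)),
        distinct_builds=len(set(builds)),
        first_build=min(builds),
        last_build=max(builds),
    )


def first_seen_at_or_after(lines: List[str], build: int) -> List[FailureSummary]:
    """Tests whose earliest recorded failure is at or after `build`."""
    parsed = (parse_line(line) for line in lines)
    return [s for s in parsed if s is not None and s.first_build >= build]


report = [
    '24 org.opensearch.indices.replication.SegmentReplicationIT.testSendCorruptBytesToReplica '
    '(28239,28239,28239,28239,28645,28645,28645,28645,28702,28702,28702,28702,'
    '28875,28875,28875,28875,28894,28894,28894,28894,28897,28897,28897,28897)',
    '7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery '
    '{p0={"search.concurrent_segment_search.enabled":"true"}} '
    '(29184,29324,29343,29378,29506,29846,29954)',
]

for summary in first_seen_at_or_after(report, 29000):
    print(summary.test, "first failed in build", summary.first_build)
```

Counting distinct builds separately from the raw count is a deliberate choice here: the same build number can appear several times on one line, so the leading count alone can overstate how many builds were affected.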
@andrross andrross added enhancement Enhancement or improvement to existing feature or request discuss Issues intended to help drive brainstorming and decision making untriaged labels Nov 15, 2023
@peternied peternied added :test Adding or fixing a test and removed untriaged labels Nov 30, 2023
@andrross
Member Author

The request here is very similar to this older issue: #3713

@peternied
Member

@andrross I created a repo [1] that collects project health information and publishes reports to its repo every day. See the latest reports at https://github.com/peternied/contribution-rate?tab=readme-ov-file#reports

One such report lists the top failing tests over the last 30 days; here is the March 8 report. It will keep updating every day. Feel free to contribute any kind of reports you'd like to see.

@prudhvigodithi
Member

prudhvigodithi commented May 10, 2024

Hey @andrross @peternied, we now have the gradle metrics published to the OpenSearch Gradle Check Metrics dashboard; this is part of surfacing opensearch-metrics to the community. Please check the currently supported metrics. How about we expand this and add more metrics as required, and also use this data to create triggers such as GitHub issues, comments, etc.?
@bbarani @Pallavi-AWS @dblock

@andrross
Member Author

@prudhvigodithi Do you think it would make sense to add some details about using the gradle check metrics dashboard to help investigate and fix flaky failures in either TESTING.md or DEVELOPER_GUIDE.md?

@dreamer-89 created a great list in #3713:

  1. Identify top hitter for prioritization.
  2. Identify commit introduced a flaky test or increase freq of existing test failure.
  3. Build failure trend to identify health of software.
  4. Developers impacted due to flaky tests.
  5. Test history.

I think if we document somewhere how to use the new dashboard to solve those problems then we can close both of these issues as completed.

@prudhvigodithi
Member

prudhvigodithi commented May 30, 2024

Thanks @andrross and @dreamer-89. Based on your list, I have modified the gradle check workflow with new fields and created some new visualizations based on the indexed data (thanks to @rishabh6788 for setting up the initial flow). Please check the link: OpenSearch Gradle Check Metrics.

Identify top hitter for prioritization

For this I have created a pie chart of the test_class values with the most failures; the chart should also show the top failing tests within each test_class. We can further slice and dice the data to get the list of PRs and owners, or filter to post merge runs, for the top failing tests.

Screenshot 2024-05-30 at 1 32 04 PM

Identify commit introduced a flaky test or increase freq of existing test failure

The following data tables should have the git commit, the associated PR and the PR owner with all the failing test details; we can filter per PR or commit to get the details of failed tests. The new visualization Gradle Check - Top test class failures with Post Merge also has the flaky test information, its associated commitID, and the PR (with owner) that was merged with this commitID by the post merge action (the gradle check that ran after the PR was merged). We should be able to drill down further by test name or test class name for more details.

Screenshot 2024-05-30 at 1 09 56 PM
Screenshot 2024-05-30 at 1 10 09 PM

Build failure trend to identify health of software

For this the dashboard has a TSVB and a line chart showing the trend of test failures; these can be further filtered by test name, test class, commitID, PR, and Post Merge executions.

Screenshot 2024-05-30 at 1 14 43 PM

Screenshot 2024-05-30 at 1 14 58 PM

Developers impacted due to flaky tests

All of the visualizations can be filtered by PR owner, PR number or commitID. The results have hyperlinks to the GitHub PR or commit, where one can see the comments and other users. The dashboards also have the PR owner attached to show the impacted user. The visualizations also have hyperlinks to the Jenkins build data, where one can see all the stack trace details for the failed tests (example 39487).

Screenshot 2024-05-30 at 1 20 22 PM

Test history

All the visualizations in the dashboard can be filtered by date range; using OpenSearch we get this out of the box :)
With this we can go back and see the trends and infer results based on it.

Screenshot 2024-05-30 at 1 23 52 PM

Adding @peternied @getsaurabh02 @dblock @Pallavi-AWS @reta

@reta
Collaborator

reta commented May 31, 2024

@prudhvigodithi @rishabh6788 it looks great, thank you so much folks for putting it all together

@andrross
Member Author

@prudhvigodithi @rishabh6788 it looks great, thank you so much folks for putting it all together

Agreed, this is awesome!

@prudhvigodithi
Member

prudhvigodithi commented May 31, 2024

Thanks @reta and @andrross. I have created a PR, #13919, that adds some details about this dashboard to DEVELOPER_GUIDE.md. Please check.

@prudhvigodithi
Member

As a next step for surfacing test failures as GitHub issues: instead of creating a very generic issue like #13893 (coming from https://github.com/opensearch-project/OpenSearch/blob/main/.github/workflows/gradle-check.yml#L161-L168), which sometimes fails to execute (https://github.com/opensearch-project/OpenSearch/actions/runs/9320653340/job/25657907035), how about we use the following data table information to create a GitHub issue?

Screenshot 2024-05-31 at 2 53 08 PM

Here is an example, after finding the failed tests from Post Merge Actions:

We should start by creating an issue at the test class level (NestedQueryBuilderTests, link) and keep updating that issue with all the commit and PR information.

1st, to the issue created for NestedQueryBuilderTests, we can link all the post merge failures and commits.
Screenshot 2024-05-31 at 2 58 01 PM

2nd, on the same issue for NestedQueryBuilderTests, we can add the failed tests that are part of NestedQueryBuilderTests and the Jenkins build information for the stack trace.
Screenshot 2024-05-31 at 3 00 09 PM

3rd, on the same issue, we can add information about other PRs where this test class has failed or is still failing.
Screenshot 2024-05-31 at 3 00 26 PM

I'm open to ideas on whom to assign the created issue to. Should we just keep it open without any assignee, since each issue will have information on multiple PRs and commits? Later, during triage, a maintainer should be able to identify the right team/user and add them as assignee.

Moving forward, we can add logic to auto-close the created issue if no failure for the test class (NestedQueryBuilderTests in the above example) has been found in post merge Gradle Check builds in the last 30 days, and reopen it as required.
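The auto-close rule could be expressed as a small decision function. This is only a sketch of the proposal: the function name, the idea of passing in the date of the most recent post merge failure, and treating a failure exactly 30 days ago as still "recent" are all assumptions, not something the workflow is known to implement.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: the quiet period from the proposal, with the boundary day counted as recent.
QUIET_PERIOD = timedelta(days=30)


def desired_issue_state(last_post_merge_failure: Optional[date], today: date) -> str:
    """Return "open" or "closed" for a test class issue.

    `last_post_merge_failure` is the date of the most recent post merge
    Gradle Check failure for the test class, or None if none is recorded.
    """
    if last_post_merge_failure is None:
        return "closed"
    if today - last_post_merge_failure <= QUIET_PERIOD:
        return "open"  # failed recently: keep open, or reopen if it was closed
    return "closed"    # quiet for more than 30 days: auto-close


# Illustrative dates only.
print(desired_issue_state(date(2024, 5, 31), today=date(2024, 6, 3)))
print(desired_issue_state(date(2024, 4, 1), today=date(2024, 6, 3)))
```

Returning the desired state rather than performing the close keeps the rule easy to test; the caller would compare it with the issue's current state and only act when they differ.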

@andrross @reta @dblock @getsaurabh02 @peternied let me know your thoughts on this.

Thank you

@andrross
Member Author

andrross commented Jun 1, 2024

Now that we have the metrics and the updated developer guide, I'm going to close this and issue #3713. If anyone thinks there is more to do here please reopen or open a new issue. Thanks!

@andrross andrross closed this as completed Jun 1, 2024
@dblock
Member

dblock commented Jun 3, 2024

This is great!

Next step moving forward for surfacing the test failures as GitHub Issues instead of creating a very generic issue

I can't wait for this. It's something developers spend a lot of time on.

I'm open for ideas on whom to assign this created issue? Should we just keep it open without any assignee as each issue will have multiple PR and commits info

My 0.02c:

  1. Open an issue if it doesn't exist (saves everyone a ton of time).
  2. Comment on both the PR where gradle check failed and the issue. This way when we look at a flaky test we know how often it fails.
  3. No need to auto assign anything to anyone. I find this super annoying today because when I merge someone's PR I am auto-assigned a gradle build failure that I cannot do anything about.

@andrross
Member Author

andrross commented Jun 3, 2024

@prudhvigodithi I agree with @dblock, no need to auto assign the created issues.

@prudhvigodithi
Member

Thanks @dblock,

  1. Open an issue if it doesn't exist (saves everyone a ton of time).

The idea is to open an issue (or update it if it already exists) for each test class that failed on Post Merge Actions. The Post Merge Action failures are for sure the flaky ones. I don't want to keep creating issues (at least initially) for every failed test on PR creation, as the failures can be legitimate for a PR.

2. Comment on both the PR where gradle check failed and the issue. This way when we look at a flaky test we know how often it fails.

Since the Post Merge Action Gradle check is executed after the PR is merged, the suggestion here is to comment on the closed PR and link that back to the created issue, so that the issue accumulates data points from the PRs. Is my understanding correct here, @dblock @andrross?

3. No need to auto assign anything to anyone. I find this super annoying today because when I merge someone's PR I am auto-assigned a gradle build failure that I cannot do anything about.

Makes sense.

Thank you

@dblock
Member

dblock commented Jun 3, 2024

The idea is to open an issue (or update it if it already exists) for each test class that failed on Post Merge Actions. The Post Merge Action failures are for sure the flaky ones. I don't want to keep creating issues (at least initially) for every failed test on PR creation, as the failures can be legitimate for a PR.

Makes sense. What would be massively useful is to link existing flaky test issues when they fail in PRs. So maybe this could be the action run for every PR gradle check failure:

  1. For any failed test, lookup an existing flaky test issue, if it exists, comment.
  2. For any new failure, highlight it in comments with something like "new flaky test? please check and open one manually".
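The two steps above can be sketched as a single triage function. This assumes some lookup from test name to an existing flaky test issue number is already available; how that lookup is built (for example by searching issues by title or label) is left open. The function name, test names and issue numbers below are hypothetical placeholders.

```python
from typing import Dict, Iterable, List


def triage_failures(failed_tests: Iterable[str], known_flaky_issues: Dict[str, int]) -> List[str]:
    """Build one PR comment line per failed test.

    `known_flaky_issues` maps a test name to the number of an existing
    flaky test issue (hypothetical lookup, supplied by the caller).
    """
    comments = []
    for test in sorted(set(failed_tests)):
        issue = known_flaky_issues.get(test)
        if issue is not None:
            # Step 1: an existing flaky test issue was found, so reference it.
            comments.append(f"{test}: known flaky test, see #{issue}")
        else:
            # Step 2: nothing on record, so flag it for a human to check.
            comments.append(f"{test}: new flaky test? Please check and open an issue manually")
    return comments


# Placeholder data for illustration; these are not real tests or issue numbers.
known = {"ExampleIT.testAlreadyTracked": 1}
for line in triage_failures(["ExampleIT.testAlreadyTracked", "ExampleIT.testNeverSeen"], known):
    print(line)
```

The function only produces comment text and never opens issues itself, which matches the suggestion that new failures on PRs be reviewed manually because they may be legitimate.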

Since the Post Merge Action Gradle check is executed after the PR is merged, the suggestion here is to comment on the closed PR and link that back to the created issue, so that the issue accumulates data points from the PRs. Is my understanding correct here, @dblock @andrross?

I think this is unnecessary because the PR most definitely didn't cause the flaky test and once merged nobody is going to be looking at it. I recommend doing (1) and (2) above.

@prudhvigodithi
Member

Since this issue is closed, I have created a new issue, #13950 (comment), for the topic of surfacing flaky tests as GitHub issues, so we can continue our discussion there.
Thanks
@getsaurabh02 @dblock @andrross @reta

@msfroh
Collaborator

msfroh commented Jun 3, 2024

@prudhvigodithi, I just checked out the dashboard for the first time. It is amazing!

There is a little noise from cases where the open PR introduced failures. For example, looking at the last 7 days, it looks like ClientYamlTestSuiteIT is buggy, but most of that is coming from one build (https://build.ci.opensearch.org/job/gradle-check/39413/testReport/).

I added a new test that was failing due to a type mismatch, so I tried modifying the test framework. I fixed my test, but broke 1000+ other tests. (That fix obviously didn't get merged, but I accidentally skewed the statistics.)
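One way to reduce this kind of skew is to drop builds with an implausibly large number of failures before ranking test classes. The sketch below assumes failure records are available as `(build number, test class)` pairs; the threshold of 100 failures per build is an arbitrary illustrative cutoff, and the function name and sample data are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple

FailureRecord = Tuple[int, str]  # (build number, test class)


def drop_mass_failure_builds(
    failures: Iterable[FailureRecord],
    max_failures_per_build: int = 100,  # assumed cutoff, tune against real data
) -> List[FailureRecord]:
    """Remove every record from builds that exceed the per-build failure cutoff."""
    records = list(failures)
    per_build = Counter(build for build, _ in records)
    return [(build, cls) for build, cls in records if per_build[build] <= max_failures_per_build]


# Synthetic data: build 1 broke 1000 tests at once, build 2 had two ordinary failures.
records = [(1, "ExampleSuiteIT")] * 1000 + [(2, "ExampleSuiteIT"), (2, "OtherTests")]
filtered = drop_mass_failure_builds(records)
print(Counter(cls for _, cls in filtered).most_common())
```

A cutoff like this trades precision for simplicity: it would also hide a genuine mass failure on main, so restricting the filter to builds from open PRs may be the safer variant.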

6 participants