
Filter shards for sliced search at coordinator #16771

Open

msfroh wants to merge 5 commits into main from filter_shards_by_slice
Conversation

@msfroh (Collaborator) commented Dec 3, 2024

Description

Prior to this commit, a sliced search would fan out to every shard, then apply a MatchNoDocsQuery filter on shards that don't correspond to the current slice. This still creates a (useless) search context on each shard for every slice, though. For a long-running sliced scroll, this can quickly exhaust the number of available scroll contexts.

This change avoids fanning out to all the shards by checking at the coordinator if a shard is matched by the current slice. This should reduce the number of open scroll contexts to max(numShards, numSlices) instead of numShards * numSlices.
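The context-count claim can be sanity-checked with a small sketch. This is not the OpenSearch implementation — the class and helper names are hypothetical, and it assumes the modulo-based slice-to-shard mapping that `SliceBuilder` uses: with `numSlices <= numShards` a shard belongs to slice `shard % numSlices`, and with more slices than shards each slice reads a portion of shard `sliceId % numShards`.

```java
public class SliceContextSketch {
    // Assumed modulo mapping between slices and shard ordinals.
    static boolean shardMatches(int sliceId, int numSlices, int shard, int numShards) {
        if (numSlices <= numShards) {
            return shard % numSlices == sliceId;
        }
        return sliceId % numShards == shard;
    }

    // Contexts opened when the coordinator skips non-matching shards
    // instead of sending a MatchNoDocsQuery to all of them.
    static int openContexts(int numShards, int numSlices) {
        int contexts = 0;
        for (int slice = 0; slice < numSlices; slice++) {
            for (int shard = 0; shard < numShards; shard++) {
                if (shardMatches(slice, numSlices, shard, numShards)) {
                    contexts++;
                }
            }
        }
        return contexts;
    }

    public static void main(String[] args) {
        // 6 shards, 3 slices: 6 contexts instead of 18 (numShards * numSlices).
        System.out.println(openContexts(6, 3));
        // 3 shards, 6 slices: 6 contexts, i.e. max(numShards, numSlices).
        System.out.println(openContexts(3, 6));
    }
}
```

In both directions the total comes out to max(numShards, numSlices): each shard serves exactly one slice when slices are scarce, and each slice opens exactly one shard context when slices outnumber shards.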

Related Issues

Related to #16289

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot (Contributor) commented Dec 3, 2024

❌ Gradle check result for 2d2fd05: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 541979e: FAILURE

❌ Gradle check result for b4aaa2f: FAILURE

❌ Gradle check result for 1680b9b: FAILURE

@msfroh msfroh force-pushed the filter_shards_by_slice branch from 1680b9b to 8842514 Compare December 11, 2024 00:36
@msfroh msfroh added backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Search:Resiliency labels Dec 11, 2024
❌ Gradle check result for 8842514: FAILURE

@msfroh msfroh force-pushed the filter_shards_by_slice branch from 8842514 to eadaabd Compare December 11, 2024 05:07
❌ Gradle check result for eadaabd: FAILURE

✅ Gradle check result for eadaabd: SUCCESS

codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 73.33333% with 12 lines in your changes missing coverage. Please review.

Project coverage is 72.21%. Comparing base (4a53ff2) to head (41d0f33).
Report is 17 commits behind head on main.

Files with missing lines                              | Patch %  | Lines
...min/cluster/shards/ClusterSearchShardsRequest.java | 58.33%   | 3 Missing and 2 partials ⚠️
...n/admin/cluster/RestClusterSearchShardsAction.java | 0.00%    | 5 Missing ⚠️
...pensearch/action/search/TransportSearchAction.java | 75.00%   | 1 Missing ⚠️
...java/org/opensearch/search/slice/SliceBuilder.java | 88.88%   | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16771      +/-   ##
============================================
- Coverage     72.32%   72.21%   -0.11%     
+ Complexity    65310    65284      -26     
============================================
  Files          5299     5299              
  Lines        303534   303561      +27     
  Branches      43941    43954      +13     
============================================
- Hits         219527   219225     -302     
- Misses        66021    66383     +362     
+ Partials      17986    17953      -33     


@msfroh msfroh marked this pull request as ready for review December 12, 2024 19:47
❌ Gradle check result for c7abf62: FAILURE

❌ Gradle check result for f29e111: FAILURE

❌ Gradle check result for 6a2de32: FAILURE


✅ Gradle check result for 6a2de32: SUCCESS

msfroh added 4 commits January 6, 2025 16:00
Signed-off-by: Michael Froh <[email protected]>
Signed-off-by: Michael Froh <[email protected]>
@msfroh msfroh force-pushed the filter_shards_by_slice branch from 6a2de32 to a5bc4ca Compare January 7, 2025 00:01
github-actions bot (Contributor) commented Jan 7, 2025

✅ Gradle check result for a5bc4ca: SUCCESS

        }
        List<ShardIterator> allShardIterators = new ArrayList<>();
        for (List<ShardIterator> indexIterators : shardIterators.values()) {
            if (slice != null) {
@reta (Collaborator) commented:
I think you don't need to check for (slice != null) on every iteration; maybe something along these lines:

        if (slice != null) {
            for (List<ShardIterator> indexIterators : shardIterators.values()) {
                // Filter the returned shards for the given slice
                CollectionUtil.timSort(indexIterators);
                // We use the ordinal of the iterator in the group (after sorting) rather than the shard id, because
                // computeTargetedShards may return a subset of shards for an index, if a routing parameter was
                // specified. In that case, the set of routable shards is considered the full universe of available
                // shards for each index, when mapping shards to slices. If no routing parameter was specified,
                // then ordinals and shard IDs are the same. This mimics the logic in
                // org.opensearch.search.slice.SliceBuilder.toFilter.
                for (int i = 0; i < indexIterators.size(); i++) {
                    if (slice.shardMatches(i, indexIterators.size())) {
                        allShardIterators.add(indexIterators.get(i));
                    }
                }
            }
        } else {
            shardIterators.values().forEach(allShardIterators::addAll);
        }
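The comment in the snippet above — matching slices against the ordinal of the iterator in the sorted group rather than the shard ID — can be illustrated with a small sketch. The class name, helper, and shard IDs here are hypothetical, and the modulo matching is an assumption standing in for `slice.shardMatches`:

```java
import java.util.ArrayList;
import java.util.List;

public class OrdinalSliceSketch {
    // Return the shard IDs from a sorted, routing-filtered group that
    // belong to the given slice, matching on ordinal rather than shard ID.
    static List<Integer> slicedShardIds(List<Integer> sortedShardIds, int sliceId, int numSlices) {
        List<Integer> result = new ArrayList<>();
        for (int ordinal = 0; ordinal < sortedShardIds.size(); ordinal++) {
            if (ordinal % numSlices == sliceId) { // assumed modulo matching
                result.add(sortedShardIds.get(ordinal));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A routing parameter targeted only shards {1, 4, 7} of the index.
        List<Integer> routedShards = List.of(1, 4, 7);
        // Slice 0 of 2 gets ordinals 0 and 2, i.e. shard IDs 1 and 7.
        System.out.println(slicedShardIds(routedShards, 0, 2));
        // Matching on raw shard IDs instead (1 % 2, 4 % 2, 7 % 2) would give
        // slice 0 only shard 4 and slice 1 the other two: a skewed split.
    }
}
```

Treating the routed subset as the full universe of shards keeps the slices balanced regardless of which shard IDs routing happened to select, mirroring the logic in org.opensearch.search.slice.SliceBuilder.toFilter.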

@msfroh (Collaborator, Author) replied:

With this change, we avoid creating unnecessary long-lived search contexts on every shard, but you're complaining about unnecessary (extremely cheap) null checks in a tight loop?!

This is what makes you awesome, @reta! I love how you say, "Yeah, this is cool, but couldn't it be better?"

Signed-off-by: Michael Froh <[email protected]>

✅ Gradle check result for 41d0f33: SUCCESS

Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Search:Resiliency
Projects
Status: In Progress
2 participants