Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve run time of coordinator duty MarkAsUnusedOvershadowedSegments #13287

Merged
merged 3 commits into from
Nov 1, 2022

Conversation

kfaraz
Copy link
Contributor

@kfaraz kfaraz commented Oct 29, 2022

In clusters with a large number of segments, the duty MarkAsUnusedOvershadowedSegments
can take a long very long time to finish. This is because of the costly invocation of
timeline.isOvershadowed which is done for every used segment in every coordinator run.

This is extravagant as there are only a few overshadowed segments which are already identified
as part of the datasource snapshot. In clusters with ~500k segments, this duty can take several
minutes to finish even when no segment is actually overshadowed.

Changes

  • Use DataSourceSnapshot.getOvershadowedSegments to get all overshadowed segments
  • Iterate over this set instead of all used segments to identify segments that can be marked as unused
  • Mark segments as unused in the DB in batches rather than one at a time
  • Refactor: Add class SegmentTimeline for ease of use and readability while using a
    VersionedIntervalTimeline of segments.

Notes

The changes here provide significant improvement in the run time of this duty.
The number of overshadowed segments in a given coordinator run is expected to be small.
Even in the rare bad case scenario (half of all used segments are overshadowed), this change
would halve the run time of the duty. Subsequent runs would again take negligible time.
(Absolute worst case scenario would be a single segment overshadowing everything 😅)

Further improvements can be made to the run time of this duty but that would require changes to the
logic of timeline.isOvershadowed and would provide only minor time reductions.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@kfaraz kfaraz merged commit fd7864a into apache:master Nov 1, 2022
@kfaraz kfaraz deleted the fix_mark_unused_duty branch November 1, 2022 14:49
@kfaraz
Copy link
Contributor Author

kfaraz commented Nov 1, 2022

Thanks for the reviews, @imply-cheddar , @AmatyaAvadhanula !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants