
Clean up duty for non-overlapping eternity tombstones #15281

Merged · 32 commits · Dec 11, 2023

Conversation

@abhishekrb19 (Contributor) commented Oct 30, 2023

This PR adds a new duty to clean up non-overlapping eternity tombstones. A candidate segment must meet all of the following criteria:

  • It is a tombstone that starts at -INF or ends at INF
  • It does not overlap with any used overshadowed segment in the snapshot
  • It has 0 core partitions. Candidate tombstones with 1 core partition, generated prior to Make numCorePartitions as 0 for tombstones #15379, need to be cleaned up manually, as removing them automatically could cause data loss.

The algorithm to determine these tombstones is as follows:

  • Determine the set of used, non-overshadowed segments from the used segments' timeline
  • For each candidate segment in that set that is a tombstone starting at -INF or ending at INF, check the set of overshadowed used segments to see whether at least one of their intervals overlaps with the candidate segment
  • If there is no overlap, the candidate segment is eligible to be marked as unused

This is a companion change to #15243.
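The eligibility check above can be sketched as follows. This is a minimal, hypothetical model using half-open `[start, end)` millisecond intervals with `Long.MIN_VALUE`/`Long.MAX_VALUE` standing in for -INF/INF; Druid's actual `DataSegment`, `Interval`, and timeline classes differ, and all names here are illustrative:

```java
import java.util.List;

public class EternityTombstoneCheck {
    // Simplified segment: [start, end) in millis, tombstone flag, core partition count.
    record Segment(long start, long end, boolean isTombstone, int numCorePartitions) {
        boolean overlaps(Segment other) {
            // Half-open intervals overlap iff each starts before the other ends.
            return start < other.end && other.start < end;
        }
    }

    /**
     * True iff the candidate is an infinite-interval tombstone with zero core
     * partitions that overlaps no used overshadowed segment.
     */
    static boolean isUnneededEternityTombstone(Segment candidate, List<Segment> usedOvershadowed) {
        boolean infiniteEndpoint =
            candidate.start() == Long.MIN_VALUE || candidate.end() == Long.MAX_VALUE;
        if (!candidate.isTombstone() || !infiniteEndpoint || candidate.numCorePartitions() != 0) {
            return false;
        }
        return usedOvershadowed.stream().noneMatch(candidate::overlaps);
    }

    public static void main(String[] args) {
        Segment tombstone = new Segment(Long.MIN_VALUE, 1000L, true, 0);
        // Overlaps a used overshadowed segment -> keep it.
        System.out.println(isUnneededEternityTombstone(
            tombstone, List.of(new Segment(500L, 2000L, false, 1)))); // false
        // No overlap with any overshadowed segment -> eligible to mark unused.
        System.out.println(isUnneededEternityTombstone(tombstone, List.of())); // true
    }
}
```

A finite-interval tombstone (or one with a nonzero core partition count) fails the first test and is never considered, matching the criteria listed above.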

Release note

A new duty, MarkEternityTombstonesAsUnused, is added to clean up non-overlapping eternity tombstones: tombstone segments that either start at -INF or end at INF and have no overlap with any overshadowed used segment in the datasource. A new metric, segment/unneededEternityTombstone/count, counts the number of dropped non-overshadowed eternity tombstones per datasource.


Key changed/added classes in this PR
  • MarkEternityTombstonesAsUnused.java
  • MarkEternityTombstonesAsUnusedTest.java
  • Misc typos fixed in d3adbf5

This PR has:

  • been self-reviewed.
  • [ ] added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@abhishekrb19 abhishekrb19 force-pushed the cleanup_dangling_segments branch from 0fd8732 to d89baf3 on October 31, 2023 02:47
@AmatyaAvadhanula AmatyaAvadhanula self-requested a review October 31, 2023 05:35
@kfaraz (Contributor) left a comment

Thanks for the changes, @abhishekrb19 ! I think we need some more clarity on what qualifies as a "dangling tombstone" and why and when we should clean them up.

Review thread on docs/design/coordinator.md (outdated, resolved):
usedNonOvershadowedSegment.getInterval()
.overlaps(usedSegment.getInterval())
);
if (!overlaps) {
@kfaraz (Contributor) commented Nov 7, 2023

So a dangling tombstone is one that is an infinity tombstone and DOES NOT overlap with another infinity tombstone? Shouldn't there be some more (general) conditions here?

@abhishekrb19 (Contributor, Author) commented Nov 9, 2023

It's one that does not overlap with any overshadowed used segment in the snapshot. Updated javadocs.


final Optional<Set<DataSegment>> usedNonOvershadowedSegments =
Optional.fromNullable(usedSegmentsTimeline)
.transform(timeline -> timeline.findNonOvershadowedObjectsInInterval(
@kfaraz (Contributor) commented Nov 7, 2023

Do we need to call findNonOvershadowed here or should we just use the set of overshadowed segments captured in the DataSourcesSnapshot?

@abhishekrb19 (Contributor, Author) commented Nov 9, 2023

Ah, yes, I initially considered just the overshadowed set of segments (please see c5fb0e9), but realized that would not give us a complete guarantee: there can be a partial overlap with a used segment that is not overshadowed, so we'd need to look at the larger set returned by findNonOvershadowed. I added a UT for one such scenario: org.apache.druid.server.coordinator.duty.MarkDanglingSegmentsAsUnusedTest#testDanglingTombstonesWithPartiallyOverlappingUnderlyingSegment.

EDIT: Actually, just the set of overshadowed segments would suffice. The partially overlapping segments I was thinking of would still continue to exist.

@LakshSingla (Contributor) commented Nov 7, 2023

Dangling tombstones are tombstones that have eternity as their start or end

Why is the eternity required to be one of the endpoints per the definition? Can't a tombstone be "useless", even without that restriction?

@kfaraz (Contributor) commented Nov 8, 2023

Why is the eternity required to be one of the endpoints per the definition? Can't a tombstone be "useless", even without that restriction?

@LakshSingla, I suppose the assumption is that tombstones that do not have infinity as an endpoint become useless only when they are overshadowed by newer segments. They would therefore be removed by the existing MarkOvershadowedSegmentsAsUnused duty anyway, and probably don't need to qualify for the new definition of "dangling".

But I agree that there are nuances here which need to be called out clearly.

@LakshSingla (Contributor) commented

@kfaraz
Consider that there's a segment 2023-11-01/2023-11-01 with data. MSQ runs a replace query which cleans up the data from this interval, so it is replaced with a tombstone of granularity 2023-11-01/2023-11-01. The next time MarkOvershadowedSegmentsAsUnused runs, the segment containing the data will be marked unused. Once it gets dropped, there's no need for the tombstone to remain in the system, so I think the corresponding tombstone should also be cleaned up.

@abhishekrb19 (Contributor, Author) commented Nov 9, 2023

Thanks for the reviews!

Why is the eternity required to be one of the endpoints per the definition? Can't a tombstone be "useless", even without that restriction?

Good question - I included the rationale in the javadocs. In short, after some discussions, we're treating tombstones as zero-row segments: they exist, but contain no data. These infinite-interval tombstones don't honor the provided segment granularity (because we cannot generate infinite segments per time grain in the timeline) and are inconvenient to append data to afterwards. So these are the only candidate segments the coordinator duty considers marking as unused for now. Finite-interval tombstones honor the provided segment granularity as of #15243.

@AmatyaAvadhanula (Contributor) commented

@abhishekrb19 is the following scenario possible?

  1. A dangling tombstone is created
  2. The clean up duty begins running on a snapshot
  3. Data is appended to the interval of the dangling tombstone with native batch / kafka append.
  4. The clean up duty drops the dangling tombstone.
  5. The appended data is not available because the core partition (dangling tombstone) has been dropped

@abhishekrb19 (Contributor, Author) commented Nov 15, 2023

@AmatyaAvadhanula, yes, I think the scenario you're describing is possible if the datasource snapshot doesn't have information about the newly appended segment from a concurrent task. Perhaps the duty could acquire locks and/or find the locked intervals from the overlord for datasources that contain dangling tombstones, and skip marking them as unused if there's a higher-priority lock (similar to what was done for streaming + concurrent compaction)?

EDIT: Instead, we went with #15379, which changed the number of core partitions of tombstone shard specs from 1 to 0. That also addresses the concurrent append/replace scenario, since an appended segment will exist independently of the tombstone.
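To illustrate why the core-partition count matters here, the following is a hypothetical, much-simplified model of partition visibility. Druid's real visibility/overshadowing logic lives in its timeline classes; the rule and all names below are illustrative only, meant to show the failure mode the duty could otherwise trigger:

```java
import java.util.List;

public class CorePartitionVisibility {
    // A chunk of an interval: its partition number and the interval's core partition count.
    record Chunk(int partitionNum, int numCorePartitions) {}

    /**
     * Simplified rule: an interval's data is queryable only when every core
     * partition (0 .. numCorePartitions-1) is present.
     */
    static boolean intervalIsVisible(List<Chunk> present) {
        int core = present.stream().mapToInt(Chunk::numCorePartitions).max().orElse(0);
        for (int p = 0; p < core; p++) {
            final int partition = p;
            if (present.stream().noneMatch(c -> c.partitionNum() == partition)) {
                return false;  // a required core partition is missing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Before #15379: the tombstone was core partition 0 (numCorePartitions = 1).
        // If the duty drops the tombstone, a segment appended as partition 1 is orphaned.
        System.out.println(intervalIsVisible(List.of(new Chunk(1, 1)))); // false: data loss

        // After #15379: the tombstone contributes no core partitions
        // (numCorePartitions = 0), so appended segments are visible independently.
        System.out.println(intervalIsVisible(List.of(new Chunk(1, 0)))); // true
    }
}
```

Under this model, dropping a zero-core-partition tombstone can never hide concurrently appended data, which is why pre-#15379 tombstones (with 1 core partition) are excluded from the duty.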

@abhishekrb19 abhishekrb19 marked this pull request as draft November 21, 2023 02:27
@abhishekrb19 abhishekrb19 marked this pull request as ready for review November 21, 2023 08:05
@abhishekrb19 abhishekrb19 requested a review from kfaraz November 21, 2023 08:08
@abhishekrb19 abhishekrb19 changed the title Clean up duty for dangling tombstones Clean up duty for non-overlapping eternity tombstones Dec 5, 2023
@AmatyaAvadhanula (Contributor) left a comment

Left minor comments. LGTM overall!

@AmatyaAvadhanula (Contributor) left a comment

LGTM! Thank you @abhishekrb19

@abhishekrb19 (Contributor, Author) commented

Thank you for the reviews!

@abhishekrb19 abhishekrb19 merged commit 96be82a into apache:master Dec 11, 2023
83 checks passed
@abhishekrb19 abhishekrb19 deleted the cleanup_dangling_segments branch December 11, 2023 16:57
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024