Core: Adapt commit, scan, and snapshot stats for DVs #11464

Merged 1 commit on Nov 5, 2024

aokolnychyi
Contributor

This PR adapts commit, scan, and snapshot stats for DVs.

This work is part of #11122.

@github-actions bot added the core label on Nov 4, 2024
this.addedDVs += 1;
} else {
this.addedPosDeleteFiles += 1;
}
this.addedDeleteFiles += 1;
Contributor Author

@aokolnychyi, Nov 4, 2024

I am treating "delete files" as a generic concept. We will continue to track each type of delete as a delete file and will count DVs, position delete files, and equality delete files separately. I think this is consistent and won't require updating our manifest counts to differentiate between DVs and delete files.

Contributor

I agree that this is consistent
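
As a rough illustration of the counting scheme discussed in this thread, here is a minimal, self-contained Java sketch. It does not use Iceberg's classes: `Delete`, `Content`, `Format`, and `DeleteFileCounters` are hypothetical stand-ins, and the rule that a DV is a position delete stored in Puffin format is an assumption of the sketch. Only the counter names `addedDVs`, `addedPosDeleteFiles`, and `addedDeleteFiles` come from the diff above; `addedEqDeleteFiles` is assumed by analogy.

```java
class DeleteFileCounters {
  // Hypothetical stand-ins for the content type and file format of a delete file.
  enum Content { POSITION_DELETES, EQUALITY_DELETES }
  enum Format { PARQUET, PUFFIN }

  record Delete(Content content, Format format) {
    // Assumption for this sketch: a DV is a position delete stored in Puffin.
    boolean isDV() {
      return content == Content.POSITION_DELETES && format == Format.PUFFIN;
    }
  }

  long addedDeleteFiles;    // generic count: every kind of delete lands here
  long addedDVs;            // per-type counters, tracked separately
  long addedPosDeleteFiles;
  long addedEqDeleteFiles;

  void add(Delete delete) {
    switch (delete.content()) {
      case POSITION_DELETES:
        if (delete.isDV()) {
          addedDVs += 1;
        } else {
          addedPosDeleteFiles += 1;
        }
        break;
      case EQUALITY_DELETES:
        addedEqDeleteFiles += 1;
        break;
    }
    // A DV still counts as a "delete file" in the generic total.
    addedDeleteFiles += 1;
  }

  public static void main(String[] args) {
    DeleteFileCounters counters = new DeleteFileCounters();
    counters.add(new Delete(Content.POSITION_DELETES, Format.PUFFIN));
    counters.add(new Delete(Content.POSITION_DELETES, Format.PARQUET));
    counters.add(new Delete(Content.EQUALITY_DELETES, Format.PARQUET));
    System.out.println(
        "deleteFiles=" + counters.addedDeleteFiles
            + " dvs=" + counters.addedDVs
            + " posDeleteFiles=" + counters.addedPosDeleteFiles
            + " eqDeleteFiles=" + counters.addedEqDeleteFiles);
  }
}
```

In this sketch the generic total always equals the sum of the three per-type counters, which matches the point made above that manifest-level delete file counts would not need to distinguish DVs.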

long addedFileSize1 = dv2.contentSizeInBytes();
long totalPosDeletes1 = dv1.recordCount() + dv2.recordCount();
long totalFileSize1 = dv1.contentSizeInBytes() + dv2.contentSizeInBytes();
assertThat(summary1)
Contributor

Contributor Author

Missed, will do.

Contributor Author

Updated to cover all values.

@@ -31,7 +32,11 @@ public static void indexedDeleteFile(ScanMetrics metrics, DeleteFile deleteFile)
metrics.indexedDeleteFiles().increment();

if (deleteFile.content() == FileContent.POSITION_DELETES) {
metrics.positionalDeleteFiles().increment();
if (ContentFileUtil.isDV(deleteFile)) {
Contributor

Should this be using ScanTaskUtil.isDV() now?

Contributor Author

Replied with context here.

@@ -84,4 +85,8 @@ public static String referencedDataFileLocation(DeleteFile deleteFile) {
CharSequence location = referencedDataFile(deleteFile);
return location != null ? location.toString() : null;
}

public static boolean isDV(DeleteFile deleteFile) {
Contributor

This probably isn't needed anymore, since the same method exists in ScanTaskUtil.

Contributor

@nastra left a comment

LGTM with a few comments

@aokolnychyi merged commit 549674b into apache:main on Nov 5, 2024
49 checks passed
@aokolnychyi
Contributor Author

Thank you, @nastra!

zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024