-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: add deletion files to Lance format #920
Conversation
86e8efd
to
742dfa8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good. I never reviewed rust before. I will do a second pass in the morning just to make sure I'm not missing something obvious.
fn deletion_arrow_schema() -> Arc<Schema> { | ||
Arc::new(Schema::new(vec![Field::new( | ||
"row_id", | ||
DataType::UInt32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potential tangent: maybe we want 64 bit uint to be safe? Just so that we don't have a "small" limit of number of rows we can store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this isn't well documented, but the row ids elsewhere are a concatenation of the fragment id and the within-fragment row id (which is the position / offset of the row in the file).
Lines 120 to 123 in 505dbcd
/// Compute row id from `fragment_id` and the `offset` of the row in the fragment. | |
fn compute_row_id(fragment_id: u64, offset: i32) -> u64 { | |
(fragment_id << 32) + offset as u64 | |
} |
So I think we already have a limit on the number of rows we can store per file based on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. thank you for explaining!
Closes #908