-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support for "take" operation to balanced storage #3079
Conversation
|
afeb3f5
to
5285b86
Compare
5285b86
to
27974c6
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3079 +/- ##
==========================================
- Coverage 77.31% 77.16% -0.15%
==========================================
Files 240 240
Lines 79322 79561 +239
Branches 79322 79561 +239
==========================================
+ Hits 61326 61393 +67
- Misses 14821 15014 +193
+ Partials 3175 3154 -21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
# TODO: Once https://github.com/lancedb/lance/pull/3041 merges we will | ||
# want to test partial appends. We need to make sure an append of | ||
# non-blob data is supported. In order to do this we need to make | ||
# sure a blob tx is created that marks the row ids as used so that | ||
# the two row id sequences stay in sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks for testing the interaction
/// Splits the schema into two schemas, one with default storage class fields and the other with blob storage class fields. | ||
/// If there are no blob storage class fields, the second schema will be `None`. | ||
/// The order of fields is preserved. | ||
pub fn partition_by_storage_class(&self) -> (Self, Option<Self>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can only top-level fields be StorageClass::Blob
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is simplest. I should probably add that restriction. We can always add support for sub-fields in the future.
Adds support for "take" to datasets that have balanced storage
Requires balanced storage datasets to use stable ids
Refactors
RowAddress
slightly to removeid
from the method names (e.g.row_id
->row_offset
) to help avoid future confusion.Fixes an intermittent panic in
SharedStream
(this panic had been seen before but now that we are usingSharedStream
more it was showing up more frequently)Begins aligning on new terminology for balanced storage (the old "blob storage" terminology is confusing because other things are also called "blobs":
Closes #3030