Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add equality delete writer #372

Merged
merged 12 commits into from
Oct 15, 2024

Conversation

Dysprosium0626
Copy link
Contributor

Close #341
EqualityDeleteWriter refers tohttps://iceberg.apache.org/spec/#equality-delete-files

This pr complete the implementation of EqualityDeleteWriter and its test case

impl<B: FileWriterBuilder> IcebergWriter for EqualityDeleteFileWriter<B> {
async fn write(&mut self, batch: RecordBatch) -> Result<()> {
let batch = self.project_record_batch_columns(batch)?;
self.inner_writer.as_mut().unwrap().write(&batch).await
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use ? rather than unwrap() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? is for Result but self.inner_writer is an Option. I think it is reasonable to leave it just Option. Maybe self.inner_writer.as_mut().ok_or_else(...).write(&batch).await is better?

@Dysprosium0626 Dysprosium0626 force-pushed the add_equality_delete_writer branch 2 times, most recently from 2aa2e5f to f4c6cfa Compare May 29, 2024 06:24
@Dysprosium0626
Copy link
Contributor Author

@ZENOTME @sdd PTAL! Thanks a lot!

Copy link
Contributor

@ZENOTME ZENOTME left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crates/iceberg/src/writer/mod.rs Outdated Show resolved Hide resolved
} else {
Err(Error::new(
ErrorKind::Unexpected,
"Equality delete inner writer does not exist",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think returning an unexpected error here is reasonable. I will fix

/// After close, regardless of success or failure, the writer should never be used again, otherwise the writer will panic.
later.

@Dysprosium0626 Dysprosium0626 force-pushed the add_equality_delete_writer branch from 0cc7049 to d140194 Compare June 5, 2024 05:41
Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you @Dysprosium0626 for working on this, and thank @sdd & @ZENOTME for the review.

@Xuanwo Xuanwo merged commit ad89eac into apache:main Oct 15, 2024
6 checks passed
Xuanwo added a commit to Xuanwo/iceberg-rust that referenced this pull request Oct 15, 2024
@Xuanwo
Copy link
Member

Xuanwo commented Oct 15, 2024

Hi, I'm sorry, but I need to revert this PR. @Dysprosium0626, could you reopen and rebase your original PR and test it again?

@Dysprosium0626
Copy link
Contributor Author

Hi, I'm sorry, but I need to revert this PR. @Dysprosium0626, could you reopen and rebase your original PR and test it again?

Okay, I will reopen this PR after test it again.

ZENOTME pushed a commit to ZENOTME/iceberg-rust that referenced this pull request Nov 19, 2024
@ZENOTME
Copy link
Contributor

ZENOTME commented Nov 19, 2024

Hi, I find that this PR has some fail. I try to fix it in #703. To simplify the review, I separate this PR and fix into two commits. Feel free to tell me if something needs to be refined or fixed. cc @Dysprosium0626 @Xuanwo

ZENOTME pushed a commit to ZENOTME/iceberg-rust that referenced this pull request Nov 27, 2024
liurenjie1024 pushed a commit that referenced this pull request Nov 28, 2024
* copy from #372

* fix test and refine ArrowFieldProjector

* add comment and rename to make code more clear

* refine code

* refine RecordBatchProjector

* refine error

* refine function name

---------

Co-authored-by: ZENOTME <[email protected]>
shaeqahmed pushed a commit to matanolabs/iceberg-rust that referenced this pull request Dec 9, 2024
* feat: add EqualityDeleteWriter

* WIP: add test cases

* fix: move delete schema out of writer

* test: add test case for equality delete writer

* fix: refactor projector

* fix: fix projector

* fix: add result

* test: add float and double column test for equality delete writer

* fmt

* fix: compatibility with apache#364

* fix: remove unwrap

* fix: minor
shaeqahmed pushed a commit to matanolabs/iceberg-rust that referenced this pull request Dec 9, 2024
shaeqahmed pushed a commit to matanolabs/iceberg-rust that referenced this pull request Dec 9, 2024
* copy from apache#372

* fix test and refine ArrowFieldProjector

* add comment and rename to make code more clear

* refine code

* refine RecordBatchProjector

* refine error

* refine function name

---------

Co-authored-by: ZENOTME <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement the equality delete writer
4 participants