Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add close method to RecordBatchWriter trait #4228

Merged
merged 4 commits into from
May 17, 2023

Conversation

alexandreyc
Copy link
Contributor

Which issue does this PR close?

This PR does not close any particular issue. It is an alternative implementation of #4221

Rationale for this change

In #4221, the finish method takes &mut self whereas in this PR it takes self.

We are forced to use fully qualified method calls for finish where we want to call finish on the object itself because RecordBatchWriter is in scope (because we use arrow_array::*). What's your opinion on importing only needed items from arrow_array to avoid this?

What changes are included in this PR?

Add finish method to RecordBatchWriter and implement it for CSV, JSON, IPC and Parquet writers.

Are there any user-facing changes?

According to my analysis no breaking change is introduced in this PR.

@github-actions github-actions bot added arrow Changes to the arrow crate parquet Changes to the parquet crate labels May 16, 2023
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if calling the method close would instead avoid so many collisions, and might better hint at its consuming nature (maybe??)

parquet/src/arrow/arrow_writer/mod.rs Outdated Show resolved Hide resolved
@@ -47,6 +47,9 @@ pub trait RecordBatchReader: Iterator<Item = Result<RecordBatch, ArrowError>> {
pub trait RecordBatchWriter {
/// Write a single batch to the writer.
fn write(&mut self, batch: &RecordBatch) -> Result<(), ArrowError>;

/// Write footer or termination data, then mark the writer as done.
fn finish(self) -> Result<(), ArrowError>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn finish(self) -> Result<(), ArrowError>;
fn close(self) -> Result<(), ArrowError>;

Perhaps, this might avoid the name collisions?

@alexandreyc alexandreyc changed the title [Alternative] Add finish method to RecordBatchWriter trait Add finish method to RecordBatchWriter trait May 16, 2023
@alexandreyc
Copy link
Contributor Author

Should be good now! Thank you for your review.

@tustvold tustvold changed the title Add finish method to RecordBatchWriter trait Add close method to RecordBatchWriter trait May 16, 2023
@tustvold
Copy link
Contributor

For those following along, RecordBatchWriter hasn't yet been released so not marking as a breaking change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants