-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update parquet to depend on arrow subcrates #3028
Conversation
@@ -30,6 +30,14 @@ edition = "2021" | |||
rust-version = "1.62" | |||
|
|||
[dependencies] | |||
arrow-array = { version = "26.0.0", path = "../arrow-array", default-features = false, optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is somewhat unfortunate the number of these, perhaps we should provide re-exports to reduce this. On the flip side, parquet is a very complex crate and so perhaps it is just a bit special in needing all the things 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe for some basic crates like arrow-buffer, arrow-data, arrow-schema, perhaps we can provide re-export (arrow-core?) for them.
Like you said, if this is just a special case, then it is fine.
Need to revisit how we plumb in test utilities 😢 |
f20ea2a
to
620a04c
Compare
syn = { version = "1.0", features = ["extra-traits"] } | ||
parquet = { path = "../parquet", version = "26.0.0", default-features = false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part drive-by-cleanup, part fix as it was relying on multiversion
which is a transitive dependency of arrow
to enable the syn
features it requires. As parquet no longer depends on arrow, this caused issues
@@ -19,8 +19,8 @@ | |||
|
|||
use std::{cell, io, result, str}; | |||
|
|||
#[cfg(any(feature = "arrow", test))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary to avoid having to declare all the various subcrates as dev-dependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean removing test
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this only worked because arrow is a dev-dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I tried to restore only this line locally, and looks like the tests are still okay to run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you do cargo test --no-default-features
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yea, arrow
is default feature. I didn't exclude it.
default = ["arrow", "snap", "brotli", "flate2", "lz4", "zstd", "base64"] | ||
# Enable arrow reader/writer APIs | ||
arrow = ["dep:arrow", "base64"] | ||
arrow = ["base64", "arrow-array", "arrow-buffer", "arrow-cast", "arrow-data", "arrow-schema", "arrow-select", "arrow-ipc"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems two lines can combined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nvm, I thought if this arrow
is unnecessary and we can just have these "base64" features. But I saw you use arrow
as a whole feature in many places.
#[cfg(feature = "arrow")] | ||
pub(crate) fn peek_next(&mut self) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is peek_next
optional? It is used by some APIs like skip_records
. If arrow
is not enabled, what would happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were necessary to placate clippy as they aren't currently used by anything outside the arrow module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Benchmark runs are scheduled for baseline = 132152c and contender = 4dd7fea. 4dd7fea is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Part of #2594
Rationale for this change
time cargo build --release
beforeAfter
The tall pole is now actually the compression codecs and not arrow 🎉
What changes are included in this PR?
Are there any user-facing changes?