Write backwards compatible row group statistics (#3526) #3527
Test failures fixed in #3528
LGTM, thank you @tustvold. I think this will be a very nice feature addition for other users.
I had a bunch of naming / doc / style suggestions, but I don't think any of them are required.
I also tested the reproducer from #799 from @tfiasco and the min/max values are properly written 👍
In [3]:
...: import pyarrow.parquet as pq
...: f = pq.ParquetFile('/tmp//test.parquet')
...: print(f.metadata.row_group(0).column(0).statistics)
<pyarrow._parquet.Statistics object at 0x110118c70>
has_min_max: True
min: 1
max: 5
null_count: 0
distinct_count: 0
num_values: 5
physical_type: INT32
logical_type: None
converted_type (legacy): NONE
In [4]:
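To repeat the same check across every row group and column rather than only the first, something like the following could be used. It is a rough sketch: the helper name `min_max_by_column` is made up for illustration, and it only assumes that the object passed in behaves like the `metadata` of a `pq.ParquetFile` (the attributes used are the ones pyarrow exposes on its file, row group, and column chunk metadata).

```python
def min_max_by_column(metadata):
    """Collect (min, max) for every column chunk that carries min/max statistics.

    `metadata` is expected to behave like pyarrow.parquet.FileMetaData.
    Returns a dict keyed by (row_group_index, column_path).
    """
    found = {}
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            stats = chunk.statistics  # None when no statistics were written
            if stats is not None and stats.has_min_max:
                found[(rg_index, chunk.path_in_schema)] = (stats.min, stats.max)
    return found
```

With pyarrow installed, `min_max_by_column(pq.ParquetFile('/tmp//test.parquet').metadata)` should list an entry for each column whose min/max values a reader can see; columns missing from the result are the ones to investigate.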
I've updated the docs to hopefully be a bit clearer, PTAL. I dropped the mentions of parquet v1, as I'm not entirely sure when the deprecation actually happened; the parquet changelog isn't particularly clear on this.
Benchmark runs are scheduled for baseline = 5a7ec46 and contender = 95cf030. 95cf030 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Thanks @tustvold
Which issue does this PR close?
Closes #3526
Closes #799
Rationale for this change
This improves compatibility with pyarrow, which doesn't understand min_value yet (see apache/arrow#13976).
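For readers unfamiliar with the distinction: the Parquet `Statistics` struct has both legacy `min`/`max` fields and the newer `min_value`/`max_value` fields. The sketch below illustrates one way a writer could stay readable by older consumers. It is not the code in this PR; `StatisticsFields` and `encode_statistics` are hypothetical names, and the rule of filling the legacy fields only for columns with a signed sort order is an assumption about what "backwards compatible" means here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatisticsFields:
    """Hypothetical stand-in for the four min/max slots of Parquet Statistics."""
    min: Optional[bytes] = None        # legacy field, read by older consumers
    max: Optional[bytes] = None        # legacy field, read by older consumers
    min_value: Optional[bytes] = None  # newer field
    max_value: Optional[bytes] = None  # newer field


def encode_statistics(min_bytes: bytes, max_bytes: bytes,
                      signed_sort_order: bool) -> StatisticsFields:
    """Always fill the newer fields; mirror into the legacy ones when safe.

    Assumption: the legacy fields are only meaningful when the column's values
    compare with a signed ordering, so they are left unset otherwise.
    """
    fields = StatisticsFields(min_value=min_bytes, max_value=max_bytes)
    if signed_sort_order:
        fields.min = min_bytes
        fields.max = max_bytes
    return fields
```

Under these assumptions, an INT32 column with values 1 to 5 would get identical bytes in both pairs of fields, so a reader that only looks at `min`/`max` still sees the statistics.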
What changes are included in this PR?
Are there any user-facing changes?