-
Notifications
You must be signed in to change notification settings - Fork 924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relax Arrow pin #16681
Relax Arrow pin #16681
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my local testing I found that the minimum required version might be 15.0.0 because __arrow_c_stream__
seems to be added in 15.0.0:
But it shows up in changelog of 16.0.0 only(and not 15.0.0):
However, if I install a 14.0.0 version of pyarrow, I still have __arrow_c_stream__
- It might have been possibly back-ported. is that the case? I'm not able to find history of this being back-ported.
These changes look good. But should be rather keep an upper bound pin on the pyarrow
version because of the same reasons we do for other libraries?
Locally verified cudf
& dask-cudf
work quite well for all of the following versions (barring the failures that you already covered with pytest skips in this PR):
- 14.0.0
- 15.0.0
- 15.0.1
- 16.0.0
- 16.1.0
- 17.0.0
The original capsule interface was added in apache/arrow#37797. apache/arrow#39455 looks to be specifically providing the stream capsule for chunked arrays. I believe that we use the |
Well, the answer is not quite as good as I had hoped. I was right that our current conversion function for ChunkedArray tries to use the |
Yup, sounds good to me. 👍 |
Cool. Added upper bounds as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to go! Thanks @vyasr !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, but marking this "approve" so you don't need to wait on another review from me after addressing that. @galipremsagar 's testing in #16681 (review) and the discussion after that makes me very confident in this change.
For the record in case it helps inspire future confidence, independently of Prem I also ran the test suite on all the Python versions that he listed with similar results (which is what revealed the skips that I added to the test suite). |
/merge |
Description
With this change, cudf users can install any version of pyarrow greater than 14. This is the minimum version supporting the C Data Interface, which is a requirement for us (it may be possible to relax in principle, but would require changes to the cudf/pylibcudf code). A few tests are skipped due to bugs or missing features in older versions of pyarrow.
Checklist