GH-14932: [Python] Add python bindings for JSON streaming reader #45084
This looks mostly excellent @pan-x-c , thank you.
Here are a couple minor comments.
Also, could you add the new API to the docs in https://github.com/apache/arrow/blob/main/docs/source/python/api/formats.rst#json-files and perhaps mention it in https://github.com/apache/arrow/blob/main/docs/source/python/json.rst ?
I have fixed the above comments and some similar issues in the |
Thanks a lot for the update, one comment remaining below.
Thanks a lot for your contribution @pan-x-c ! This looks good to me now, I will merge if CI is green.
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 16c7f1a. There were 8 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Rationale for this change
Arrow C++ has a streaming JSON reader that is not exposed in the Python interface.
What changes are included in this PR?
This PR is based on #33761. It adds the open_json method to open a streaming reader for a JSON file.
Are these changes tested?
Yes
Are there any user-facing changes?
Yes. A new open_json method has been added to the Python interface, located at pyarrow.json.open_json, and its parameters are the same as those of pyarrow.json.read_json.