feat(python)!: Remove serde functionality from pl.read_json and DataFrame.write_json #16550

Merged: stinodego merged 5 commits into main from breaking-df-serde on Jun 4, 2024


stinodego (Member) commented May 28, 2024

Closes #14526

Changes

  • pl.read_json no longer supports reading JSON files produced by DataFrame.serialize. Users should use pl.DataFrame.deserialize instead.
  • DataFrame.write_json now only writes row-oriented JSON. The parameters row_oriented and pretty have been removed. Users should use DataFrame.serialize to serialize a DataFrame.

Example - write_json

Before:

>>> import polars as pl
>>> df = pl.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
>>> df.write_json()
'{"columns":[{"name":"a","datatype":"Int64","bit_settings":"","values":[1,2]},{"name":"b","datatype":"Float64","bit_settings":"","values":[3.0,4.0]}]}'

After:

>>> df.write_json()  # Same output as `df.write_json(row_oriented=True)` gave previously
'[{"a":1,"b":3.0},{"a":2,"b":4.0}]'
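To make the relationship between the two outputs concrete, here is a minimal standard-library sketch that turns the column-oriented payload from the "Before" example into the row-oriented JSON from the "After" example. `columns_to_rows` is a hypothetical helper, not part of Polars, and it assumes the payload uses the `columns` / `name` / `values` layout shown above; it ignores `datatype` and `bit_settings`, so type information is lost.

```python
import json


def columns_to_rows(serialized: str) -> str:
    """Convert a column-oriented payload into row-oriented JSON.

    Hypothetical helper for illustration only; assumes the layout
    {"columns": [{"name": ..., "values": [...]}, ...]} from the example.
    """
    payload = json.loads(serialized)
    columns = payload["columns"]
    names = [col["name"] for col in columns]
    # zip(*) walks the column value lists in lockstep, yielding one tuple per row.
    rows = [
        dict(zip(names, row_values))
        for row_values in zip(*(col["values"] for col in columns))
    ]
    # Compact separators match the whitespace-free output shown in the example.
    return json.dumps(rows, separators=(",", ":"))


old_output = (
    '{"columns":[{"name":"a","datatype":"Int64","bit_settings":"","values":[1,2]},'
    '{"name":"b","datatype":"Float64","bit_settings":"","values":[3.0,4.0]}]}'
)
new_output = columns_to_rows(old_output)
print(new_output)
```

This only illustrates how the two layouts relate. To keep dtypes intact, use `DataFrame.serialize` and `pl.DataFrame.deserialize` as described in this PR.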

Example - read_json

Before:

>>> import io
>>> df_ser = '{"columns":[{"name":"a","datatype":"Int64","bit_settings":"","values":[1,2]},{"name":"b","datatype":"Float64","bit_settings":"","values":[3.0,4.0]}]}'
>>> pl.read_json(io.StringIO(df_ser))
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
│ 2   ┆ 4.0 │
└─────┴─────┘

After:

>>> pl.read_json(io.StringIO(df_ser))  # Format no longer supported: data is treated as a single row
shape: (1, 1)
┌─────────────────────────────────┐
│ columns                         │
│ ---                             │
│ list[struct[4]]                 │
╞═════════════════════════════════╡
│ [{"a","Int64","",[1.0, 2.0]}, … │
└─────────────────────────────────┘

Use instead:

>>> pl.DataFrame.deserialize(io.StringIO(df_ser))
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ f64 │
╞═════╪═════╡
│ 1   ┆ 3.0 │
│ 2   ┆ 4.0 │
└─────┴─────┘
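Because `pl.read_json` now reads an old serialized payload as a single row instead of raising, code that handles files of unknown origin may want to check the layout first. The following is a heuristic sketch using only the standard library. `is_serialized_frame` is a hypothetical name, and the key set it checks is an assumption taken from the example payload in this PR, not a documented format guarantee.

```python
import json


def is_serialized_frame(text: str) -> bool:
    """Guess whether `text` uses the old column-oriented serialized layout.

    Hypothetical heuristic: a top-level object with a "columns" list whose
    entries each carry "name", "datatype" and "values" keys.
    """
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        # Row-oriented JSON has a top-level list of records instead.
        return False
    columns = payload.get("columns")
    if not isinstance(columns, list):
        return False
    required = {"name", "datatype", "values"}
    return all(isinstance(col, dict) and required <= col.keys() for col in columns)


df_ser = (
    '{"columns":[{"name":"a","datatype":"Int64","bit_settings":"","values":[1,2]},'
    '{"name":"b","datatype":"Float64","bit_settings":"","values":[3.0,4.0]}]}'
)
row_json = '[{"a":1,"b":3.0},{"a":2,"b":4.0}]'

print(is_serialized_frame(df_ser))    # column-oriented serialized payload
print(is_serialized_frame(row_json))  # row-oriented payload
```

A caller could route payloads for which this returns `True` to `pl.DataFrame.deserialize` and everything else to `pl.read_json`. A genuine single-record JSON object with a matching `columns` field would be misclassified, so treat the result as a hint rather than a guarantee.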

@github-actions github-actions bot added the labels breaking (Change that breaks backwards compatibility), enhancement (New feature or an improvement of an existing feature), and python (Related to Python Polars) on May 28, 2024

codecov bot commented May 28, 2024

Codecov Report

Attention: Patch coverage is 89.47368%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.50%. Comparing base (7cfa80a) to head (cc8acd6).
Report is 4 commits behind head on main.

Files Patch % Lines
py-polars/polars/dataframe/frame.py 71.42% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #16550    +/-   ##
========================================
  Coverage   81.50%   81.50%            
========================================
  Files        1412     1412            
  Lines      185133   185254   +121     
  Branches     2987     2994     +7     
========================================
+ Hits       150885   150990   +105     
- Misses      33730    33748    +18     
+ Partials      518      516     -2     


@stinodego stinodego marked this pull request as ready for review May 28, 2024 13:25
@stinodego stinodego changed the title from "feat(python)!: Update pl.read_json and DataFrame.write_json functionality" to "feat(python)!: Remove serde functionality from pl.read_json and DataFrame.write_json" May 28, 2024
@stinodego stinodego merged commit b61d4e6 into main Jun 4, 2024
23 checks passed
@stinodego stinodego deleted the breaking-df-serde branch June 4, 2024 07:06
Labels
breaking (Change that breaks backwards compatibility), enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars)
Development

Successfully merging this pull request may close these issues.

Split off DataFrame serde functionality from read_json/DataFrame.write_json
1 participant