Python Polars 1.2.0

@github-actions github-actions released this 16 Jul 16:14
38321a5

🚀 Performance improvements

  • Fix pathological perf issue in window-order-by (#17650)
  • Cache path resolving of scan functions (#17616)
  • Add ArrayChunks to optimize codegen of BatchDecoder (#17632)
  • Rechunk before we go into grouped gathers (#17623)
  • Cache schema resolve back to DSL (#17610)
  • Add fastpath for when rounding by single constant durations (#17580)
  • Improve parallelism in writing hive parquet (#17512)
  • Support datetime in predicate during hive partition pruning (#17545)
  • Batch nested embed parquet decoding (#17549)
  • Batch nested Parquet decoding (#17542)
  • Collect Parquet dictionary binary as view (#17475)

✨ Enhancements

  • Hugging Face path expansion (#17665)
  • Add DSL validation for cloud eligible check (#17287)
  • Raise informative error message if non-IntoExpr is passed by name in *Frame.group_by (#17654)
  • Add infer_schema parameter to read_csv / scan_csv (#17617)
  • Change API for writing partitioned Parquet to reduce code duplication (#17586)
  • Cache schema resolve back to DSL (#17610)
  • Expose returns_scalar to map_elements (#17613)
  • Add option to include file path for Parquet, IPC, CSV scans (#17563)
  • Support describe on decimal (#15092)
  • Support datetime in predicate during hive partition pruning (#17545)
  • Raise more informative error message for directories containing files with mixed extensions (#17480)
  • Exclude empty files from directory/glob expansion (#17478)
  • Support use of SQLAlchemy "Connectable" in write_database (#17470)

🐞 Bug fixes

  • Support duplicate expression names when calling ufuncs (#17641)
  • Interpret %y consistently with Chrono in to_date/to_datetime/strptime (#17661)
  • Fix explode invalid check (#17651)
  • Raise for overlapping index/column names in pandas dataframes post string coercion (#17628)
  • Expand brackets in async glob expansion (#17630)
  • Fix row index disappearing after projection pushdown in NDJSON (#17631)
  • Fix struct -> enum is_in (#17622)
  • Don't needlessly unwrap in pivot_schema (#17611)
  • Reject literal input in sort_by_exprs() (#17606)
  • Don't enforce row order in join test results where not guaranteed (#17596)
  • Bitmap collect into safety (#17588)
  • Make schema picklable (#17524)
  • Handle current position of file objects (#17543)
  • Set O_CLOEXEC on duplicated file descriptor (#17537)
  • Fix dt.truncate sometimes returning incorrect results for pre-1970 datetimes (#17582)
  • Defer path expansion until collect in file scan methods (#17532)
  • Fix retries parameter in scan functions not taking effect when it was set to 0 (#17564)
  • Don't unwrap send attempt to oneshot channel (#17566)
  • Fix scanning from HTTP cloud paths (#17571)
  • Properly implement struct (#17522)
  • Add "right" to LazyFrame.join docstring (#17529)
  • Fix predicate pushdown for .list.(get|gather) (#17511)
  • Make sure scan_ipc does not go through fsspec (#17495)
  • Turn panic into error when serializing Object types (#17353)
  • Fix struct expansion and raise on exclude (#17489)
  • Normalize path in sink_csv (#17476)

📖 Documentation

  • Update plot docs to refer to docstrings (#17504)
  • Rename str.lengths to str.len_bytes in description text (#11577) (#17626)
  • Create example for polars.Expr.bin.decode (#17508)
  • Add right join in the user guide (#17608)
  • Adjust rendering of links in read_database_uri docstring (#17536)
  • Update SQL examples in README (#17568)
  • Fixup "deprecated" directive for DataFrame.melt and LazyFrame.melt (#17530)
  • Add write_parquet_partitioned (#17488)
  • Add example for writing hive partitioned parquet to user guide (#17483)
  • Fix typo in Getting Started section of user guide (#17465)

🛠️ Other improvements

  • Add DSL validation for cloud eligible check (#17287)
  • Add ArrayChunks to optimize codegen of BatchDecoder (#17632)
  • Move path logic from utils to path_utils in polars-io (#17635)
  • Fix struct gather (#17621)
  • Back to StructChunked name (#17609)
  • Remove unused with_column method of PyLazyFrame (#17607)
  • Re-enable struct related tests (#17597)
  • Completely redo structure of Parquet decoder (#17589)
  • Fix struct outer validity, fmt, is_in, cast, and cmp (#17590)
  • Add/fix version-gating in some SQLAlchemy and Pandas tests (#17538)
  • Add style accessor to DataFrame (#17502)
  • Remove unused is_supported_cloud util (#17493)

Thank you to all our contributors for making this release possible!
@Julian-J-S, @MarcoGorelli, @alexander-beedie, @anergictcell, @arnabanimesh, @brandon-b-miller, @cmdlineluser, @coastalwhite, @deanm0000, @eitsupi, @flisky, @henryharbeck, @itamarst, @jonaylor89, @moritzwilksch, @nameexhaustion, @orlp, @phi-friday, @r-brink, @rcorty, @ritchie46, @ruihe774, @stinodego, @tylerriccio33 and @wence-