Python Polars 1.18.0

github-actions released this 24 Dec 08:51

🏆 Highlights

🚀 Performance improvements

  • Order observability optimizations (#20396)
  • Purge ChunkedArray Metadata (#20371)
  • Explicit transpose in new-streaming equi-join finalize (#20363)
  • Cache dtype on ExprIR (#20331)
  • Lower overhead for BytecodeParser on introspection of incompatible UDFs (#20280)

✨ Enhancements

  • Always resolve dynamic types in schema (#20406)
  • Support loading data from multiple Excel/ODS workbooks (#20404)
  • Add "drop_empty_cols" parameter for read_excel and read_ods (#20430)
  • Order observability optimizations (#20396)
  • Add FirstArgLossless supertype (#20394)
  • Add dt.replace (#19708)
  • Polars build for Pyodide (#20383)
  • Add Azure credential provider using DefaultAzureCredential() (#20384)
  • Add env var to ignore file cache allocate error (#20356)
  • Enable joins between compatible differing numeric key columns (#20332)
  • Cache dtype on ExprIR (#20331)
  • Serialize DataFrame/Series using IPC in serde (#20266)
  • Improve error message on SchemaError (#20326)
  • Use better error messages when opening files (#20307)
  • Add 'skip_lines' for CSV (#20301)
  • Allow subtraction of time dtype columns (#20300)
  • Add bin.reinterpret (#20263)
  • Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
  • Streamline creation of empty frame from Schema (#20267)
  • Add cat.len_chars and cat.len_bytes (#20211)
  • Expose AexprArena (#20230)

🐞 Bug fixes

  • Fix nullable object in map_elements (#20422)
  • Properly handle to_physical_repr of nested types (#20413)
  • Properly raise UDF errors (#20417)
  • Workaround for mmap crash under Emscripten (#20418)
  • Fix using new_columns in scan_csv with compressed file (#20412)
  • Fix return type of Series.dt.add_business_days (#20402)
  • Fix decimal series dispatch (#20400)
  • Fix decimal arithmetic schema (#20398)
  • Raise on categorical search_sorted (#20395)
  • Fix plotting f-strings and docstrings (#20399)
  • Don't try to load non-existent List/FSL statistics (#20388)
  • Propagate nulls for float methods on all numeric types (#20386)
  • Add env var to ignore file cache allocate error (#20356)
  • Flip order on right join (#20358)
  • Correctly parse special float values in from_repr (#20351)
  • Fix incorrect object store caching for ADLS URI (#20357)
  • Use the same encoding for nullable as non-nullable arrays (#20323)
  • Improve error message on SchemaError (#20326)
  • Boolean optional slice pushdown (#20315)
  • Properly handle from_physical for List/Array (#20311)
  • Ignore quotes in csv comments (#20306)
  • Ensure pl.datetime returns empty column when input columns are empty (#20278)
  • Ensure output height does not change on lazy projection pushdown with aggregations (#20223)
  • Fix error writing on Windows to locations outside of C drive (#20245)
  • Incorrect comparison in some cases with filtered list/array columns (#20243)
  • Ensure height is maintained in SQL SELECT 1 FROM (#20241)
  • Properly account for updated Categorical in .unique() kernel (#20235)

📖 Documentation

  • Improve docstring clarity (#20416)
  • Update GPU engine installation instructions to remove --extra-index-url from CUDA 12 packages (#20381)
  • Remove Plugins overview page without information (#20348)
  • Small fixes/clarifications in user guide (#20335)
  • Improve docs about NaN (#20310)
  • Fix substr function param definition (#19054)
  • Include parquet options in BigQuery I/O write sample (#20292)
  • Fix typo in fork warning (#20258)

📦 Build system

  • Add project.dynamic = ["version"] to pyproject.toml (#20345)
  • Update pyo3 and numpy crates to version 0.23 (#20111)
  • Build wheels for ARM Windows in Python release workflow (#20247)

🛠️ Other improvements

  • Enable masked out list, struct and array elements in parametric tests (#20365)
  • Move hive partitioning/multi-file handling outside of readers (#20203)
  • Purge ChunkedArray Metadata (#20371)
  • Correcting misspelled return value and unifying regional spelling (#20375)
  • Add test for select(len()) (#20343)
  • Make parametric tests include pl.List and pl.Array by default (#20319)
  • Use Column in Row Encoding (#20312)
  • Don't warn on fork hook (#20309)
  • Don't deconstruct CsvParseOptions (#20302)
  • Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
  • Prepare test suite for Python 3.13 support (#20297)
  • Add FunctionCastOptions and conservative IR-level cast type-checking (#20286)
  • Add more descriptive error message for failure of vstack/extend (#20299)
  • Clean up some remnants of Python 3.8 support (#20293)
  • Add new Int128Type (#20232)
  • Add test for BytesIO overwritten after scan (#20240)
  • Expose AexprArena (#20230)

Thank you to all our contributors for making this release possible!
@Jesse-Bakker, @Terrigible, @ZemanOndrej, @alexander-beedie, @balbok0, @beckernick, @bschoenmaeckers, @coastalwhite, @georgestagg, @hamdanal, @haocheng6, @kszlim, @lukemanley, @mcrumiller, @nameexhaustion, @noexecstack, @orlp, @ptiza, @r-brink, @ritchie46, @rodrigogiraoserrao, @stijnherfst, @stinodego, @tswast and @zero-stroke