Python Polars 1.18.0
🏆 Highlights
- Add new `Int128Type` (#20232)
🚀 Performance improvements
- Order observability optimizations (#20396)
- Purge ChunkedArray Metadata (#20371)
- Explicit transpose in new-streaming equi-join finalize (#20363)
- Cache dtype on ExprIR (#20331)
- Lower overhead for `BytecodeParser` on introspection of incompatible UDFs (#20280)
✨ Enhancements
- Always resolve dynamic types in schema (#20406)
- Support loading data from multiple Excel/ODS workbooks (#20404)
- Add "drop_empty_cols" parameter for `read_excel` and `read_ods` (#20430)
- Order observability optimizations (#20396)
- Add FirstArgLossless supertype (#20394)
- Add `dt.replace` (#19708)
- Polars build for Pyodide (#20383)
- Add Azure credential provider using `DefaultAzureCredential()` (#20384)
- Add env var to ignore file cache allocate error (#20356)
- Enable joins between compatible differing numeric key columns (#20332)
- Cache dtype on ExprIR (#20331)
- Serialize DataFrame/Series using IPC in serde (#20266)
- Improve error message on SchemaError (#20326)
- Use better error messages when opening files (#20307)
- Add 'skip_lines' for CSV (#20301)
- Allow subtraction of time dtype columns (#20300)
- Add `bin.reinterpret` (#20263)
- Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
- Streamline creation of empty frame from `Schema` (#20267)
- Add `cat.len_chars` and `cat.len_bytes` (#20211)
- Expose AexprArena (#20230)
🐞 Bug fixes
- Fix nullable object in map_elements (#20422)
- Properly handle `to_physical_repr` of nested types (#20413)
- Properly raise UDF errors (#20417)
- Workaround for `mmap` crash under Emscripten (#20418)
- Fix using `new_columns` in `scan_csv` with compressed file (#20412)
- Fix return type of `Series.dt.add_business_days` (#20402)
- Fix decimal series dispatch (#20400)
- Fix decimal arithmetic schema (#20398)
- Raise on categorical search_sorted (#20395)
- Fix plotting f-strings and docstrings (#20399)
- Don't try to load non-existent List/FSL statistics (#20388)
- Propagate nulls for float methods on all numeric types (#20386)
- Add env var to ignore file cache allocate error (#20356)
- Flip order on right join (#20358)
- Correctly parse special float values in `from_repr` (#20351)
- Fix incorrect object store caching for ADLS URI (#20357)
- Use the same encoding for nullable as non-nullable arrays (#20323)
- Improve error message on SchemaError (#20326)
- Boolean optional slice pushdown (#20315)
- Properly handle `from_physical` for List/Array (#20311)
- Ignore quotes in csv comments (#20306)
- Ensure pl.datetime returns empty column when input columns are empty (#20278)
- Ensure output height does not change on lazy projection pushdown with aggregations (#20223)
- Fix error writing on Windows to locations outside of C drive (#20245)
- Fix incorrect comparison in some cases with filtered list/array columns (#20243)
- Ensure height is maintained in SQL `SELECT 1 FROM` (#20241)
- Properly account for updated Categorical in .unique() kernel (#20235)
📖 Documentation
- Improve docstring clarity (#20416)
- Update GPU engine installation instructions to remove `--extra-index-url` from CUDA 12 packages (#20381)
- Remove Plugins overview page without information (#20348)
- Small fixes/clarifications in user guide (#20335)
- Improve docs about NaN (#20310)
- Fix substr function param definition (#19054)
- Include parquet options in BigQuery I/O write sample (#20292)
- Fix typo in `fork` warning (#20258)
📦 Build system
- Add `project.dynamic = ["version"]` to pyproject.toml (#20345)
- Update `pyo3` and `numpy` crates to version `0.23` (#20111)
- Build wheels for ARM Windows in Python release workflow (#20247)
🛠️ Other improvements
- Enable masked out list, struct and array elements in parametric tests (#20365)
- Move hive partitioning/multi-file handling outside of readers (#20203)
- Purge ChunkedArray Metadata (#20371)
- Correcting misspelled return value and unifying regional spelling (#20375)
- Add test for `select(len())` (#20343)
- Make parametric tests include `pl.List` and `pl.Array` by default (#20319)
- Use Column in Row Encoding (#20312)
- Don't warn on fork hook (#20309)
- Don't deconstruct `CsvParseOptions` (#20302)
- Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
- Prepare test suite for Python 3.13 support (#20297)
- Add `FunctionCastOptions` and conservative IR-level cast type-checking (#20286)
- Add more descriptive error message for failure of vstack/extend (#20299)
- Clean up some remnants of Python 3.8 support (#20293)
- Add new `Int128Type` (#20232)
- Add test for BytesIO overwritten after scan (#20240)
- Expose AexprArena (#20230)
Thank you to all our contributors for making this release possible!
@Jesse-Bakker, @Terrigible, @ZemanOndrej, @alexander-beedie, @balbok0, @beckernick, @bschoenmaeckers, @coastalwhite, @georgestagg, @hamdanal, @haocheng6, @kszlim, @lukemanley, @mcrumiller, @nameexhaustion, @noexecstack, @orlp, @ptiza, @r-brink, @ritchie46, @rodrigogiraoserrao, @stijnherfst, @stinodego, @tswast and @zero-stroke