v0.2.2
github-actions released this 14 Nov 19:47
Changes
- [CHORE] Edit 'make-hooks' command to install pre-commit script @colin-ho (#1602)
- [CHORE] Improve error messages when calling aggregation methods on dataframe without input columns @colin-ho (#1587)
✨ New Features
- [FEAT] Add translation of IOConfig to PyArrow filesystem arguments @jaychia (#1592)
- [FEAT] [Scan Operator] Refactor planning and execution code to use shared `Pushdowns` struct. @clarkzinzow (#1595)
- [FEAT] [Scan Operator] Add `ChunkSpec` for specifying format-specific per-file row subset selection for `ScanTask`s. @clarkzinzow (#1590)
- [FEAT] [Scan Operator] Integrate `size_bytes` with `ScanOperator`s @clarkzinzow (#1586)
- [FEAT] [Scan Operator] Add Python I/O support (+ JSON) to `MicroPartition` reads @clarkzinzow (#1578)
- [FEAT][ScanOperator 1/3] Add MVP e2e `ScanOperator` integration. @clarkzinzow (#1559)
🚀 Performance Improvements
- [PERF][REVERT] Reverts: use pyarrow table for pickling rather than ChunkedArray (#1488) @jaychia (#1605)
- [PERF] Speed Up MicroPartition Ops when we know the result is empty @samster25 (#1604)
👾 Bug Fixes
- [BUG] clean up ray scheduler threads after computing partial results @samster25 (#1597)
- [BUG] Update requirements for typing_extensions @jaychia (#1593)
- [BUG] Fix Deadlock with ScanOperators in `to_physical_plan_scheduler` and show iostats for glob and from_scan_task @samster25 (#1581)
- [BUG] add allow threads for io pool operations @samster25 (#1580)
🧰 Maintenance
- [CHORE] delete unused wheel tools @samster25 (#1603)
- [CHORE] add IOStats to all micropartition ops @samster25 (#1584)
- [CHORE] Use DAFT_MICROPARTITIONS as shared feature flag for data catalog support @jaychia (#1579)
- [CHORE] Convert GlobScanOperator to perform streaming into result and take a list of glob paths @jaychia (#1577)
⬆️ Dependencies
- Bump numpy from 1.25.2 to 1.26.2 @dependabot (#1596)