v0.2.2

github-actions released this 14 Nov 19:47
· 1072 commits to refs/heads/main since this release
6abc006

Changes

  • [CHORE] Edit 'make-hooks' command to install pre-commit script @colin-ho (#1602)
  • [CHORE] Improve error messages when calling aggregation methods on dataframe without input columns @colin-ho (#1587)

✨ New Features

  • [FEAT] Add translation of IOConfig to PyArrow filesystem arguments @jaychia (#1592)
  • [FEAT] [Scan Operator] Refactor planning and execution code to use shared Pushdowns struct. @clarkzinzow (#1595)
  • [FEAT] [Scan Operator] Add ChunkSpec for specifying format-specific per-file row subset selection for ScanTasks. @clarkzinzow (#1590)
  • [FEAT] [Scan Operator] Integrate size_bytes with ScanOperators @clarkzinzow (#1586)
  • [FEAT] [Scan Operator] Add Python I/O support (+ JSON) to MicroPartition reads @clarkzinzow (#1578)
  • [FEAT][ScanOperator 1/3] Add MVP e2e ScanOperator integration. @clarkzinzow (#1559)

🚀 Performance Improvements

  • [PERF][REVERT] Reverts: use pyarrow table for pickling rather than ChunkedArray (#1488) @jaychia (#1605)
  • [PERF] Speed Up MicroPartition Ops when we know the result is empty @samster25 (#1604)

👾 Bug Fixes

  • [BUG] clean up ray scheduler threads after computing partial results @samster25 (#1597)
  • [BUG] Update requirements for typing_extensions @jaychia (#1593)
  • [BUG] Fix Deadlock with ScanOperators in to_physical_plan_scheduler and show iostats for glob and from_scan_task @samster25 (#1581)
  • [BUG] add allow threads for io pool operations @samster25 (#1580)

🧰 Maintenance

  • [CHORE] delete unused wheel tools @samster25 (#1603)
  • [CHORE] add IOStats to all micropartition ops @samster25 (#1584)
  • [CHORE] Use DAFT_MICROPARTITIONS as shared feature flag for data catalog support @jaychia (#1579)
  • [CHORE] Convert GlobScanOperator to perform streaming into result and take a list of glob paths @jaychia (#1577)

⬆️ Dependencies