Releases: mars-project/mars
v0.9.0b2
This is the release notes of v0.9.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- Metric
  - Add metric framework (#2742, thanks @zhongchun!)
  - Add prometheus metric implementation (#2752, thanks @zhongchun!)
  - Add ray metrics implementation (#2749, thanks @zhongchun!)
  - Add common metrics (#2760, thanks @zhongchun!)
Enhancements
- Simplify rechunk implementation (#2745)
- Stop inferring outputs when args provided (#2759)
- Add broadcast merge support for DataFrame (#2772)
- Remove deprecation warnings when importing mars.tensor (#2788)
- Optimize in-process actor calls (#2763)
- [ray] New ray actor creation model (#2783)
Bug fixes
- Fix duplicate dec object ref (#2741, thanks @Catch-Bull!)
- Fix long exception of asyncio.gather (#2748)
- Fix NameError: name 'pq' is not defined if pyarrow is not installed (#2751)
- Fix profiling band_subtasks and most_calls are empty if the slow duration is large (#2755)
- Fix the wrong result of df.merge (#2774)
- Fix DataFrame initializer when Mars object exists in list (#2770)
- [ray] Support Ray client mode (#2773)
Tests
- Increase test stability for command-line tests (#2779)
v0.8.3
This is the release notes of v0.8.3. See here for the complete list of solved issues and merged PRs.
Enhancements
- Stop inferring outputs when args provided (#2761)
- Remove deprecation warnings when importing mars.tensor (#2790)
- [Ray] New ray actor creation model (#2794)
Bug fixes
- Fix long exception of asyncio.gather (#2753)
- Fix wrong result of df.merge (#2777)
- Fix DataFrame initializer when Mars object exists in list (#2778)
- Fix duplicate dec object ref (#2789, thanks @Catch-Bull!)
- [Ray] Support Ray client mode (#2796)
Tests
- Increase test stability for command-line tests (#2786)
v0.9.0b1
This is the release notes of v0.9.0b1. See here for the complete list of solved issues and merged PRs.
Highlights
- A new coloring-based fusion algorithm is introduced in #2719. Performance is expected to improve significantly over previous releases, but unexpected behavior may still occur; please reach out to us if you run into any.
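The note above does not describe the algorithm itself. As a rough intuition only, the following sketch shows one simple way coloring can drive fusion in a task graph: a node inherits the color of its predecessors when they all share one color, and otherwise starts a new color; nodes with the same color form one fused group. The function name `color_and_fuse` and the graph representation are hypothetical, and this is not a reproduction of the implementation in #2719.

```python
from collections import defaultdict
from itertools import count


def color_and_fuse(predecessors, order):
    """Group DAG nodes by color (illustrative only, not Mars' algorithm).

    predecessors: dict mapping each node to a list of its predecessor nodes.
    order: the nodes in a topological order.
    """
    next_color = count()
    colors = {}
    for node in order:
        pred_colors = {colors[p] for p in predecessors[node]}
        if len(pred_colors) == 1:
            # All inputs come from one group, so join that group.
            colors[node] = pred_colors.pop()
        else:
            # No inputs, or inputs from several groups: start a new group.
            colors[node] = next(next_color)

    groups = defaultdict(list)
    for node in order:
        groups[colors[node]].append(node)
    return list(groups.values())


# a -> b, and (b, c) -> d -> e
graph = {"a": [], "b": ["a"], "c": [], "d": ["b", "c"], "e": ["d"]}
fused = color_and_fuse(graph, ["a", "b", "c", "d", "e"])
```

In this toy graph, `a` and `b` end up in one group, `c` stays alone, and `d` and `e` form a third group, because `d` mixes inputs of two different colors.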
New Features
- DataFrame
  - Support inclusive argument for pd.date_range (#2718)
- Others
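For readers unfamiliar with the inclusive argument listed above: in pandas it accepts "both", "neither", "left" or "right" and controls whether the start and end points are kept. The sketch below imitates those semantics for daily ranges using only the standard library. The helper `daily_range` is hypothetical and is not part of Mars or pandas; it assumes the Mars support follows the pandas behavior.

```python
from datetime import date, timedelta


def daily_range(start, end, inclusive="both"):
    """Return the days from start to end, honoring a pandas-style `inclusive`."""
    if inclusive not in {"both", "neither", "left", "right"}:
        raise ValueError(f"unsupported inclusive value: {inclusive!r}")
    keep_start = inclusive in {"both", "left"}
    keep_end = inclusive in {"both", "right"}
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # Drop a boundary day only when its side is not inclusive.
    return [
        d for d in days
        if (d != start or keep_start) and (d != end or keep_end)
    ]


left_closed = daily_range(date(2022, 1, 1), date(2022, 1, 4), inclusive="left")
```

Here `left_closed` keeps January 1 through January 3 and drops the end point.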
Enhancements
- Refine failure recovery log and exception (#2633)
- Optimize eval-setitem expressions as single eval expressions (#2695)
- Auto merge small chunks when df.groupby().apply(func) is doing aggregation (#2708)
- Optimize GroupBy's aggregation algorithm (#2696)
- [Ray] refine ray dataset integration (#2705)
- Improve profiling (#2629)
- Add support for reading partitioned parquet for fastparquet (#2724)
- Introduce coloring based fusion algorithm (#2719)
- Fix duplicate exceptions in log (#2723)
Bug fixes
- Fix sort_values for empty DataFrame or Series (#2681)
- Eliminate redundant eval node in optimization (#2683)
- Avoid iterative tiling for df.loc[:, fields] (#2685)
- [hotfix][ray] fix ray dataset compatibility (#2693)
- Fix use_arrow_dtype parameter for read_parquet (#2698)
- Fix error on dependent DataFrame setitems (#2701)
- Fix estimate_pandas_size for pd.MultiIndex (#2707)
- Import vineyard.data.pickle to make members available (#2714)
- Fix shuffle when ndim of input tensors are different (#2727)
Documentation
v0.8.2
This is the release notes of v0.8.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Support inclusive argument for pd.date_range (#2721)
Enhancements
- Optimize eval-setitem expressions as single eval expressions (#2699)
- [Ray] Refine raydataset integration (#2712)
- [Ray] refine ray dataset integration (#2726)
- Add support for reading partitioned parquet for fastparquet (#2729)
- Fix duplicate exceptions in log (#2736)
Bug fixes
- Fix sort_values for empty DataFrame or Series (#2686)
- Eliminate redundant eval node in optimization (#2688)
- Avoid iterative tiling for df.loc[:, fields] (#2689)
- Fix use_arrow_dtype parameter for read_parquet (#2702)
- Fix error on dependent DataFrame setitems (#2703)
- Fix estimate_pandas_size on pd.MultiIndex (#2710)
- Import vineyard.data.pickle to make members available (#2716)
- Fix shuffle when ndim of input tensors are different (#2728)
v0.9.0a2
This is the release notes of v0.9.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Others
  - Add wheel support for Python 3.10 and drop Python 3.6 (#2622)
Enhancements
- Add merging small files support for md.{read_parquet, read_csv} (#2661)
- Add support for HTTP request rewriter (#2664)
- Optimize filtering DataFrame with its fields (#2571)
- Add pyproject.toml to config build packages (#2674)
Bug fixes
- Fix backward compatibility for pandas 1.1 and 1.2 (#2624)
- Fix backward compatibility for pandas 1.0 (#2628)
- Fix NotImplementedError for mo.batch when single call not implemented (#2635)
- Fix IndexError raised by aggregation of DataFrameGroupBy (#2641)
- Fix compatibility for pandas 1.4 (#2650)
- Fix df.loc[:] to make sure same index_value key generated (#2643)
- Fix aggregation with comparison (#2647)
- Fix the wrong index_value generated by df.loc[:] (#2658)
- Fix optimizing DataFrame query with timestamp in conditions (#2671)
- Fix as_index when calling agg on SeriesGroupBy (#2676)
v0.8.1
This is the release notes of v0.8.1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Enhancements
- Add support for HTTP request rewriter (#2665)
- Add merging small files support for md.{read_parquet, read_csv} (#2669)
- Optimize filtering DataFrame with its fields (#2668)
Bug fixes
- Allow specifying multiple supervisor processes (#2625)
- Fix backward compatibility for pandas 1.0 (#2630)
- Fix NotImplementedError for mo.batch when single call not implemented (#2637)
- Fix compatibility for pandas 1.4 (#2652)
- Fix IndexError raised by aggregation of DataFrameGroupBy (#2653)
- Fix df.loc[:] to make sure same index_value key generated (#2654)
- Fix aggregation with comparison (#2655)
- Fix the wrong index_value generated by df.loc[:] (#2666)
- Fix as_index when calling groupby-agg (#2678)
v0.9.0a1
This is the release notes of v0.9.0a1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
  - Implements mt.bincount (#2548)
- DataFrame
  - Support Series.median() (#2566, thanks @perfumescent!)
- Learn
  - Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2554)
- Services
  - Add basic profiling support for supervisor (#2586)
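As a reminder of what a bincount operation computes, here is a minimal pure-Python sketch. It assumes mt.bincount follows the semantics of NumPy's bincount (counting occurrences of each non-negative integer); the helper below is hypothetical and only illustrates the result, not how Mars computes it across chunks.

```python
def bincount(values, minlength=0):
    """Count occurrences of each non-negative integer in `values`."""
    if any(v < 0 for v in values):
        raise ValueError("bincount only accepts non-negative integers")
    # The output has one slot per integer from 0 up to the largest value seen.
    size = max(max(values, default=-1) + 1, minlength)
    counts = [0] * size
    for v in values:
        counts[v] += 1
    return counts


counts = bincount([0, 1, 1, 3, 2, 1, 7])
```

The value 1 appears three times in the input, so `counts[1]` is 3, and the slots for 4, 5 and 6 stay at zero.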
Enhancements
- Add app_queue in new_cluster (#2550, thanks @xxxxsk!)
- Implement web API of get_infos (#2558)
- Reduce time cost of cpu_percent() calls (#2567)
- Reduce estimation time cost (#2577)
- [ray] refine mars on ray usability (#2580)
- [ray] Refine raydataset integration (#2579)
- Optimize tileable graph construction (#2583)
- Stop calling user funcs when dtypes is specified (#2587)
- Supports adding Mars extensions via setup entrypoints (#2589)
- Skip details of shuffled chunks in meta (#2600)
- Reduce the time cost of fetching tileable data (#2594)
- Use batched request to apply for slots (#2601)
- Reduce RPC cost of oscar by removing unnecessary tasks (#2597)
Bug fixes
- Fix index series.apply when result index unchanged (#2557)
- Stop using asdict to handle dataclasses (#2561)
- Fix tests under cudf 21.10 (#2608)
- Fix DataFrame getitem when duplicate columns exist (#2581)
- Upgrade required version of vineyard (#2588)
- Fix progress always being 0 or 100% (#2591)
- Make Proxima work with latest Mars (#2599, thanks @yuyiming!)
- Fix None dtype for some unary tensor functions (#2603)
- Fix duplicate decref of subtask input chunk (#2611, thanks @Catch-Bull!)
Documentation
- Add a document about how to implement a Mars operand (#2562)
v0.8.0
This is the release notes of v0.8.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the differences from v0.8.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases: alpha1, alpha2, alpha3, beta1, beta2 and rc1.
New Features
- Tensor
  - Implements mt.bincount (#2552)
- DataFrame
  - Support Series.median (#2570, thanks @perfumescent!)
- Learn
  - Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2568)
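To make the multilabel confusion matrix concrete, the sketch below computes one 2x2 matrix per label from binary indicator rows, laid out as [[TN, FP], [FN, TP]]. It assumes the Mars function follows the scikit-learn convention of the same name; the helper `multilabel_confusion` is hypothetical and written in plain Python for illustration.

```python
def multilabel_confusion(y_true, y_pred):
    """Return one [[TN, FP], [FN, TP]] matrix per label.

    y_true, y_pred: lists of rows, each row a list of 0/1 flags per label.
    """
    n_labels = len(y_true[0])
    matrices = [[[0, 0], [0, 0]] for _ in range(n_labels)]
    for true_row, pred_row in zip(y_true, y_pred):
        for label, (t, p) in enumerate(zip(true_row, pred_row)):
            # Row index is the true flag, column index is the predicted flag.
            matrices[label][t][p] += 1
    return matrices


y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
per_label = multilabel_confusion(y_true, y_pred)
```

The first two labels are predicted perfectly, while the third has one false negative and one false positive. Derivative metrics such as per-label precision or recall can be read off these four counts.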
Enhancements
- Implement web API of get_infos (#2564)
- Reduce time cost of cpu_percent() calls (#2572)
- Stop calling user funcs when dtypes is specified (#2596)
- Supports adding Mars extensions via setup entrypoints (#2598)
- [Ray] Refine mars on ray usability (#2606)
- Reduce estimation time cost (#2607)
- Skip details of shuffled chunks in meta (#2609)
- Reduce the time cost of fetching tileable data (#2616)
- Reduce RPC cost of oscar by removing unnecessary tasks (#2613)
- Use batched request to apply for slots (#2615)
Bug fixes
- Fix index series.apply when result index unchanged (#2563)
- Fix DataFrame getitem when duplicate columns exist (#2582)
- Upgrade required version of vineyard (#2593)
- Fix progress always being 0 or 100% (#2595)
- Fix None dtype for some unary tensor functions (#2604)
- Make Proxima work with latest Mars (#2605, thanks @yuyiming!)
- Fix tests for cudf 21.10 (#2608)
- Fix duplicate decref of subtask input chunk (#2614, thanks @Catch-Bull!)
v0.8.0rc1
This is the release notes of v0.8.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- DataFrame
- Learn
- Services
  - Add web API for scheduling (#2533)
- Web
  - Display tileable properties on web (#2525, thanks @RandomY-2!)
- Others
  - Support mutable tensor on oscar (#2432, thanks @Coco58323!)
  - Add experimental support for CUDA under WSL for Windows 11 (#2538)
Enhancements
Bug fixes
- Fix output of df.groupby(as_index=False).size() (#2507)
- [Ray] Fix web serialize lambda (#2512)
- Fix reduction result on empty series (#2520)
- Fix DataFrame.loc when df is empty (#2524)
- Fix df.loc when providing empty list (#2528)
Documentation
- Add doc for reading csv in oss (#2514, thanks @Catch-Bull!)
v0.7.5
This is the release notes of v0.7.5. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- DataFrame
- Learn
- Services
  - Add web API for scheduling (#2535)
- Web
  - Display tileable properties on web (#2539, thanks @RandomY-2!)
- Others
  - Add experimental support for CUDA under WSL for Windows 11 (#2543)
Enhancements
- Reduce indentation of frontend code (#2541)
Bug fixes
- Fix output of df.groupby(as_index=False).size() (#2508)
- Fix reduction result on empty series (#2522)
- Fix df.loc when df is empty (#2526)
- [Ray] Fix serializing lambdas in web (#2529)
- Fix df.loc when providing empty list (#2532)
Documentation
- Add doc for reading csv in oss (#2530, thanks @Catch-Bull!)