Releases: mars-project/mars

v0.9.0b2

08 Mar 07:29
9b3cc49
Pre-release

These are the release notes for v0.9.0b2. See here for the complete list of solved issues and merged PRs.

Enhancements

  • Simplify rechunk implementation (#2745)
  • Stop inferring outputs when args provided (#2759)
  • Add broadcast merge support for DataFrame (#2772)
  • Remove deprecation warnings when importing mars.tensor (#2788)
  • Optimize in-process actor calls (#2763)
  • [Ray] New Ray actor creation model (#2783)

Bug fixes

  • Fix duplicate dec object ref (#2741, thanks @Catch-Bull!)
  • Fix long exception of asyncio.gather (#2748)
  • Fix NameError: name 'pq' is not defined if pyarrow is not installed (#2751)
  • Fix empty band_subtasks and most_calls in profiling when the slow duration is large (#2755)
  • Fix the wrong result of df.merge (#2774)
  • Fix DataFrame initializer when Mars object exists in list (#2770)
  • [Ray] Support Ray client mode (#2773)

Tests

  • Increase test stability for command-line tests (#2779)

v0.8.3

08 Mar 08:24
d3a8af1

These are the release notes for v0.8.3. See here for the complete list of solved issues and merged PRs.

Enhancements

  • Stop inferring outputs when args provided (#2761)
  • Remove deprecation warnings when importing mars.tensor (#2790)
  • [Ray] New Ray actor creation model (#2794)

Bug fixes

  • Fix long exception of asyncio.gather (#2753)
  • Fix wrong result of df.merge (#2777)
  • Fix DataFrame initializer when Mars object exists in list (#2778)
  • Fix duplicate dec object ref (#2789, thanks @Catch-Bull!)
  • [Ray] Support Ray client mode (#2796)

Tests

  • Increase test stability for command-line tests (#2786)

v0.9.0b1

21 Feb 13:23
b747a42
Pre-release

These are the release notes for v0.9.0b1. See here for the complete list of solved issues and merged PRs.

Highlights

  • A new coloring-based fusion algorithm is introduced in #2719. Performance is expected to improve significantly over previous releases, but unexpected behavior may still occur; feel free to reach out to us if you run into any.
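
For readers unfamiliar with the idea, coloring-based fusion can be sketched roughly as follows. This is a simplified illustration in plain Python and not the implementation from #2719: the function name color_fuse, the dependency-map input format, and the single inheritance rule used here are all assumptions made for the example. Each node that starts a new chain gets a fresh color, a node inherits a color only when all of its predecessors share it, and nodes with the same color form one fused group.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import count


def color_fuse(preds):
    """Group DAG nodes into fusable units by coloring (illustrative only).

    `preds` maps each node to the list of nodes it depends on.
    This is a hypothetical helper, not part of the Mars API.
    """
    next_color = count()
    colors = {}
    # Visit nodes so that every predecessor is colored before its successors.
    for node in TopologicalSorter(preds).static_order():
        pred_colors = {colors[p] for p in preds.get(node, ())}
        if len(pred_colors) == 1:
            # All predecessors share one color: extend that group.
            colors[node] = pred_colors.pop()
        else:
            # No predecessors, or predecessors disagree: start a new group.
            colors[node] = next(next_color)

    groups = defaultdict(list)
    for node, color in colors.items():
        groups[color].append(node)
    return list(groups.values())


# Hypothetical chunk graph: two loads, a chain on the first, then a join.
example = {
    "load_a": [],
    "load_b": [],
    "scale": ["load_a"],
    "filter": ["scale"],
    "join": ["filter", "load_b"],
    "sum": ["join"],
}
print(sorted(sorted(group) for group in color_fuse(example)))
```

In this sketch, every non-root member of a group has all of its predecessors inside the same group, so edges enter a group only at its root and fusing a group cannot create a cycle between groups. The real algorithm may apply additional rules that are not modeled here.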

New Features

  • DataFrame
    • Support inclusive argument for pd.date_range (#2718)
  • Others
    • Add cibuildwheel with Linux AArch64 wheel build support (#2672, thanks @odidev!)

Enhancements

  • Refine failure recovery log and exception (#2633)
  • Optimize eval-setitem expressions as single eval expressions (#2695)
  • Auto merge small chunks when df.groupby().apply(func) is doing aggregation (#2708)
  • Optimize GroupBy's aggregation algorithm (#2696)
  • [Ray] Refine Ray dataset integration (#2705)
  • Improve profiling (#2629)
  • Add support for reading partitioned parquet for fastparquet (#2724)
  • Introduce coloring-based fusion algorithm (#2719)
  • Fix duplicate exceptions in log (#2723)

Bug fixes

  • Fix sort_values for empty DataFrame or Series (#2681)
  • Eliminate redundant eval node in optimization (#2683)
  • Avoid iterative tiling for df.loc[:, fields] (#2685)
  • [hotfix][Ray] Fix Ray dataset compatibility (#2693)
  • Fix use_arrow_dtype parameter for read_parquet (#2698)
  • Fix error on dependent DataFrame setitems (#2701)
  • Fix estimate_pandas_size for pd.MultiIndex (#2707)
  • Import vineyard.data.pickle to make members available (#2714)
  • Fix shuffle when the ndim of input tensors differs (#2727)

v0.8.2

21 Feb 13:33
da6631e

These are the release notes for v0.8.2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Support inclusive argument for pd.date_range (#2721)

Enhancements

  • Optimize eval-setitem expressions as single eval expressions (#2699)
  • [Ray] Refine Ray dataset integration (#2712)
  • [Ray] Refine Ray dataset integration (#2726)
  • Add support for reading partitioned parquet for fastparquet (#2729)
  • Fix duplicate exceptions in log (#2736)

Bug fixes

  • Fix sort_values for empty DataFrame or Series (#2686)
  • Eliminate redundant eval node in optimization (#2688)
  • Avoid iterative tiling for df.loc[:, fields] (#2689)
  • Fix use_arrow_dtype parameter for read_parquet (#2702)
  • Fix error on dependent DataFrame setitems (#2703)
  • Fix estimate_pandas_size on pd.MultiIndex (#2710)
  • Import vineyard.data.pickle to make members available (#2716)
  • Fix shuffle when the ndim of input tensors differs (#2728)

v0.9.0a2

03 Feb 08:19
56efd30
Pre-release

These are the release notes for v0.9.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for GroupBy.{ffill, bfill, fillna} (#2639, thanks @Marascax!)
    • Add nunique support for DataFrameGroupBy (#2662)
  • Others
    • Add wheel support for Python 3.10 and drop Python 3.6 (#2622)

Enhancements

  • Add support for merging small files in md.{read_parquet, read_csv} (#2661)
  • Add support for HTTP request rewriter (#2664)
  • Optimize filtering DataFrame with its fields (#2571)
  • Add pyproject.toml to configure build packages (#2674)

Bug fixes

  • Fix backward compatibility for pandas 1.1 and 1.2 (#2624)
  • Fix backward compatibility for pandas 1.0 (#2628)
  • Fix NotImplementedError for mo.batch when single call not implemented (#2635)
  • Fix IndexError raised by aggregation of DataFrameGroupBy (#2641)
  • Fix compatibility for pandas 1.4 (#2650)
  • Fix df.loc[:] to make sure same index_value key generated (#2643)
  • Fix aggregation with comparison (#2647)
  • Fix the wrong index_value generated by df.loc[:] (#2658)
  • Fix optimizing DataFrame query with timestamp in conditions (#2671)
  • Fix as_index when calling agg on SeriesGroupBy (#2676)

v0.8.1

03 Feb 08:27
211e3b7

These are the release notes for v0.8.1. See here for the complete list of solved issues and merged PRs.

New Features

  • DataFrame
    • Add support for GroupBy.{ffill, bfill, fillna} (#2657, thanks @Marascax!)
    • Add nunique support for DataFrameGroupBy (#2667)

Enhancements

  • Add support for HTTP request rewriter (#2665)
  • Add support for merging small files in md.{read_parquet, read_csv} (#2669)
  • Optimize filtering DataFrame with its fields (#2668)

Bug fixes

  • Allow specifying multiple supervisor processes (#2625)
  • Fix backward compatibility for pandas 1.0 (#2630)
  • Fix NotImplementedError for mo.batch when single call not implemented (#2637)
  • Fix compatibility for pandas 1.4 (#2652)
  • Fix IndexError raised by aggregation of DataFrameGroupBy (#2653)
  • Fix df.loc[:] to make sure same index_value key generated (#2654)
  • Fix aggregation with comparison (#2655)
  • Fix the wrong index_value generated by df.loc[:] (#2666)
  • Fix as_index when calling groupby-agg (#2678)

v0.9.0a1

16 Dec 06:49
e2f3f3b
Pre-release

These are the release notes for v0.9.0a1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Implement mt.bincount (#2548)
  • Learn
    • Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2554)
  • Services
    • Add basic profiling support for supervisor (#2586)

Enhancements

  • Add app_queue in new_cluster (#2550, thanks @xxxxsk!)
  • Implement web API of get_infos (#2558)
  • Reduce time cost of cpu_percent() calls (#2567)
  • Reduce estimation time cost (#2577)
  • [Ray] Refine Mars on Ray usability (#2580)
  • [Ray] Refine Ray dataset integration (#2579)
  • Optimize tileable graph construction (#2583)
  • Stop calling user funcs when dtypes is specified (#2587)
  • Support adding Mars extensions via setup entrypoints (#2589)
  • Skip details of shuffled chunks in meta (#2600)
  • Reduce the time cost of fetching tileable data (#2594)
  • Use batched request to apply for slots (#2601)
  • Reduce RPC cost of oscar by removing unnecessary tasks (#2597)

Bug fixes

  • Fix index of series.apply when the result index is unchanged (#2557)
  • Stop using asdict to handle dataclasses (#2561)
  • Fix tests under cudf 21.10 (#2608)
  • Fix DataFrame getitem when duplicate columns exist (#2581)
  • Upgrade required version of vineyard (#2588)
  • Fix progress always being 0 or 100% (#2591)
  • Make Proxima work with latest Mars (#2599, thanks @yuyiming!)
  • Fix None dtype for some unary tensor functions (#2603)
  • Fix duplicate decref of subtask input chunk (#2611, thanks @Catch-Bull!)

Documentation

  • Add a document about how to implement a Mars operand (#2562)

v0.8.0

16 Dec 07:00
92d4959

These are the release notes for v0.8.0. See here for the complete list of solved issues and merged PRs.

This release note only covers the difference from v0.8.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1
alpha2
alpha3
beta1
beta2
rc1

New Features

  • Tensor
    • Implement mt.bincount (#2552)
  • Learn
    • Add mars.learn.metrics.multilabel_confusion_matrix and derivative metrics (#2568)

Enhancements

  • Implement web API of get_infos (#2564)
  • Reduce time cost of cpu_percent() calls (#2572)
  • Stop calling user funcs when dtypes is specified (#2596)
  • Support adding Mars extensions via setup entrypoints (#2598)
  • [Ray] Refine Mars on Ray usability (#2606)
  • Reduce estimation time cost (#2607)
  • Skip details of shuffled chunks in meta (#2609)
  • Reduce the time cost of fetching tileable data (#2616)
  • Reduce RPC cost of oscar by removing unnecessary tasks (#2613)
  • Use batched request to apply for slots (#2615)

Bug fixes

  • Fix index of series.apply when the result index is unchanged (#2563)
  • Fix DataFrame getitem when duplicate columns exist (#2582)
  • Upgrade required version of vineyard (#2593)
  • Fix progress always being 0 or 100% (#2595)
  • Fix None dtype for some unary tensor functions (#2604)
  • Make Proxima work with latest Mars (#2605, thanks @yuyiming!)
  • Fix tests for cudf 21.10 (#2608)
  • Fix duplicate decref of subtask input chunk (#2614, thanks @Catch-Bull!)

v0.8.0rc1

23 Oct 11:26
18a7d1e
Pre-release

These are the release notes for v0.8.0rc1. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add preliminary implementations for ufunc methods (#2510)
    • Add partial support for setitem with fancy indexing (#2453)
  • Learn
    • Add make_regression support for learn module (#2515)
    • Implement fit and predict methods for bagging (#2516)
    • Implement mars.learn.ensemble.IsolationForest (#2531)
    • Implement mars.learn.preprocessor.LabelEncoder (#2542)
  • Services
    • Add web API for scheduling (#2533)
  • Others
    • Support mutable tensor on oscar (#2432, thanks @Coco58323!)
    • Add experimental support for CUDA under WSL for Windows 11 (#2538)

Enhancements

  • Use black to enforce code style (#2492)
  • Reduce indentation of frontend code (#2540)

Bug fixes

  • Fix output of df.groupby(as_index=False).size() (#2507)
  • [Ray] Fix web serialize lambda (#2512)
  • Fix reduction result on empty series (#2520)
  • Fix DataFrame.loc when df is empty (#2524)
  • Fix df.loc when providing empty list (#2528)

v0.7.5

23 Oct 17:44
89a754d

These are the release notes for v0.7.5. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add preliminary implementations for ufunc methods (#2513)
    • Add partial support for setitem with fancy indexing (#2544)
  • Learn
    • Add make_regression support for learn module (#2517)
    • Implement mars.learn.preprocessor.LabelEncoder (#2545)
  • Services
    • Add web API for scheduling (#2535)
  • Others
    • Add experimental support for CUDA under WSL for Windows 11 (#2543)

Enhancements

  • Reduce indentation of frontend code (#2541)

Bug fixes

  • Fix output of df.groupby(as_index=False).size() (#2508)
  • Fix reduction result on empty series (#2522)
  • Fix df.loc when df is empty (#2526)
  • [Ray] Fix serializing lambdas in web (#2529)
  • Fix df.loc when providing empty list (#2532)
