Releases: mars-project/mars
v0.6.7
v0.7.0a7
These are the release notes for v0.7.0a7. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements {DataFrame, Series}.pct_change (#2014)
- Tensor
- Implements tree arithmetic for tensor add and multiplication (#2024)
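As a rough illustration of the tree arithmetic entry above, the sketch below combines many operands level by level instead of in one long left-to-right chain. The helper name `tree_combine` and its `combine_size` parameter are hypothetical and written in plain Python; this is not the Mars implementation.

```python
from functools import reduce
import operator


def tree_combine(operands, op, combine_size=2):
    """Combine operands level by level, `combine_size` at a time (hypothetical helper)."""
    if not operands:
        raise ValueError("need at least one operand")
    if combine_size < 2:
        raise ValueError("combine_size must be at least 2")
    level = list(operands)
    # Each pass shrinks the list, so the depth grows logarithmically
    # with the number of operands rather than linearly.
    while len(level) > 1:
        level = [
            reduce(op, level[i:i + combine_size])
            for i in range(0, len(level), combine_size)
        ]
    return level[0]


total = tree_combine([1, 2, 3, 4, 5], operator.add)
product = tree_combine([1, 2, 3, 4, 5], operator.mul)
```

The same shape applies to any associative operation; addition and multiplication are the two named in the entry.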
Project Galois
- Oscar
- Service
- Add initial service implementations (#2010)
Enhancements
- Use mmap files to reduce memory usage in proxima builder (#1866)
- Support setting column with different index for DataFrame (#2020)
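For readers unfamiliar with the mmap entry above, here is a minimal standard-library sketch of reading a slice of a file through a memory map, so that only the touched pages need to be resident. The temporary file stands in for a large on-disk index; nothing here reflects the actual proxima builder code.

```python
import mmap
import os
import tempfile

# Write a small file standing in for a large on-disk index (illustrative data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789" * 1000)
    path = tmp.name

with open(path, "rb") as f:
    # Length 0 maps the whole file; pages are loaded lazily on access.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        window = view[5000:5010]  # copies only these ten bytes
        size = len(view)

os.remove(path)
```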
Bug fixes
v0.6.6
These are the release notes for v0.6.6. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements {DataFrame, Series}.pct_change (#2015)
- Tensor
- Implements tree arithmetic for tensor add and multiplication (#2028)
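The pct_change entry above refers to the fractional change between each element and an earlier one. A plain-Python sketch of that arithmetic follows, assuming non-zero earlier values and a positive `periods`; it uses `None` where pandas-style APIs would report a missing value, and it is not the Mars code.

```python
def pct_change(values, periods=1):
    """Fractional change between each element and the one `periods` steps earlier."""
    result = []
    for i, current in enumerate(values):
        if i < periods:
            result.append(None)  # no earlier element to compare against
        else:
            previous = values[i - periods]
            result.append((current - previous) / previous)
    return result


changes = pct_change([100.0, 110.0, 99.0])
```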
Enhancements
- Use mmap files to reduce memory usage in proxima builder (#2016)
- Support setting column with different index for DataFrame (#2025)
Bug fixes
- Fix IndexError in Series.sort_values when some chunk is empty (#2001)
- Fix mars crashes on ray >= 1.2.0 (#2003, thanks @fyrestone!)
- Add errors argument for groupby.sample to ignore errors when group size less than n (#2007)
- Fix errors when calling where() on reshape results (#2012)
- Fix log error when yielding to another remote (#2026)
v0.7.0a6
These are the release notes for v0.7.0a6. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Tensor
Project Galois
- Oscar
- Storage
- [storage][vineyard] Implement storage lib of vineyard backend (#1952, thanks @acezen!)
- [storage][shared_memory] Add storage backend of multiprocessing.shared_memory (#1969)
- [storage][cuda] Add cuda backend storage implementation (#1981)
- [storage][ray] Implements Ray storage (#1992, thanks @fyrestone!)
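The shared_memory entry above builds on Python's standard `multiprocessing.shared_memory` module (Python 3.8+). The sketch below shows only the underlying standard-library calls: one side creates a named block and writes bytes, another attaches by name and reads them back. How the Mars storage backend wraps these calls is not shown here, and the payload is made up for illustration.

```python
from multiprocessing import shared_memory

payload = b"mars chunk bytes"  # illustrative data

# Writer side: allocate a named block and copy the payload in.
block = shared_memory.SharedMemory(create=True, size=len(payload))
block.buf[:len(payload)] = payload

# Reader side: attach to the same block by name (normally from another process).
reader = shared_memory.SharedMemory(name=block.name)
# The block may be rounded up to a page size, so slice by the payload length.
restored = bytes(reader.buf[:len(payload)])

reader.close()
block.close()
block.unlink()  # release the block once no process needs it
```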
Enhancements
- Allow wrapping existing models with Mars class constructors (#1956)
- Optimize performance of DataFrame.describe() (#1961)
- Initialize filesystem and aio libs (#1980)
Bug fixes
- Fix MarsDMatrix when input tensor has unknown chunk shape (#1966)
- Fix tensor sorting with empty chunks (#1968)
- Re-enable the from/to vineyard test cases, and set meta for tensor/dataframe properly. (#1967)
- Fix ValueError when reducing tensors with empty chunks (#1978)
- Fix job hang when error message can't be pickled (#1990)
- Fix IndexError in Series.sort_values when some chunk is empty (#1999)
- Fix mars crashes on ray >= 1.2.0 (#1998, thanks @fyrestone!)
- Add errors argument for groupby.sample to ignore errors when group size less than n (#2002)
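To make the groupby.sample entry above concrete, the following plain-Python sketch samples `n` rows per group and either raises or tolerates groups smaller than `n`. The values `"raise"` and `"ignore"`, and the choice to keep the whole undersized group, are assumptions made for illustration; check the Mars documentation for the actual behaviour.

```python
import random


def sample_per_group(groups, n, errors="raise", seed=None):
    """Sample n rows from each group (hypothetical helper, not the Mars API)."""
    rng = random.Random(seed)
    sampled = {}
    for key, rows in groups.items():
        if len(rows) < n:
            if errors == "raise":
                raise ValueError(
                    f"group {key!r} has {len(rows)} rows, fewer than n={n}"
                )
            # Assumption: an undersized group is kept whole when errors are ignored.
            sampled[key] = list(rows)
        else:
            sampled[key] = rng.sample(rows, n)
    return sampled


picked = sample_per_group({"a": [1, 2, 3, 4], "b": [5]}, 2, errors="ignore", seed=0)
```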
v0.6.5
These are the release notes for v0.6.5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Tensor
Enhancements
- Allow wrapping existing models with Mars class constructors (#1957)
- Optimize performance of DataFrame.describe() (#1962)
- Initialize filesystem libs (#1982)
Bug fixes
- Fix tensor sorting with empty chunks (#1973)
- Fix MarsDMatrix when input tensor has unknown chunk shape (#1970)
- Fix ValueError when reducing tensors with empty chunks (#1979)
- Fix job hang when error message can't be pickled (#1993)
Tests
- Add tests and releases for Python 3.9 (#1955)
v0.7.0a5
These are the release notes for v0.7.0a5. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Project Galois
- [oscar] Add actor driver & structure adjustment (#1925)
- [oscar][ray backend] Actor creation (#1916, thanks @fyrestone!)
- Add new serializer implementation (#1937)
- Implement storage lib of Arrow plasma as well as disk (#1904)
Enhancements
- Allow setting verify_ssl to False for kubernetes configuration (#1911)
- Optimize generating mock DataFrames (#1913)
- Move opcodes out of protobuf definition (#1944)
Bug fixes
- To vineyard: avoid copy when chunks are already in vineyard (vineyard is the backend). (#1899)
- Fix rechunk when input tileable has unknown shape (#1912)
- Fix KeyError when comparing series (#1920)
- Fix rechunk when chunks have different dtypes that cannot compare (#1922)
- Collect available ports before running LightGBM task (#1927)
- Fix KeyError when column pruning is applied (#1929)
- Fix shuffling data in mars.learn module (#1931)
- Fix memory estimation of StartTracker for XGBoost (#1934)
- Fix accuracy_score for distributed execution (#1945)
Tests
- Add tests and releases for Python 3.9 (#1954)
v0.6.4
These are the release notes for v0.6.4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Enhancements
Bug fixes
- Fix rechunk when input tileable has unknown shape (#1914)
- Fix KeyError when comparing series (#1921)
- Fix rechunk when chunks have different dtypes that cannot compare (#1926)
- Collect available ports before running LightGBM task (#1927)
- Fix KeyError when column pruning is applied (#1933)
- Fix error when shuffling data in mars.learn module (#1936)
- Fix memory estimation of StartTracker for XGBoost (#1936)
- Fix accuracy_score for distributed execution (#1948)
v0.7.0a4
These are the release notes for v0.7.0a4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Enhancements
- Allow internal serialization to use JSON (#1880)
- Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1878)
- Optimize performance of df.sort_values().head() (#1884)
- Support column pruning for groupby().agg() on data sources (#1886)
- Improve named_{dataframe, series, tensor} so that it can get more meta (#1896)
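The column pruning entry above rests on a simple idea: a groupby().agg() only touches the grouping keys and the aggregated columns, so a data source can skip everything else. The sketch below shows that idea with the standard `csv` module; the helpers `required_columns` and `read_pruned` and the sample data are hypothetical, and the Mars optimizer works on its own operand graph rather than like this.

```python
import csv
import io


def required_columns(group_keys, agg_spec):
    """Columns a groupby(keys).agg(spec) touches, in first-seen order."""
    needed = list(group_keys)
    for column in agg_spec:
        if column not in needed:
            needed.append(column)
    return needed


def read_pruned(text, columns):
    """Read CSV text, keeping only `columns` from every row."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: row[name] for name in columns} for row in reader]


data = "city,year,sales,notes\nA,2020,3,x\nA,2021,4,y\nB,2020,5,z\n"
columns = required_columns(["city"], {"sales": "sum"})
rows = read_pruned(data, columns)
```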
Bug fixes
- Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1869)
- Fix wrongly raised error: Tileable object must be executed first before being fetched (#1872)
- Fix reshape when input tensor has unknown shape and 1 chunk (#1874)
- Fix threaded actor operations getting stuck in gevent==20.12.0 (#1879)
- Fix sorting string columns with None value & sorting with empty chunks (#1891)
- Adapt vineyardhandler.py to latest vineyard. (#1887)
Documentation
- LFAI & Data: Add required documents (#1865)
v0.6.3
These are the release notes for v0.6.3. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Enhancements
- Allow internal serialization to use JSON (#1882)
- Optimize performance of {md.read_csv(), md.read_parquet()}.head() (#1883)
- Optimize performance of df.sort_values().head() (#1888)
- Support column pruning for groupby().agg() on data sources (#1889)
- Improve named_{dataframe, series, tensor} so that it can get more meta (#1897)
Bug fixes
- Fix wrongly raised error: Tileable object must be executed first before being fetched (#1875)
- Support unknown shape for mt.reshape, mt.histogram and md.DataFrame (#1876)
- Fix threaded actor operations getting stuck in gevent==20.12.0 (#1881)
- Fix sorting string columns with None value & sorting with empty chunks (#1893)
v0.6.2
These are the release notes for v0.6.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
- Implements head() on groupby objects (#1851)
- Learn
- Implements mars.learn.preprocessing.{MinMaxScaler, minmax_scale} (#1858)
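For reference, min-max scaling as named in the Learn entry above maps each value linearly from the column's observed range onto a target range. The plain-Python sketch below shows that formula for a single column; it mirrors the conventional scikit-learn definition and is not the Mars implementation. Mapping a constant column to the lower bound is a choice made here to avoid dividing by zero.

```python
def minmax_scale(column, feature_range=(0.0, 1.0)):
    """Linearly map `column` from [min, max] onto `feature_range`."""
    low, high = feature_range
    col_min, col_max = min(column), max(column)
    span = col_max - col_min
    if span == 0:
        # Constant column: every value maps to the lower bound.
        return [low for _ in column]
    return [(x - col_min) / span * (high - low) + low for x in column]


unit = minmax_scale([1.0, 3.0, 5.0])
signed = minmax_scale([1.0, 3.0, 5.0], feature_range=(-1.0, 1.0))
```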
Enhancements
- Improve Proxima recall_by_id computation method (#1807, thanks @rg070836rg!)
- Revise to/from vineyard, of Tensor and DataFrame. (#1806)
- Add memory estimation for read_parquet as well as read_csv (#1815)
- Support using compound agg function in lambda (#1819)
- Add incremental_index argument to reset_index which by default is False (#1842)
- Support to_pandas in a batch way for DataFrame and Series (#1859)
- Support specifying memory scale in kubernetes (#1861)
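One way to read the incremental_index entry above: when a DataFrame is split into chunks, resetting each chunk's index independently is cheap, while a globally increasing index needs every chunk to know the sizes of the chunks before it. The sketch below contrasts the two in plain Python. The function names are hypothetical and this interpretation is an assumption, not a statement of how Mars implements reset_index.

```python
from itertools import accumulate


def reset_index_per_chunk(chunk_sizes):
    """Each chunk restarts at 0; no coordination between chunks is needed."""
    return [list(range(size)) for size in chunk_sizes]


def reset_index_incremental(chunk_sizes):
    """Each chunk starts where the previous one ended, giving 0..n-1 overall."""
    # Offsets are the running totals of all earlier chunk sizes.
    offsets = [0] + list(accumulate(chunk_sizes))[:-1]
    return [
        list(range(start, start + size))
        for start, size in zip(offsets, chunk_sizes)
    ]


local = reset_index_per_chunk([3, 2])
global_ = reset_index_incremental([3, 2])
```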