Asteroid is a PyTorch-based audio source separation toolkit that enables fast experimentation on common datasets. It comes with source code that supports a large range of datasets and architectures, and a set of recipes to reproduce some important papers.
If you have found a bug, please open an issue;
if you have solved it, open a pull request!
The same goes for new features: tell us what you want or help us build it!
Don't hesitate to join the Slack
and ask questions or suggest new features there as well!
Asteroid is intended to be a community-based project,
so hop on and help us!
- Installation
- Tutorials
- Running a recipe
- Available recipes
- Supported datasets
- Pretrained models
- Calls for contributions
- Citing us
(↑up to contents)
To install Asteroid, clone the repo and install it using
pip or python:
git clone https://github.com/mpariente/asteroid
cd asteroid
# Install the dependencies required at install time
pip install numpy Cython
# Install with pip in editable mode
pip install -e .
# Or, install with python in dev mode
# python setup.py develop
Asteroid is also on PyPI; you can install the latest release with:
pip install numpy Cython
pip install asteroid
(↑up to contents)
Here is a list of notebooks showing example usage of Asteroid's features.
(↑up to contents)
Running the recipes requires additional packages in most cases.
We recommend running:
# from asteroid/
pip install -r requirements.txt
Then choose the recipe you want and run it:
cd egs/wham/ConvTasNet
. ./run.sh
More information in egs/README.md.
- ConvTasNet (Luo et al.)
- TasNet (Luo et al.)
- Deep clustering (Hershey et al. and Isik et al.)
- Chimera ++ (Luo et al. and Wang et al.)
- DualPathRNN (Luo et al.)
- Two step learning (Tzinis et al.)
- Open-Unmix (coming) (Stöter et al.)
- Wavesplit (coming) (Zeghidour et al.)
- WSJ0-2mix / WSJ0-3mix (Hershey et al.)
- WHAM (Wichern et al.)
- WHAMR (Maciejewski et al.)
- LibriMix (Cosentino et al.)
- Microsoft DNS Challenge (Reddy et al.)
- SMS_WSJ (Drude et al.)
- MUSDB18 (egs coming) (Rafii et al.)
- FUSS (egs coming) (Wisdom et al.)
- AVSpeech (Ephrat et al.)
- Kinect-WSJ (Sivasankaran et al.)
(↑up to contents)
Asteroid provides pretrained models through the Asteroid community on Zenodo.
Loading a pretrained model is simple:
from asteroid.models import ConvTasNet
model = ConvTasNet.from_pretrained('mpariente/ConvTasNet_WHAM!_sepclean')
Have a look at the Zenodo page or at the model cards to choose which model you want to load.
You can also load it with torch.hub:
from torch import hub
model = hub.load('mpariente/asteroid', 'conv_tasnet', 'mpariente/ConvTasNet_WHAM!_sepclean')
Enjoy having pretrained models? Please share your models if you train some. We made it simple
with the asteroid-upload
CLI; see the next sections.
At the end of each sharing-enabled recipe, all the necessary information is gathered into a file. The only thing left to do is to run
asteroid-upload exp/your_exp_dir/publish_dir --uploader "Name Here"
OK, not quite. First you need to register on Zenodo (signing in with GitHub works),
create a token, and pass it with
the --token
option of the CLI, or by setting the ACCESS_TOKEN
environment variable.
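As a rough illustration of the two ways to supply the token, the sketch below takes an explicit value first and falls back to the ACCESS_TOKEN environment variable. The helper name `resolve_token` and the precedence order are assumptions made for illustration; this is not the actual asteroid-upload implementation.

```python
import os
from typing import Mapping, Optional


def resolve_token(cli_token: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Zenodo token (hypothetical helper, not part of Asteroid).

    Assumption: a value passed with --token wins over ACCESS_TOKEN.
    """
    # Default to the real process environment; a mapping can be injected for testing.
    env = os.environ if env is None else env
    token = cli_token or env.get("ACCESS_TOKEN")
    if not token:
        raise ValueError("No Zenodo token: pass --token or set ACCESS_TOKEN.")
    return token
```

Setting the environment variable once per shell session avoids repeating the token on every upload command.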
If you plan to upload more models (and you should 😇), you can fill in your information in
uploader_info.yml
at the root, like this:
uploader: Manuel Pariente
affiliation: INRIA
git_username: mpariente
token: TOKEN_HERE
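Before uploading, it can help to check that the file is complete and that the placeholder token was replaced. The sketch below is a minimal check using only the standard library. It assumes the file contains flat `key: value` lines as in the example above; a real YAML parser would normally be used instead. The helper name `parse_uploader_info` is hypothetical, and treating all four keys as required is an assumption.

```python
from typing import Dict

# Keys taken from the example above; treating all four as required is an assumption.
EXPECTED_KEYS = ("uploader", "affiliation", "git_username", "token")


def parse_uploader_info(text: str) -> Dict[str, str]:
    """Parse flat `key: value` lines (hypothetical helper, not Asteroid's loader)."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"Not a `key: value` line: {line!r}")
        info[key.strip()] = value.strip()
    missing = [k for k in EXPECTED_KEYS if not info.get(k)]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    if info["token"] == "TOKEN_HERE":
        raise ValueError("Replace the TOKEN_HERE placeholder with your Zenodo token.")
    return info
```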
(↑up to contents)
We are always looking to expand our coverage of source separation
and speech enhancement research. The following is a list of
things we're missing.
Want to contribute? This is a great place to start!
- Wavesplit (Zeghidour and Grangier)
- FurcaNeXt (Shi et al.)
- DeepCASA (Liu and Wang)
- VCTK Test sets from Kadioglu et al.
- Interrupted and cascaded PIT (Yang et al.)
- Consistency constraints (Wisdom et al.)
- Backpropagable STOI and PESQ
- Parametrized filterbanks from Tukuljac et al.
- End-to-End MISI (Wang et al.)
Don't forget to read our contributing guidelines.
You can also open an issue or make a PR to add something we missed in this list.
The default logger is TensorBoard in all the recipes. From the recipe folder, you can run the following to visualize the logs of all your runs. You can also compare different systems on the same dataset by running a similar command from the dataset directories.
# Launch tensorboard (default port is 6006)
tensorboard --logdir exp/ --port tf_port
If you're launching TensorBoard remotely, you should open an SSH tunnel:
# Open port-forwarding connection. Add the -Nf options to avoid opening a remote shell.
ssh -L local_port:localhost:tf_port user@ip
Then open http://localhost:local_port/. If both ports are the same, you can
click on the TensorBoard URL printed on the remote machine, which is more convenient.
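To make the `local_port` and `tf_port` placeholders concrete, the sketch below builds the TensorBoard command, the SSH tunnel command and the local URL from actual port numbers. The function names, the example user and the example host are hypothetical; only the `tensorboard` and `ssh` flags shown in the commands above are used.

```python
import shlex
from typing import List


def tensorboard_cmd(logdir: str = "exp/", tf_port: int = 6006) -> List[str]:
    """Command to run on the machine that holds the logs."""
    return ["tensorboard", "--logdir", logdir, "--port", str(tf_port)]


def ssh_tunnel_cmd(user: str, host: str, local_port: int = 6006,
                   tf_port: int = 6006, background: bool = False) -> List[str]:
    """Command to run locally to forward local_port to the remote tf_port."""
    cmd = ["ssh"]
    if background:
        # -N: no remote command, -f: go to the background after authentication.
        cmd.append("-Nf")
    cmd += ["-L", f"{local_port}:localhost:{tf_port}", f"{user}@{host}"]
    return cmd


def local_url(local_port: int = 6006) -> str:
    """URL to open in the local browser once the tunnel is up."""
    return f"http://localhost:{local_port}/"


if __name__ == "__main__":
    # Hypothetical user and host, for illustration only.
    print(shlex.join(tensorboard_cmd(tf_port=6006)))
    print(shlex.join(ssh_tunnel_cmd("user", "192.0.2.10", local_port=16006, background=True)))
    print(local_url(16006))
```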
- Modularity. Building blocks are designed to be seamlessly plugged together. Filterbanks, encoders, maskers, decoders and losses are all common building blocks that can be combined in a flexible way to create new systems.
- Extensibility. Extending Asteroid with new features is simple. Add a new filterbank, separator architecture, dataset or even recipe very easily.
- Reproducibility. Recipes provide an easy way to reproduce results with data preparation, system design, training and evaluation in a single script. This is an essential tool for the community!
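The encoder, masker and decoder decomposition mentioned under Modularity can be sketched in plain PyTorch as follows. This is an illustrative toy model under stated assumptions, not one of Asteroid's classes: the name `TinySeparator`, the layer choices and the default sizes are all hypothetical.

```python
import torch
from torch import nn


class TinySeparator(nn.Module):
    """Encoder -> masker -> decoder, illustrative only (not an Asteroid class)."""

    def __init__(self, n_src: int = 2, n_filters: int = 64,
                 kernel_size: int = 16, stride: int = 8):
        super().__init__()
        self.n_src = n_src
        # Encoder: learned analysis filterbank applied to the raw waveform.
        self.encoder = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
        # Masker: predicts one mask per source in the encoded domain.
        self.masker = nn.Sequential(
            nn.Conv1d(n_filters, n_src * n_filters, kernel_size=1),
            nn.Sigmoid(),
        )
        # Decoder: learned synthesis filterbank back to the time domain.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size, stride=stride, bias=False)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, time)
        tf_rep = self.encoder(wav.unsqueeze(1))  # (batch, n_filters, frames)
        batch, n_filters, frames = tf_rep.shape
        masks = self.masker(tf_rep).reshape(batch, self.n_src, n_filters, frames)
        masked = masks * tf_rep.unsqueeze(1)  # (batch, n_src, n_filters, frames)
        # Decode every source by folding the source axis into the batch axis.
        decoded = self.decoder(masked.reshape(batch * self.n_src, n_filters, frames))
        return decoded.reshape(batch, self.n_src, -1)  # (batch, n_src, time_out)
```

Any module that maps `(batch, n_filters, frames)` to `(batch, n_src * n_filters, frames)` can replace `self.masker` without touching the encoder or the decoder, which is the kind of recombination the Modularity point describes.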
(↑up to contents)
If you loved using Asteroid and you want to cite us, use this:
@article{Pariente2020Asteroid,
title={Asteroid: the {PyTorch}-based audio source separation toolkit for researchers},
author={Manuel Pariente and Samuele Cornell and Joris Cosentino and Sunit Sivasankaran and
Efthymios Tzinis and Jens Heitkaemper and Michel Olvera and Fabian-Robert Stöter and
Mathieu Hu and Juan M. Martín-Doñas and David Ditter and Ariel Frank and Antoine Deleforge
and Emmanuel Vincent},
year={2020},
journal={arXiv preprint arXiv:2005.04132},
primaryClass={eess.AS}
}