Different Mel Band Roformer models were added.
ZFTurbo committed Nov 28, 2024
1 parent 3849abf commit aef04b2
Showing 2 changed files with 26 additions and 1 deletion.
21 changes: 21 additions & 0 deletions docs/mel_roformer_experiments.md
@@ -0,0 +1,21 @@
## Mel Roformer models

All experiments used the MUSDB18HQ dataset. Models were trained on the 'train' set, and all metrics were measured on the 'test' set.

### Experiments table

| Average SDR Score | Chunk size (samples) | Depth | Dim | MLP expansion factor | Skip connection | Hop size | FFT Size | Dropout | Batch Size (GPU memory) | Download Checkpoint | Comment |
|:-----------------:|:-------------:|:-----------------:|:---:|:--------------------:|:-----:|:-----:|:-----:|:-----:|:----------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:-----------------------------:|
| 5.1235 | 88200 | 2 | 64 | 1 | No | 441 | 2048 | 0/0 | 32 (48 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_53_sdr_5.1235_config_mel_64_2_1_88200_experimental.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_53_sdr_5.1235.ckpt) | |
| 6.4698 | 88200 | 4 | 128 | 1 | No | 441 | 2048 | 0.1/0.1 | 28 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_38_sdr_6.4698.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_38_sdr_6.4698.ckpt) | |
| 6.7022 | 88200 | 4 | 128 | 1 | No | 882 | 4096 | 0/0 | 20 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_166_sdr_6.7022_config_mel_128_4_1_88200_big_fft_4096.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_166_sdr_6.7022.ckpt) | |
| 7.8127 | 88200 | 6 | 256 | 1 | Yes | 441 | 2048 | 0.1/0.1 | 16 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_168_sdr_7.8127_config_mel_256_6_1_88200.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_168_sdr_7.8127.ckpt) | |
| 6.4908 | 176400 | 4 | 128 | 1 | Yes | 441 | 2048 | 0.1/0.1 | 8 (48 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_15_sdr_6.4908.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_15_sdr_6.4908.ckpt) | |
| 6.5224 | 176400 | 4 | 128 | 2 | Yes | 441 | 2048 | 0.1/0.1 | 8 (48 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_9_sdr_6.5254.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_9_sdr_6.5254.ckpt) | |
| 7.0412 | 352800 | 4 | 128 | 1 | No | 882 | 4096 | 0/0 | 5 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_48_sdr_7.0412_config_mel_128_4_1_352800_big_fft_4096.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_experimental_ep_48_sdr_7.0412.ckpt) | |
| 8.2175 | 352800 | 4 | 256 | 1 | No | 441 | 2048 | 0/0 | 5 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_1_sdr_8.2175.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_1_sdr_8.2175.ckpt) | Trained longer with different strategies. Appears to have overfit slightly toward the end |
| 1.0557 | 352800 | 4 | 128 | 1 | No | 882 | 2048 | 0/0 | 6 (48 GB) | --- | A large hop size does not seem to work well here |
| 6.8652 | 485100 | 4 | 128 | 1 | No | 441 | 2048 | 0.1/0.1 | 5 (48 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_7_sdr_6.8652.yaml) / [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_7_sdr_6.8652.ckpt) | |
| 8.9400* | 485100 | 8 | 384 | 4 | Yes | 882 | 4096 | 0/0 | 2 (80 GB) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_5_sdr_8.9443_config_mel_384_8_4_485100_big_fft_4096_skip_connect.yaml) / Weights ([part 1](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_5_sdr_8.9443.zip.001), [part 2](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.11/model_mel_band_roformer_ep_5_sdr_8.9443.zip.002)) | Very large weights file (> 3 GB). Metrics were still improving |

* Note 1: Some models are probably undertrained.
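
To make the chunk and hop sizes in the table easier to read, the sketch below converts them to seconds, milliseconds, and an approximate STFT frame count. It assumes 44.1 kHz audio (the MUSDB18HQ sample rate) and a centered STFT that yields `chunk_size // hop_size + 1` frames; the `chunk_stats` helper is hypothetical and not part of the repository.

```python
SAMPLE_RATE = 44100  # MUSDB18HQ audio is sampled at 44.1 kHz


def chunk_stats(chunk_size, hop_size, sample_rate=SAMPLE_RATE):
    """Return (chunk length in seconds, hop length in ms, approximate frame count)."""
    seconds = chunk_size / sample_rate
    hop_ms = 1000.0 * hop_size / sample_rate
    # Assumption: a centered STFT, which produces chunk_size // hop_size + 1 frames.
    frames = chunk_size // hop_size + 1
    return seconds, hop_ms, frames


if __name__ == "__main__":
    # (chunk size, hop size) pairs that appear in the experiments table.
    pairs = [(88200, 441), (88200, 882), (176400, 441),
             (352800, 441), (352800, 882), (485100, 441), (485100, 882)]
    for chunk, hop in pairs:
        seconds, hop_ms, frames = chunk_stats(chunk, hop)
        print(f"chunk={chunk:>6} hop={hop:>3}: {seconds:.0f} s, "
              f"hop {hop_ms:.0f} ms, ~{frames} frames")
```

Under these assumptions the chunk sizes correspond to 2, 4, 8, and 11 seconds of audio, and doubling the hop size roughly halves the number of frames the model processes per chunk.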
6 changes: 5 additions & 1 deletion docs/pretrained_models.md
@@ -52,4 +52,8 @@ If you trained some good models, please, share them. You can post config and mod
| SCNet Large (by [starrytong](https://github.com/starrytong)) ~~*~~ | bass / drums / vocals / other | MUSDB test avg: 9.70 (bass: 9.38, drums: 11.15 vocals: 10.94 other: 7.31) Multisong avg: 9.28 (bass: 11.27, drums: 11.23 vocals: 9.05 other: 5.57) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.9/config_musdb18_scnet_large_starrytong.yaml) | [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.9/SCNet-large_starrytong_fixed.ckpt) |
| TS BS Mamba2 ~~*~~ | bass / drums / vocals / other | MUSDB test avg: 6.87 (bass: 5.82, drums: 8.14 vocals: 8.35 other: 5.16) Multisong avg: 6.66 (bass: 7.87, drums: 7.92 vocals: 7.01 other: 3.85) | [Config](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.9/config_musdb18_bs_mamba2.yaml) | [Weights](https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.9/model_bs_mamba2_ep_11_sdr_6.8723.ckpt) |

~~*~~ **Note**: Model was trained only on MUSDB18HQ dataset (100 songs train data)
~~*~~ **Note**: Model was trained only on MUSDB18HQ dataset (100 songs train data)

### MelRoformer models

[Table of Mel Band Roformers with different parameters](docs/mel_roformer_experiments.md)
