This repository contains a project developed for the Natural Language Processing exam of the Master's degree in Artificial Intelligence at the University of Bologna.
The project is structured as follows:
├───checkpoints                          # Model checkpoints
│   ├───audio_mel                        # Feature extractor audio_mel model
│   ├───audio_wav2vec2                   # Feature extractor audio_wav2vec2 model
│   ├───text                             # Feature extractor text model (roberta-base and roberta-large)
│   └───m2fnet                           # M2FNet model
├───data
│   └───MELD.Raw
│       ├───dev_splits_complete          # Validation set
│       │   ├───mel_spectrograms
│       │   └───wav
│       ├───output_repeated_splits_test  # Test set
│       │   ├───mel_spectrograms
│       │   └───wav
│       └───train_splits                 # Training set
│           ├───mel_spectrograms
│           └───wav
├───embeddings                           # Embeddings coming from the feature extractor models
│   ├───audio_mel                        # audio: original
│   ├───audio_wav2vec2                   # audio: Wav2Vec2.0
│   ├───text_base                        # text: roberta-base
│   └───text_large                       # text: roberta-large
├───paper                                # paper references
├───scripts                              # bash scripts for dataset download, audio extraction and venv creation
└───src
    └───feature_extractors               # Feature extractors models and training scripts
        ├───audio_mel
        │   └───losses                   # Adaptive triplet based loss function
        ├───audio_wav2vec2
        └───text
    # M2FNet model and training scripts
    config.yaml
    dataset.py
    model.py
    test.py
    train.py
    utils.py
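Once the dataset has been prepared, the layout above can be checked with a short script. The following is a minimal sketch, not part of the repository: the function name `missing_dirs` and the list `EXPECTED_DIRS` are hypothetical, and the paths assume that the three split folders sit directly under `data/MELD.Raw`.

```python
from pathlib import Path

# Directories taken from the project tree, relative to the project root.
# Assumption: the split folders are nested directly under data/MELD.Raw.
EXPECTED_DIRS = [
    "checkpoints/audio_mel",
    "checkpoints/audio_wav2vec2",
    "checkpoints/text",
    "checkpoints/m2fnet",
    "data/MELD.Raw/dev_splits_complete",
    "data/MELD.Raw/output_repeated_splits_test",
    "data/MELD.Raw/train_splits",
    "embeddings/audio_mel",
    "embeddings/audio_wav2vec2",
    "embeddings/text_base",
    "embeddings/text_large",
    "src/feature_extractors",
]


def missing_dirs(project_root):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Report anything missing relative to the current working directory.
    for d in missing_dirs("."):
        print(f"missing: {d}")
```

Run from the project's folder, the script prints one line per missing directory and nothing when the layout is complete.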
Every subfolder of src contains a config.yaml file that holds the configuration of the corresponding model.
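To see which configuration files are available, the src folder can be scanned for every config.yaml. This is a minimal sketch using only the standard library; the helper name `find_configs` is hypothetical and nothing is assumed about the keys stored in each file.

```python
from pathlib import Path


def find_configs(src_root="src"):
    """Return every config.yaml found in src_root or below it, sorted by path."""
    return sorted(Path(src_root).rglob("config.yaml"))


if __name__ == "__main__":
    for config_path in find_configs():
        print(config_path)
```

Each file returned can then be parsed with a YAML library, for example PyYAML's `yaml.safe_load`.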
This project was developed in Python 3 with PyTorch. Run the following command to install the prerequisites:
# Linux
pip install --no-cache-dir -r ./requirements_linux.txt
# Windows
pip install --no-cache-dir -r ./requirements.txt
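The choice between the two requirements files can also be made programmatically. The sketch below is illustrative only: `requirements_file` and `install_command` are hypothetical helpers, and falling back to `requirements.txt` on any system other than Linux is an assumption, since only Linux and Windows are covered here.

```python
import platform
import sys


def requirements_file(system=None):
    """Pick the requirements file for the given OS name (defaults to the current OS)."""
    system = system or platform.system()
    # Assumption: every non-Linux system uses the Windows file.
    return "requirements_linux.txt" if system == "Linux" else "requirements.txt"


def install_command(system=None):
    """Build the pip command as an argument list for the current interpreter."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir",
        "-r", requirements_file(system),
    ]
```

Passing the result of `install_command()` to `subprocess.run(..., check=True)` from the project's folder performs the installation with the interpreter that is currently running.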
Alternatively, you can build a ready-to-use virtual environment by running the following script from the project's folder:
# Linux
> ./scripts/build-venv.sh
# Windows
> .\scripts\build-venv.bat
Download ffmpeg from here, then install it following the instructions for your OS.
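Before moving on, it may help to confirm that ffmpeg is reachable from the command line. This is a minimal sketch with hypothetical helper names; it only relies on `ffmpeg -version`, which prints version information and exits.

```python
import shutil
import subprocess


def ffmpeg_path():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(executable):
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    path = ffmpeg_path()
    if path is None:
        print("ffmpeg not found on PATH")
    else:
        print(ffmpeg_version_line(path))
```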
Now, you need to download and prepare the dataset. Run the following commands from the project's folder:
# Linux
> ./scripts/MELD_download.sh # Download dataset
> ./scripts/video2wav.sh # Extract audio
# Windows
> .\scripts\MELD_download.bat # Download dataset
> .\scripts\video2wav.bat # Extract audio
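For readers curious about the audio extraction step, the sketch below builds the kind of ffmpeg command such a step typically runs. It is not the content of `video2wav.sh` or `video2wav.bat`: the helper names, the `.mp4` extension, the mono 16-bit PCM format and the 16 kHz sample rate are all assumptions made for illustration.

```python
from pathlib import Path


def wav_command(video_path, wav_dir, sample_rate=16000):
    """Build an ffmpeg command that extracts a mono PCM WAV track from one video."""
    video_path = Path(video_path)
    out_path = Path(wav_dir) / (video_path.stem + ".wav")
    return [
        "ffmpeg", "-y",            # overwrite an existing output file
        "-i", str(video_path),     # input video
        "-vn",                     # drop the video stream
        "-acodec", "pcm_s16le",    # 16-bit PCM audio (assumed format)
        "-ac", "1",                # mono (assumed)
        "-ar", str(sample_rate),   # sample rate in Hz (assumed value)
        str(out_path),
    ]


def split_commands(split_dir, sample_rate=16000):
    """Build one command per .mp4 file in a split folder, writing into its wav subfolder."""
    split_dir = Path(split_dir)
    videos = sorted(split_dir.glob("*.mp4"))
    return [wav_command(v, split_dir / "wav", sample_rate) for v in videos]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the target `wav` folder exists. Building argument lists rather than shell strings avoids quoting problems with unusual file names.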
Reg No. | Name | Surname | Email | Username
---|---|---|---|---
1005278 | Ludovico | Granata | [email protected] | LudovicoGranata
973719 | Parsa | Dahesh | [email protected] | ParsaD23
984854 | Simone | Persiani | [email protected] | iosonopersia