Multimodal Emotion Recognition

This repository contains a project realized as part of the Natural Language Processing exam of the Master's degree in Artificial Intelligence, University of Bologna.

Project structure

The project is structured as follows:

├───checkpoints # Model checkpoints
│   ├───audio_mel # Feature extractor audio_mel model
│   ├───audio_wav2vec2 # Feature extractor audio_wav2vec2 model
│   ├───text # Feature extractor text model (roberta-base and roberta-large)
│   └───m2fnet # M2FNet model
├───data
│   └───MELD.Raw
│       ├───dev_splits_complete # Validation set
│       │   ├───mel_spectrograms
│       │   └───wav
│       ├───output_repeated_splits_test # Test set
│       │   ├───mel_spectrograms
│       │   └───wav
│       └───train_splits # Training set
│           ├───mel_spectrograms
│           └───wav
├───embeddings # Embeddings coming from the feature extractor models
│   ├───audio_mel # audio: original
│   ├───audio_wav2vec2 # audio: Wav2Vec2.0
│   ├───text_base # text: roberta-base
│   └───text_large # text: roberta-large
├───paper # paper references
├───scripts # bash scripts for dataset download, audio extraction and venv creation
└───src
    └───feature_extractors # Feature extractors models and training scripts
        ├───audio_mel
        │   └───losses # Adaptive triplet based loss function
        ├───audio_wav2vec2
        └───text
	# M2FNet model and training scripts
    config.yaml
    dataset.py
    model.py
    test.py
    train.py
    utils.py

In every subfolder of src there is a config.yaml file that contains the configurations of the corresponding model.

Prerequisites

This project was developed in Python3 and pytorch. Run the following command to install the prerequisites:

# Linux
pip install --no-cache -r ./requirements_linux.txt

# Windows
pip install --no-cache -r ./requirements.txt

Otherwise, you can build a ready-to-go virtual environment by running the following scripts from the project's folder:

# Linux
> ./scripts/build-venv.sh

# Windows
> .\scripts\build-venv.bat

Download FFMPEG

Download ffmpeg from here. Follow online tutorials to install it correctly based on your OS.

Download dataset

Now, you need to download and prepare the dataset. Run the following commands from the project's folder:

# Linux
> ./scripts/MELD_download.sh # Download dataset
> ./scripts/video2wav.sh # Extract audio

# Windows
> .\scripts\MELD_download.bat # Download dataset
> .\scripts\video2wav.bat # Extract audio

Group members

Reg No.	Name	Surname	Email	Username
1005278	Ludovico	Granata	`[email protected]`	LudovicoGranata
973719	Parsa	Dahesh	`[email protected]`	ParsaD23
984854	Simone	Persiani	`[email protected]`	iosonopersia

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multimodal Emotion Recognition

Project structure

Prerequisites

Download FFMPEG

Download dataset

Group members

About

Releases

Packages

Contributors 3

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 236 Commits
embeddings		embeddings
paper		paper
scripts		scripts
src		src
.editorconfig		.editorconfig
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
requirements_linux.txt		requirements_linux.txt

iosonopersia/Multimodal-Emotion-Recognition

Folders and files

Latest commit

History

Repository files navigation

Multimodal Emotion Recognition

Project structure

Prerequisites

Download FFMPEG

Download dataset

Group members

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages