In this repository, my group member and I apply the basic concepts of Data Mining to the RAVDESS dataset. The concepts were taught by Prof. Riccardo Guidotti in the DM1 - Data Mining: Foundations course at Università di Pisa in the 2022/23 academic year.
The dataset was created from the RAVDESS dataset (https://zenodo.org/record/1188976) by extracting basic statistics (mean, std, min, max, etc.) from the original audio signal and from the signal after transforming it with the zero-crossing rate, Mel-Frequency Cepstral Coefficients, the spectral centroid, and the STFT chromagram. Features were extracted from the 2452 wav files.
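
As a rough illustration of the kind of summary statistics described above, here is a minimal sketch in plain Python. The tools actually used to build the dataset are not stated here, so everything in it is an assumption: the helper names `summarize` and `zero_crossings` are hypothetical, the standard deviation is the population one, the kurtosis is the excess (Fisher) kurtosis, and the zero-crossing feature is read as a simple count of sign changes.

```python
import math


def summarize(values):
    """Return mean, std, min, max, kurtosis and skewness of a sequence.

    Assumptions (not confirmed by the dataset description): population
    moments (divide by n) and excess kurtosis (normal distribution -> 0).
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        # A constant signal has undefined skewness and kurtosis.
        skew = kur = float("nan")
    else:
        skew = m3 / m2 ** 1.5
        kur = m4 / m2 ** 2 - 3.0
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "min": min(values),
        "max": max(values),
        "kur": kur,
        "skew": skew,
    }


def zero_crossings(signal):
    """Count how many consecutive sample pairs change sign."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))


# Toy signal standing in for real audio samples.
signal = [1.0, -1.0, 1.0, -1.0]
stats = summarize(signal)
crossings = zero_crossings(signal)
```

The same `summarize` helper could be applied to any transformed series (for example, a list of spectral centroid values) to obtain one group of statistics per transformation.
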

The dataset contains the following attributes:

- modality (audio-only)
- vocal_channel (speech, song)
- emotion (neutral, calm, happy, sad, angry, fearful, disgust, surprised)
- emotional_intensity (normal, strong). NOTE: There is no strong intensity for the 'neutral' emotion
- statement ("Kids are talking by the door", "Dogs are sitting by the door")
- repetition (1st repetition, 2nd repetition)
- actor (01 to 24)
- sex (M, F)
- channels (number of channels; 1 for mono, 2 for stereo audio)
- sample_width (number of bytes per sample; 1 means 8-bit, 2 means 16-bit)
- frame_rate (sampling frequency, in Hertz)
- frame_width (number of bytes for each frame; one frame contains a sample for each channel)
- length_ms (audio file length, in milliseconds)
- frame_count (number of frames in the audio file)
- intensity (loudness in dBFS (dB relative to the maximum possible loudness))
- zero_crossings_sum (sum of the zero-crossing rate)
- 'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the original audio signal)
- mfcc_ 'mean', 'std', 'min', 'max' (statistics of the Mel-Frequency Cepstral Coefficients)
- sc_ 'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the spectral centroid)
- stft_ 'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the stft chromagram)
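
The categorical attributes at the top of this list mirror the RAVDESS file naming convention, where each file name is made of seven two-digit codes (modality, vocal channel, emotion, intensity, statement, repetition, actor) and odd-numbered actors are male. The sketch below decodes such a name into those attributes. Whether the dataset columns were actually derived this way is an assumption, and `decode_ravdess_filename` is a hypothetical helper name.

```python
from pathlib import Path

# Code tables following the RAVDESS naming convention.
MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}
REPETITION = {"01": "1st repetition", "02": "2nd repetition"}


def decode_ravdess_filename(filename):
    """Map a name like '03-01-06-01-02-01-12.wav' to a dict of attributes."""
    parts = Path(filename).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 dash-separated codes, got {len(parts)}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "emotional_intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": REPETITION[repetition],
        "actor": actor,
        # Odd actor numbers are male, even ones are female.
        "sex": "M" if int(actor) % 2 == 1 else "F",
    }


record = decode_ravdess_filename("03-01-06-01-02-01-12.wav")
```

An unknown code raises a `KeyError`, which is a deliberate choice in this sketch so that malformed file names are noticed rather than silently mislabeled.
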

The repository covers the following topics:

- Fundamental concepts of data knowledge and discovery
- Data understanding
- Data preparation
- Clustering
- Classification
- Pattern Mining and Association Rules
- Regression