DM1 - Data Mining: Foundations

In this repository, my group member and I apply the basic Data Mining concepts taught by Prof. Riccardo Guidotti in the DM1 - Data Mining: Foundations course at Università di Pisa (academic year 2022/23) to the RAVDESS dataset.

Dataset - RAVDESS

The dataset was created from the RAVDESS dataset (https://zenodo.org/record/1188976) by extracting basic statistics (mean, std, min, max, etc.) from the original audio signal and from four transformations of it: the zero-crossing rate, the Mel-Frequency Cepstral Coefficients (MFCC), the spectral centroid, and the STFT chromagram. Features were extracted from the 2452 WAV files.
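To make the extraction step concrete, here is a minimal sketch of how such summary statistics and a zero-crossing count could be computed for one signal. It is not the code used to build the dataset: the function names `summary_stats` and `zero_crossings` are hypothetical, the signal is a synthetic sine wave rather than a RAVDESS recording, and the choice of population standard deviation and excess kurtosis is an assumption.

```python
import math
import statistics


def summary_stats(signal):
    """Return mean, std, min, max, kurtosis and skewness of a 1-D signal."""
    n = len(signal)
    mean = statistics.fmean(signal)
    # Population standard deviation (divide by n). Assumption: the source
    # does not say whether the population or the sample form was used.
    std = statistics.pstdev(signal)
    if std == 0:
        # A constant signal has no defined shape statistics; report zeros.
        skew, kur = 0.0, 0.0
    else:
        # Third standardized moment.
        skew = sum(((x - mean) / std) ** 3 for x in signal) / n
        # Excess kurtosis: fourth standardized moment minus 3.
        kur = sum(((x - mean) / std) ** 4 for x in signal) / n - 3.0
    return {
        "mean": mean,
        "std": std,
        "min": min(signal),
        "max": max(signal),
        "kur": kur,
        "skew": skew,
    }


def zero_crossings(signal):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))


# Synthetic stand-in for audio: 5 full cycles of a sine wave, 1000 samples.
signal = [math.sin(2 * math.pi * 5 * t / 1000 + 0.1) for t in range(1000)]
stats = summary_stats(signal)
crossings = zero_crossings(signal)
```

For real recordings, the same statistics would be applied to the decoded waveform and to each transformed representation (for example the MFCC matrix flattened to one sequence), typically with an audio library rather than plain Python lists.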

Features Description

modality (audio-only)
vocal_channel (speech, song)
emotion (neutral, calm, happy, sad, angry, fearful, disgust, surprised)
emotional_intensity (normal, strong). NOTE: There is no strong intensity for the 'neutral' emotion
statement ("Kids are talking by the door", "Dogs are sitting by the door")
repetition (1st repetition, 2nd repetition)
actor (01 to 24)
sex (M, F)
channels (number of channels; 1 for mono, 2 for stereo audio)
sample_width (number of bytes per sample; 1 means 8-bit, 2 means 16-bit)
frame_rate (sampling frequency, in Hertz)
frame_width (number of bytes per frame; one frame contains one sample for each channel)
length_ms (audio file length, in milliseconds)
frame_count (number of frames in the audio file)
intensity (loudness in dBFS, i.e. dB relative to the maximum possible loudness)
zero_crossings_sum (sum of the zero-crossing rate)
'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the original audio signal)
mfcc_ 'mean', 'std', 'min', 'max' (statistics of the Mel-Frequency Cepstral Coefficients)
sc_ 'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the spectral centroid)
stft_ 'mean', 'std', 'min', 'max', 'kur', 'skew' (statistics of the stft chromagram)
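The audio-format features in the list above (channels, sample_width, frame_rate, frame_width, length_ms, frame_count, intensity) are related to each other. The sketch below shows those relations on a synthetic WAV built in memory with Python's standard `wave` module. It rests on several assumptions: the tone, its 48000 Hz rate and its amplitude are made up for illustration, and intensity is computed as RMS level relative to full scale, which is one common definition of dBFS and may differ from the tool actually used for the dataset.

```python
import io
import math
import struct
import wave

# Build a synthetic mono, 16-bit WAV in memory (a stand-in for a real file).
frame_rate = 48000  # assumed sampling rate for this illustration
samples = [
    int(16000 * math.sin(2 * math.pi * 440 * t / frame_rate))
    for t in range(frame_rate // 2)  # half a second of a 440 Hz tone
]
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)   # mono
    writer.setsampwidth(2)   # 2 bytes per sample = 16-bit
    writer.setframerate(frame_rate)
    writer.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# Read the header fields back, as one would for a file on disk.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    channels = reader.getnchannels()
    sample_width = reader.getsampwidth()
    rate = reader.getframerate()
    frame_count = reader.getnframes()
    raw = reader.readframes(frame_count)

# One frame holds one sample per channel.
frame_width = channels * sample_width
# Duration follows from the frame count and the sampling rate.
length_ms = 1000 * frame_count / rate

# dBFS: RMS amplitude relative to the largest value the sample width allows.
values = struct.unpack(f"<{frame_count * channels}h", raw)
rms = math.sqrt(sum(v * v for v in values) / len(values))
full_scale = 2 ** (8 * sample_width - 1)
intensity = 20 * math.log10(rms / full_scale)
```

Because dBFS is measured against the maximum possible level, intensity is zero or negative for any valid signal, and quieter recordings give more negative values.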

Learning Outcomes

Fundamental concepts of data knowledge and discovery.
Data understanding
Data preparation
Clustering
Classification
Pattern Mining and Association Rules
Regression

Collaborators

Hafiz Muhammad Umer
Nimra Nawaz

Folders and files

Classification - DT, KNN, Naive Bayes
Clustering - K-meas, DBSCAN, Hierarchical
Data Understanding and Preparation
Dataset
Pattern Mining - Apriori, FP-Growth
Regression - Linear, Lasso, Ridge
DM1-Report-Final.pdf
README.md
