Using ML algorithms to curate a song playlist using similarity factors from a music dataset obtained from Spotify

tanaynayak/music-mood-classification-and-recommendation-system

Song Classification and Recommendation System

This project focuses on enhancing a raw dataset of songs by extracting detailed musical features, popularity metrics, and genre information using Spotify's URI data. The enriched dataset facilitates deeper analysis and the development of a song recommendation system based on various attributes of the songs.

Project Overview

The project automates the process of fetching detailed song features using the Spotify API and integrates these features into the original dataset. The key steps include:

  • Initial data loading from a CSV file.
  • Parallel feature extraction from Spotify URIs.
  • Data cleaning and preprocessing.
  • Sentiment analysis on song titles.
  • Transformation and scaling of features.
  • Generation of song recommendations based on the user's playlist.

Setup and Installation

# Clone the repository
git clone https://github.com/tanaynayak/music-mood-classification-and-recommendation-system.git

# Navigate to the project directory
cd music-mood-classification-and-recommendation-system

# Install dependencies (Assuming Python 3 and pip are installed)
pip install -r requirements.txt

Data Flow Diagram

graph LR
    A[Raw Dataset] -->|Extract URIs| B(URI to Features)
    B --> C{Parallel Processing}
    C -->|Feature Extraction| D[Enriched Dataset]
    D --> E[Data Preprocessing]
    E --> F[Sentiment Analysis]
    F --> G[Feature Transformation & Scaling]
    G --> H[Feature Set Creation]
    H --> I[Generate Recommendations]

Feature Extraction

  • Uses the uri.py script to extract features from Spotify URIs.
  • Features include musical attributes, popularity scores, and genres.
  • Implements parallel processing for efficient data handling.
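
The parallel extraction step can be sketched as follows. This is a minimal illustration, not the contents of uri.py: `fetch_features` is a hypothetical stand-in that returns placeholder values instead of calling the Spotify API, and the choice of a thread pool (rather than processes) is an assumption that suits network-bound lookups.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_features(uri):
    """Hypothetical stand-in for a Spotify API lookup.

    A real implementation would request audio features, popularity,
    and genres for the track; here we only parse the URI.
    """
    track_id = uri.rsplit(":", 1)[-1]
    return {"uri": uri, "track_id": track_id, "popularity": None, "genres": []}


def extract_all(uris, max_workers=8):
    """Fetch features for every URI concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URIs.
        return list(pool.map(fetch_features, uris))


uris = ["spotify:track:aaa111", "spotify:track:bbb222"]
rows = extract_all(uris)
```

Because `pool.map` keeps results in input order, the returned rows can be joined back onto the original dataset by position.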

Data Processing

  • Cleans and preprocesses the data for analysis.
  • Applies sentiment analysis to track names.
  • Transforms genre data using TF-IDF.
  • Scales numerical features for uniformity.
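
The genre transformation and scaling steps might look roughly like the sketch below. It uses only the standard library so it runs anywhere; the notebook may well use a library vectorizer and scaler instead. The smoothed IDF formula, the min-max scaler, and the sample genres and tempos are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute a TF-IDF matrix for a list of token lists.

    Returns (vocab, matrix); IDF is smoothed so no term gets zero weight.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab}
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[tok] * idf[tok] for tok in vocab])
    return vocab, matrix


def min_max_scale(values):
    """Rescale numbers to the [0, 1] range (assumed scaler)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative data: one genre list and one tempo per song.
genres = [["pop", "dance"], ["rock"], ["pop", "rock"]]
vocab, genre_matrix = tfidf(genres)
tempo_scaled = min_max_scale([90.0, 120.0, 150.0])
```

Rarer genres receive a higher weight than common ones, and scaling keeps features such as tempo from dominating the similarity computation simply because of their larger numeric range.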

Recommendation Algorithm

  • Summarizes user playlists into a feature vector.
  • Calculates cosine similarity between playlist vector and non-playlist songs.
  • Recommends songs with the highest similarity scores.
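
The three steps above can be sketched in a few lines. This is an illustration under stated assumptions: the playlist is summarized by averaging its song vectors (the source does not say which summary is used), and the song names and feature values are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(vectors):
    """Collapse a playlist into one vector by averaging (assumed summary)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def recommend(playlist_vectors, candidates, top_n=2):
    """Rank non-playlist songs by similarity to the playlist profile."""
    profile = summarize(playlist_vectors)
    scored = [(name, cosine(profile, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Illustrative feature vectors; real ones would come from the scaled feature set.
playlist = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.5]]
candidates = {
    "song_a": [0.9, 0.1, 0.5],
    "song_b": [0.0, 1.0, 0.0],
    "song_c": [0.5, 0.5, 0.5],
}
top = recommend(playlist, candidates)
```

Cosine similarity compares the direction of the vectors rather than their magnitude, so a song whose features have the same proportions as the playlist profile scores highly regardless of its overall scale.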

Usage

This project is structured around a Jupyter Notebook, which provides a detailed walkthrough of the data processing, feature extraction, and recommendation generation steps. To use this notebook:

  1. Ensure you have Jupyter installed. If not, you can install it using pip:
pip install notebook
  2. Start the Jupyter Notebook server from the command line:
jupyter notebook
  3. Navigate to the project directory in the Jupyter Notebook web interface.
  4. Open the Song_Classification_and_Recommendation_System.ipynb notebook.
  5. Run the cells sequentially to perform data loading, feature extraction, data processing, and to generate recommendations.

  • Each cell in the notebook is annotated with comments to guide you through the process and explain the purpose of each code block.
  • To execute a cell, select it and press Shift + Enter.

Note:

  • Make sure all the required dependencies are installed (refer to the Setup and Installation section).
  • Ensure you have access to the Spotify API and the necessary API keys if the feature extraction script requires them.

License

This project is licensed under the Apache License - see the LICENSE.md file for details.
