Building a movie recommendation system based on the genres column in the Kaggle TMDB data set

hassansahhin/Genres-Analysis---EDA-Clustring

Genres-Analysis---EDA-Clustring

TMDB data set

  • Kaggle data set: the movie database contains over 10,000 movies, with several pieces of information on each. New columns:

    homepage id original_title overview popularity production_companies production_countries release_date spoken_languages status tagline vote_average

  • The CSV files can be found Here

Overview

  • The notebook is built mainly on the genres column and uses it to create well-polished EDA graphs with the matplotlib package
    import matplotlib.pyplot as plt

  • The second task is to create a movie recommendation system based on the movies' genres using the sklearn KMeans algorithm
    from sklearn.cluster import KMeans
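
The genre-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the movie titles, the `recommend` helper, and the choice of two clusters are all hypothetical, and the genres are assumed to have already been parsed into lists of genre names.

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: titles mapped to already-parsed genre lists.
movies = {
    "Space Battle": ["Action", "Science Fiction"],
    "Robot Uprising": ["Action", "Science Fiction"],
    "Galaxy Patrol": ["Action", "Science Fiction", "Adventure"],
    "Love in Paris": ["Romance", "Comedy"],
    "Wedding Mix-up": ["Romance", "Comedy"],
    "Summer Fling": ["Romance", "Comedy", "Drama"],
}
titles = list(movies)

# One-hot encode the genres: one row per movie, one column per genre.
encoder = MultiLabelBinarizer()
X = encoder.fit_transform(list(movies.values()))

# Genre frequencies, the kind of summary an EDA bar chart would show.
genre_counts = {g: int(c) for g, c in zip(encoder.classes_, X.sum(axis=0))}

# Group movies with similar genre vectors (cluster count is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)


def recommend(title):
    """Return the other movies that share a cluster with `title`."""
    label = labels[titles.index(title)]
    return [t for t, l in zip(titles, labels) if l == label and t != title]


print(genre_counts)
print(recommend("Space Battle"))
```

With matplotlib imported as `plt`, the counts could then be drawn with something like `plt.bar(list(genre_counts), list(genre_counts.values()))`. On the real data set the number of clusters would need tuning rather than being fixed at two.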

Contribution

You are most welcome to fork my notebook and update my code. Below are some points of inspiration that could be worked on:

  • Can you categorize the films by type, such as animated or not? We don't have explicit labels for this, but it should be possible to build them from the crew's job titles.
  • How sharp is the divide between major film studios and the independents? Do those two groups fall naturally out of a clustering analysis or is something more complicated going on?

Original notebook

The original notebook was built as a Kaggle kernel. Please visit my notebook Here and upvote it if you find something useful.
