This project builds and evaluates machine learning models that predict whether an individual's income exceeds $50,000 per year, using the Census Income dataset. Two models are used: a decision tree classifier and a random forest classifier.
The objective is to predict whether an individual earns more than $50,000 per year using classification techniques. Two models are trained and compared: a decision tree, which is a single tree, and a random forest, which is an ensemble of trees. The project also serves as an example of handling an imbalanced dataset and of using ensemble methods for classification.
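As a rough illustration of the comparison described above, the sketch below trains both model types on synthetic, imbalanced data and reports an F1 score for each. It is not the project's actual training code. The synthetic data from `make_classification`, the 75/25 class ratio, the `class_weight="balanced"` setting, the hyperparameters, and the choice of F1 as the metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the census data (assumption): about 3 negatives per positive.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.75, 0.25],
    random_state=0,
)

# Stratify so the train and test splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency,
# one common way to compensate for imbalance (hyperparameters are illustrative).
models = {
    "decision_tree": DecisionTreeClassifier(
        max_depth=5, class_weight="balanced", random_state=0
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
}

# Fit each model and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
```

F1 is used here instead of accuracy because, on imbalanced data, a model that always predicts the majority class can still reach high accuracy.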
The dataset used is the Census Income dataset (also known as the "Adult" dataset), which is available from the UCI Machine Learning Repository. Its features include:
- age
- workclass
- education
- marital-status
- occupation
- relationship
- race
- sex
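Most of the features listed above are categorical, so they typically need to be encoded as numbers before a tree-based scikit-learn model can use them. The sketch below shows one possible approach, one-hot encoding with `pandas.get_dummies`. The three rows are invented for illustration and are not taken from the dataset, and the encoding choice is an assumption rather than a description of this project's preprocessing.

```python
import pandas as pd

# Hypothetical rows, invented for illustration only (not real records).
sample = pd.DataFrame(
    {
        "age": [39, 50, 28],
        "workclass": ["State-gov", "Private", "Private"],
        "education": ["Bachelors", "HS-grad", "Masters"],
        "marital-status": ["Never-married", "Married-civ-spouse", "Married-civ-spouse"],
        "occupation": ["Adm-clerical", "Exec-managerial", "Prof-specialty"],
        "relationship": ["Not-in-family", "Husband", "Wife"],
        "race": ["White", "Black", "White"],
        "sex": ["Male", "Male", "Female"],
    }
)

# Every listed feature except age is categorical.
categorical = [col for col in sample.columns if col != "age"]

# One-hot encode: each category value becomes its own indicator column.
encoded = pd.get_dummies(sample, columns=categorical)

print(encoded.shape)
print(list(encoded.columns))
```

One-hot encoding increases the column count with the number of distinct category values, so high-cardinality features may call for a different encoding.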
To run this project, you need Python installed along with the libraries listed in requirements.txt. Install them with the following command:
pip install -r requirements.txt