The challenge requirements were as follows:
- Show basic information about the data: number of samples per class, data description, missing data, etc.
- Use three classification algorithms.
- Compare the algorithms' performance in terms of accuracy, precision, recall and F1 score.
- Plot a chart that compares the implemented classifiers.
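
As a rough illustration of the first requirement, the basic information can be pulled out with a few pandas calls. This is only a sketch: the challenge dataset is not shown here, so the Iris dataset bundled with scikit-learn is used as a stand-in, and the `target` column name comes from that stand-in rather than from the original data.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Stand-in dataset (assumption): the real challenge data is not shown here.
# The frame holds the feature columns plus a "target" column with the class label.
df = load_iris(as_frame=True).frame

# Number of samples per class, ordered by class label.
samples_per_class = df["target"].value_counts().sort_index()

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Summary statistics (count, mean, std, min, quartiles, max) per column.
description = df.describe()

print(samples_per_class)
print(missing_per_column)
print(description)
```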
The algorithms I chose were support vector machine (SVM), stochastic gradient descent (SGD) and random forest. Support vector machines are binary classifiers by nature and do not scale well with dataset size, so scikit-learn applies the one-versus-one method to multiclass problems by default: each pairwise classifier is trained only on the samples of the two classes it separates. With that in mind, I chose to force the one-versus-rest method as well.
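
The comparison described above could look roughly like the sketch below. It is not the exact code from the challenge: the Iris dataset is again a stand-in, the train/test split, feature scaling, default hyperparameters and macro averaging are all assumptions, and wrapping `SVC` in `OneVsRestClassifier` is just one way to force one-versus-rest (the `decision_function_shape` parameter of `SVC` only changes the shape of the decision function, not the training strategy).

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data and split (assumptions, not the challenge setup).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

classifiers = {
    # SVC trains one-versus-one internally for multiclass targets.
    "SVM (one-vs-one)": make_pipeline(StandardScaler(), SVC()),
    # The wrapper trains one binary SVC per class instead.
    "SVM (one-vs-rest)": make_pipeline(StandardScaler(), OneVsRestClassifier(SVC())),
    "SGD": make_pipeline(StandardScaler(), SGDClassifier(random_state=0)),
    "Random forest": RandomForestClassifier(random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    # Macro averaging weights every class equally (an assumed choice).
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

The `results` dictionary has one row of metrics per classifier, so it can be loaded into a pandas `DataFrame` and drawn as a grouped bar chart for the comparison plot.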
I also had to record a video showing each step of the solution and the reasoning behind it. You can find it by using this link.