Project developed during the NESS internship challenge. It uses the multiclass Website Phishing classification dataset from Kaggle and fine-tunes a random forest model, reaching about 90 percent precision, recall, and F₁ score.

HyanBatista/ness-internship-challenge

NESS internship challenge

The challenge requirements were as follows:

  1. Show basic information about the data: number of samples per class, data description, missing data, etc.
  2. Use three classification algorithms.
  3. Compare the algorithms' performance in terms of accuracy, precision, recall and F1 score.
  4. Plot a chart that compares the implemented classifiers.
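To make the metrics in requirement 3 concrete, here is a minimal sketch that computes accuracy and macro-averaged precision, recall and F1 by hand. The helper name `macro_scores`, the toy labels `-1`, `0`, `1`, and the choice of macro averaging are assumptions made for illustration; the project itself may use a different averaging scheme or rely on scikit-learn's metric functions.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Hypothetical helper for illustration only, not taken from the project.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels (assumed), not real predictions from the project.
y_true = [-1, -1, 0, 0, 1, 1]
y_pred = [-1, 0, 0, 0, 1, -1]
scores = macro_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on each classifier's test-set predictions gives one row of numbers per model, which is the kind of table a comparison chart can be drawn from.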

The algorithms I chose were a support vector machine (SVM), stochastic gradient descent (SGD) and random forest. Because SVMs are inherently binary classifiers and do not scale well with dataset size, scikit-learn's SVM classifier handles multiclass problems with the one-versus-one strategy by default. With that in mind, I chose to force the one-versus-rest strategy as well, so that both could be compared.
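The difference between the two multiclass strategies can be sketched as follows. This is one possible way to force each strategy, using scikit-learn's `OneVsOneClassifier` and `OneVsRestClassifier` wrappers around `SVC`; it is not necessarily how the project does it. The data is synthetic and stands in for the real dataset, and the four classes, feature count and hyperparameters are arbitrary assumptions chosen so that the two strategies train a different number of binary models.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic four-class data (assumption), not the Kaggle dataset.
X, y = make_classification(
    n_samples=600,
    n_features=9,
    n_informative=6,
    n_redundant=0,
    n_classes=4,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# One-versus-one: one binary SVM per pair of classes, k * (k - 1) / 2 models.
ovo = OneVsOneClassifier(SVC(kernel="rbf")).fit(X_train, y_train)

# One-versus-rest: one binary SVM per class, k models.
ovr = OneVsRestClassifier(SVC(kernel="rbf")).fit(X_train, y_train)

print("one-versus-one models: ", len(ovo.estimators_))
print("one-versus-rest models:", len(ovr.estimators_))
print("one-versus-one accuracy: ", ovo.score(X_test, y_test))
print("one-versus-rest accuracy:", ovr.score(X_test, y_test))
```

With k classes, one-versus-one fits more models, each on the subset of samples belonging to its two classes, while one-versus-rest fits fewer models, each on the full training set.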

I also had to record a video showing each step of the solution and the reasoning behind it. You can find it by using this link.
