Titanic-Machine-Learning

Applying different machine learning algorithms on the famous Titanic dataset

Source of the dataset: https://www.kaggle.com/c/titanic/data

The purpose of this repository is to demonstrate different classification algorithms on the same dataset. Since it is a well-known dataset, I did not perform any exploratory data analysis. More notebooks will be added in the future.

Let's have a look at the dataset.

| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
| 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
| 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
| 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
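To make the column layout concrete, here is a minimal sketch that parses the five rows shown above. It uses only the Python standard library so it runs without the Kaggle download; the `SAMPLE` text and the `load_rows` helper are illustrative assumptions, not code from the notebooks, and empty `Cabin` fields stand in for the `NaN` values.

```python
import csv
import io

# The five sample rows from the table above, written out as CSV text.
# Names are quoted because they contain commas; an empty Cabin field
# stands in for NaN. The second name is truncated exactly as in the table.
SAMPLE = '''PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
'''


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for col in ("PassengerId", "Survived", "Pclass", "SibSp", "Parch"):
            row[col] = int(row[col])
        for col in ("Age", "Fare"):
            # Missing numeric values become None instead of raising.
            row[col] = float(row[col]) if row[col] else None
        row["Cabin"] = row["Cabin"] or None
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
# Share of survivors among the sample rows only, not the full dataset.
survival_rate = sum(r["Survived"] for r in rows) / len(rows)
print(len(rows), survival_rate)
```

With the full `train.csv` from Kaggle, the same helper could be fed the file contents instead of `SAMPLE`.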


Variable Notes (Source: Kaggle)

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).

parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
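The variable notes above can be turned into small helper functions. The following is only a sketch: the names `PCLASS_SES`, `family_size`, and `age_is_estimated` are hypothetical and do not come from the notebooks, and treating ages below 1 as "not estimated" is an assumption, since the notes do not say how an infant age of 0.5 should be read.

```python
# Mapping from the pclass note: ticket class as a proxy for socio-economic status.
PCLASS_SES = {1: "Upper", 2: "Middle", 3: "Lower"}


def family_size(sibsp, parch):
    """Relatives aboard (siblings/spouses plus parents/children) plus the passenger."""
    return sibsp + parch + 1


def age_is_estimated(age):
    """Return True when the age has the form xx.5 described in the age note.

    Assumption: ages below 1 are fractional by definition, so they are
    never flagged as estimated here. Missing ages (None) are not flagged.
    """
    return age is not None and age >= 1 and age % 1 == 0.5


# First sample passenger: Pclass 3, SibSp 1, Parch 0, Age 22.0
print(PCLASS_SES[3], family_size(1, 0), age_is_estimated(22.0))
```

A child travelling only with a nanny has `parch=0`, so `family_size` would report such a passenger as travelling alone unless `sibsp` is nonzero.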

History:

2019.12.22.: Random forest classification: Titanic_random_forest_classification.ipynb

2019.12.26.: K-nearest neighbors classification: Titanic_K-nearest_neighbors_classification.ipynb