The focus of this project is to provide an example machine learning system and its different components. The data for this project comes from the Titanic dataset, chosen for the availability of different modeling techniques and the extensive coverage of the use case. The background information, along with various notebooks, can be found at the following link: Kaggle Titanic. The following sections go over what is required to get this project up and running, explain the purpose behind some of the project structure, and describe what is inside each of the folders.
The following applications need to be installed as part of the project. Python is not installed directly onto the machine you are using but is used inside the Docker containers. This is primarily because I use an M1 MacBook Air with 256 GB of storage, and I want to avoid installing as much software on the system as possible. Pushing the Python install into Docker lets me use multiple versions without dependency issues.
- IDE of your choosing
- Docker
- Terraform
- Docker Compose
The project is broken up into separate components, and under each of these components all related code and tests for that specific piece of the project are included. For example, if a developer needs to work on the backend API, they can move over into the API folder. The components are:
- Application Programming Interface (API)
- Notebooks
- Infrastructure as Code (IAC)
- Data
- Train
- .github/workflow
## Application Programming Interface (API)
Below are the APIs associated with the project.
The API is built as a web service to be hosted using the Flask framework. The following Docker command builds the API container with the requirements located in the API folder:
```
docker build -t titanic/api -f API/Dockerfile .
```
When specific packages are needed for the API, the adjustments can be made in the requirements.txt file in the API folder.
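As an illustration, a minimal requirements.txt for a Flask-based prediction service might look like the lines below; the exact packages and version pins here are assumptions, not the project's actual dependencies.

```
flask==2.2.5
pandas==1.5.3
scikit-learn==1.2.2
```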
The following command will start the container in interactive mode:
```
docker container run -it titanic/api
```
This API is built similar to a CRUD (create, read, update, delete) service.
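As a rough sketch, a Flask prediction endpoint for this kind of service could look like the following. The /predict route, the model.pkl file, and the response shape are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch of a Flask prediction endpoint; the /predict route,
# model.pkl file, and response shape are illustrative assumptions.
import pickle

import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Assumes a trained model was serialized into the image at build time.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Wrap the JSON body in a one-row DataFrame of passenger features.
    passenger = pd.DataFrame([request.get_json()])
    prediction = model.predict(passenger)
    return jsonify({"Survived": int(prediction[0])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```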
In the API folder, a few tests are included that cover the passenger fields: PassengerId, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked.
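For example, a client request using those fields could look like the sketch below. It assumes the container was started with a published port (e.g. -p 5000:5000) and serves the hypothetical /predict route from the sketch above; the passenger values are the first row of the Kaggle test set.

```python
# Hypothetical client call; assumes the API container was started with
# `docker container run -p 5000:5000 titanic/api` and serves /predict.
import requests

passenger = {
    "PassengerId": 892,
    "Pclass": 3,
    "Name": "Kelly, Mr. James",
    "Sex": "male",
    "Age": 34.5,
    "SibSp": 0,
    "Parch": 0,
    "Ticket": "330911",
    "Fare": 7.8292,
    "Cabin": None,
    "Embarked": "Q",
}

response = requests.post("http://localhost:5000/predict", json=passenger)
print(response.json())
```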
## Notebooks
The Notebooks folder is where the steps involving exploratory analysis occur. The commands to start up this component are listed below.
To build the Docker image:
```
docker build -t titanic/eda -f Notebooks/DockerFile .
```
To run the notebooks:
```
docker container run -p 8888:8888 -v /Users/andrewbecker/Documents/Code_Projects/Titanic/Train:/Notebooks titanic/eda
```
Once the container is up, Jupyter should be reachable at http://localhost:8888.
As of this moment, the data is downloaded locally and tracked by Git. This could be alleviated by having the notebook container pull the data from Kaggle directly when it is built, instead of keeping a local copy.
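A minimal sketch of that pull, using the kaggle Python package, is shown below. It assumes Kaggle API credentials are available inside the container (e.g. a mounted ~/.kaggle/kaggle.json), and the Data target path is an assumption.

```python
# Hypothetical sketch: fetch the Titanic data during the container build
# instead of tracking it in Git. Assumes Kaggle API credentials are
# available, e.g. via a mounted ~/.kaggle/kaggle.json.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()

# Downloads the competition files (train.csv, test.csv, etc.) as a zip
# into the Data directory.
api.competition_download_files("titanic", path="Data", quiet=False)
```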