Welcome to the Railway Tracker! Train data is extracted via an ETL pipeline from National Rail and Realtime Trains, then used to create three different outputs. These outputs provide information on a station's cancellations, delays, and departures/arrivals. Furthermore, any incidents trigger an alert notification to all subscribed users. The code for these outputs is in the folders: dashboard, national_rail, and pdf_report. The result is our tracker, which is provided to consumers as a service.
To learn more about our project, check out our presentation!
Here is what our dashboard looks like!
Railway Tracker Dashboard Demo

- Two full data pipelines hosted in the cloud.
- Automatic alerts for delays/cancellations for certain train operators.
- Regular PDF summary reports directly emailed to subscribers.
- Visualisation of stations' data on the dashboard.
This diagram shows how the data extracted from the APIs is stored in the database. We decided to use a third normal form (3NF) relational database because it suits the requirements of the tracker.
This diagram shows how the different AWS services work and interact with each other to meet the requirements of the tracker.
Information regarding each section of the architecture diagram is not covered here. Instead, clicking on the bullet points below will take you to the relevant README located in that section's directory, where you can find all the information you need.
The following languages/software are required for this project. Items marked as optional are only required if you want to host this tracker on the cloud.
- Python
- Bash
- Docker (Optional)
There are separate directories for each part of the project. To run the scripts in each folder, you will need to install the required libraries. This can be done using the commands provided below:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
All the scripts require only basic commands to execute; the command used depends on the software. Ensure that you are in the directory of the file you want to run before using these commands.
# Python
python3 "example_script.py"
# Bash
bash "example_script.sh"
# Terraform
terraform init
terraform apply
# type "yes" when prompted to confirm
# Docker
docker build -t "example_image" .
docker run --env-file .env -t "example_image:tag"
Note that the majority of scripts use environment variables. Make sure to create your own .env file and fill out all the variables required to run the code successfully. Smaller READMEs in each folder go into more depth on how to use that specific part of the repository.
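As an illustration, a .env file is a plain list of KEY=VALUE pairs placed in the same directory as the script. The variable names below are purely hypothetical; check each folder's README for the exact names its scripts expect.

```
# .env — example only; these variable names are hypothetical
DB_HOST=your-database-host
DB_USER=your-username
DB_PASSWORD=your-password
```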
There are several directories within the repository to keep the project organised. Each directory is listed below with a brief explanation of what it contains. Clicking on a link will redirect you to that directory's README, which provides more information.
- archive - Contains scripts that archive month-old data in the database to avoid over-inflation of data storage.
- dashboard - Contains all the scripts involved in creating and hosting the dashboard.
- database - Contains the schema used to create the RDS hosted on AWS as well as any auxiliary functions.
- diagrams - Only contains the .png files of the diagrams used in this README.
- national_rail - Contains all the scripts involved in creating the ETL pipeline for tracking incidents, as well as the scripts required to send emails/SMS notifying subscribers when an incident has occurred.
- pdf_report - Contains all the scripts required to send registered subscribers a daily email with a PDF report on the previous day's data.
- realtime_trains - Contains all the scripts involved in creating the ETL pipeline for importing station data into the database.
- terraform - Contains the main Terraform script used to host the tracker on the cloud.
This tracker is designed to be hosted entirely on the cloud so that it runs automatically. The Python scripts can still be run locally, but the Terraform scripts are included in the repository if you also want to host this service on the cloud. The cloud provider used is AWS.
Common issues which people face are:

- Not setting up environment variables correctly. Ensure that you have done the following:
  - Create a .env file for Python scripts or a terraform.tfvars file for Terraform scripts.
  - Create the file in the same directory as the script you want to run.
  - Make sure the variable names in your file match the ones used in the script you want to run.
- Not installing the required libraries before running the script:
  - Go to the directory of the script you want to run.
  - Create a .venv and activate it.
  - Install the requirements into your .venv.
For any other problems, reach out and contact us so we can support you further.
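If you suspect your environment variables are the problem, a quick sanity check like the sketch below can confirm which ones are unset before a script fails mid-run. The variable names here are hypothetical; substitute the names the script you are running actually expects.

```python
import os

# Hypothetical variable names for illustration only; see the README in
# each directory for the exact names its scripts expect.
REQUIRED_VARS = ["DB_HOST", "DB_USER", "DB_PASSWORD"]


def missing_env_vars(required):
    """Return the names of any required environment variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("All required environment variables are set.")
```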
| Name | Github Profile |
|---|---|
| Fariha Choudhury | FarihaChoudhury |
| Faris Abulula | A-Faris |
| Freddy Martinez Rosero | fm1psy |
| Zhi Loh | Zhi-704 |