Gathering air pollution data and forecasting
The purpose of this project is to forecast air pollution using data from sensors located in different parts of Poland. The solution is built on Amazon Web Services.
We use data from the Chief Inspectorate of Environmental Protection of Poland (GIOŚ, http://www.gios.gov.pl/). It publishes hourly measurements from sensors spread across the country. As of 21.10.2019, the sensors measure 7 types of pollution: PM10, PM2.5, SO2, NO2, CO, C6H6, O3. The data is served in JSON format and can be accessed via HTTP requests.
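As a minimal sketch of reading such data, the snippet below downloads a JSON document over HTTP and flattens it into time-ordered readings. The payload shape (`key` plus a `values` list of `date`/`value` entries) and the function names are assumptions made for illustration, not a description of the actual GIOŚ response, so check the API documentation before relying on them.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=30):
    """Download a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def parse_measurements(payload):
    """Return (pollutant, [(date, value), ...]) sorted by date.

    Assumed payload shape (hypothetical):
    {"key": "PM10", "values": [{"date": "...", "value": 23.5}, ...]}
    Readings with a null value are skipped.
    """
    rows = [
        (item["date"], float(item["value"]))
        for item in payload.get("values", [])
        if item.get("value") is not None
    ]
    return payload.get("key"), sorted(rows)
```

With a real endpoint URL for one sensor, `parse_measurements(fetch_json(url))` would give the pollutant name and its hourly readings, under the assumptions above.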
The architecture of the system responsible for downloading that data is presented below.
Once a day, a CloudWatch event triggers a Lambda function that starts an EC2 instance. The instance runs a startup script that downloads the measurement data from GIOŚ. When the download finishes, the data is saved to an S3 bucket.
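The Lambda step could look roughly like the sketch below, which uses the boto3 `start_instances` call. The `INSTANCE_ID` environment variable and the helper name `start_instance` are hypothetical; the project's actual handler may be organized differently.

```python
import os


def start_instance(ec2_client, instance_id):
    """Ask EC2 to start one instance and return its reported state name."""
    response = ec2_client.start_instances(InstanceIds=[instance_id])
    return response["StartingInstances"][0]["CurrentState"]["Name"]


def lambda_handler(event, context):
    """Entry point invoked by the scheduled CloudWatch event."""
    import boto3  # provided by the AWS Lambda Python runtime

    # Hypothetical configuration: the instance ID is read from the environment.
    instance_id = os.environ["INSTANCE_ID"]
    state = start_instance(boto3.client("ec2"), instance_id)
    return {"instance_id": instance_id, "state": state}
```

Passing the client in as an argument keeps the EC2 call easy to exercise locally with a stub. The function's IAM role would need permission for `ec2:StartInstances`.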
Forecasting time series with the Facebook Prophet model.
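A minimal sketch of a Prophet forecast for one sensor is shown here. It assumes the measurements are available as `(timestamp, value)` pairs and that `pandas` and `prophet` are installed; the function names and the 24-hour horizon are illustrative choices, not the project's actual settings.

```python
def to_prophet_rows(measurements):
    """Convert (timestamp, value) pairs into the ds/y records Prophet expects."""
    return [{"ds": ts, "y": value} for ts, value in sorted(measurements)]


def forecast_next_hours(measurements, hours=24):
    """Fit Prophet on hourly history and predict the next `hours` values."""
    import pandas as pd
    from prophet import Prophet  # older releases: from fbprophet import Prophet

    history = pd.DataFrame(to_prophet_rows(measurements))
    history["ds"] = pd.to_datetime(history["ds"])

    model = Prophet()
    model.fit(history)

    # Extend the timeline by `hours` hourly steps beyond the last observation.
    future = model.make_future_dataframe(periods=hours, freq="h")
    prediction = model.predict(future)
    return prediction[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(hours)
```

The returned frame holds the point forecast (`yhat`) with its uncertainty interval for each future hour.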
- set up a Dask cluster on AWS and run forecasts on it
- experiment with different time series models
- gather actual weather data and weather forecasts and use them to build features for time series modeling
- prepare forecast visualizations