Workshop on rental prices in Berlin.

The three notebooks guide you through the following data science project steps:

  1. How to clean the data using your creativity
  2. How to extract data from the web
  3. How to use the Google API to calculate trip durations between two addresses.

Data

Data is stored in data/. final_rent_data.txt is the data you get by web scraping a website.
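
Scraped data usually arrives as raw strings that need normalizing before analysis. The column layout of final_rent_data.txt is not described here, so the following is only a minimal sketch of one common cleaning step, assuming (hypothetically) that rent values appear as German-formatted strings such as `1.200,50 €`. The function name `parse_german_price` is made up for illustration and is not taken from the notebooks.

```python
import re
from typing import Optional


def parse_german_price(raw: str) -> Optional[float]:
    """Convert a German-formatted price string such as '1.200,50 €' to a float.

    Hypothetical helper: the real data may use a different format.
    Returns None when the string contains no usable number.
    """
    # Keep only digits, dots and commas; this drops currency symbols and spaces.
    cleaned = re.sub(r"[^\d.,]", "", raw)
    if not re.search(r"\d", cleaned):
        return None
    # German convention: '.' separates thousands, ',' marks the decimals.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        # More than one decimal comma, or similar malformed input.
        return None
```

Returning `None` instead of raising keeps malformed rows visible, so you can count and inspect them rather than silently dropping them.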

Notebooks

You will find your notebooks in notebooks/. If a notebook feels too complicated or you get stuck on a step, you can find a solution notebook in solutions/.

Work

  1. Clean the data using your creativity. Use cleaning_data_student_notebook and data/final_rent_data.txt (the data you get by web scraping). The output of your work should be a file similar to cleaned_data.txt.
  2. Scrape Wikipedia (use webscraping_student_notebook).
  3. Use the Google Distance Matrix API to calculate the trip duration.
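
For the trip-duration step, a request to the Google Distance Matrix API might be assembled and its JSON response read roughly as follows. This is a sketch, not the notebook's solution: the helper names `build_request_url` and `extract_duration_seconds` are hypothetical, the `transit` default is an assumption, and you need your own API key. No network call is made here; pass the URL to the HTTP client of your choice.

```python
from typing import Optional
from urllib.parse import urlencode

# Public JSON endpoint of the Google Maps Distance Matrix API.
DISTANCE_MATRIX_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"


def build_request_url(origin: str, destination: str, api_key: str,
                      mode: str = "transit") -> str:
    """Build the request URL for one origin/destination pair (hypothetical helper)."""
    params = {
        "origins": origin,
        "destinations": destination,
        "mode": mode,  # assumption: public transport; "driving" etc. also exist
        "key": api_key,
    }
    return f"{DISTANCE_MATRIX_URL}?{urlencode(params)}"


def extract_duration_seconds(response: dict) -> Optional[int]:
    """Read the trip duration in seconds from a parsed JSON response.

    Returns None if the request or the single origin/destination element failed.
    """
    if response.get("status") != "OK":
        return None
    try:
        element = response["rows"][0]["elements"][0]
    except (KeyError, IndexError):
        return None
    if element.get("status") != "OK":
        return None
    return element["duration"]["value"]
```

The API reports a status for the whole request and another for each origin/destination element, which is why both are checked before reading `duration`.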

Relevant blogposts:

Web scraping in Python: Tips and Tricks

Disclaimer

Depending on the site, web scraping could breach that website's terms of service. Before scraping, please read the website's robots.txt file. Any code provided in the notebook tutorials is for illustration and learning purposes only.
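
One way to check a robots.txt file programmatically is Python's standard `urllib.robotparser` module. The sketch below parses robots.txt content that you have already downloaded. The helper name `is_allowed` and the sample rules are illustrative assumptions, not part of the workshop notebooks. A robots.txt check does not replace reading the site's terms of service.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content permits fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for page in ("https://example.com/public/page",
                 "https://example.com/private/page"):
        print(page, is_allowed(SAMPLE_ROBOTS_TXT, page))
```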