The three notebooks will guide you through the following steps of a data science project:
- How to clean the data using your creativity
- How to extract data from the web
- How to use the Google API to calculate the trip duration between two addresses
Data is stored in data/. final_rent_data.txt is the data you get by web scraping some website.
You will find your notebooks in notebooks/. If you feel it is too complicated or you get stuck on some step, you can find a solution notebook in solutions/.
- Clean the data using your creativity. Use cleaning_data_student_notebook and data/final_rent_data.txt (the data you get by web scraping). The output of your work should be a file similar to cleaned_data.txt.
- Web scrape Wikipedia (use webscraping_student_notebook).
- Use the Google Distance Matrix API to calculate the trip duration.
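For the trip duration step, the following is a minimal sketch rather than the notebook's actual solution. It assumes the JSON endpoint of the Google Distance Matrix API and its `rows` / `elements` / `duration` response layout. The helper names `build_request_url` and `trip_duration_seconds`, the placeholder addresses, the `YOUR_API_KEY` string, and the sample response are all hypothetical and only there for illustration.

```python
import json
from urllib.parse import urlencode

# JSON endpoint of the Distance Matrix API (assumed here; check the current docs).
BASE_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"


def build_request_url(origin, destination, api_key):
    """Return the request URL for a single origin/destination pair."""
    params = {"origins": origin, "destinations": destination, "key": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


def trip_duration_seconds(response):
    """Extract the trip duration in seconds from a parsed JSON response.

    Returns None when the request or the route lookup did not succeed.
    """
    if response.get("status") != "OK":
        return None
    element = response["rows"][0]["elements"][0]
    if element.get("status") != "OK":
        return None
    return element["duration"]["value"]


# Hand-written sample response (not real API output), trimmed to the fields used above.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "OK",
  "rows": [{"elements": [{"status": "OK",
                          "duration": {"text": "25 mins", "value": 1500}}]}]
}
""")

url = build_request_url("Address A", "Address B", "YOUR_API_KEY")
duration = trip_duration_seconds(SAMPLE_RESPONSE)
```

To query the live service, the URL could be fetched with `urllib.request.urlopen(url)` and the body parsed with `json.load`. A valid API key is required for that, so the sketch works on a sample response instead.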
Web scraping in Python: Tips and Tricks
## Disclaimer

Depending on the site, web scraping could lead to you breaching the terms of service of that website. Before web scraping, please read the robots.txt file of the website. Any code provided in the notebook tutorials is for illustration and learning purposes only.
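As a rough illustration of checking robots.txt before scraping, the sketch below uses `urllib.robotparser` from the Python standard library. The rules in `SAMPLE_ROBOTS_TXT`, the `my-scraper` user agent, and the example.com URLs are made up for the example and do not describe any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed offline so no request is made.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# can_fetch() reports whether the given user agent may request the URL.
allowed_public = parser.can_fetch("my-scraper", "https://example.com/wiki/Page")
allowed_private = parser.can_fetch("my-scraper", "https://example.com/private/data")
```

For a live site, `parser.set_url(...)` followed by `parser.read()` downloads and parses the real robots.txt file. Passing this check does not replace reading the site's terms of service.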