Python Web Scraper (BeautifulSoup)

This repository contains the source code for a web scraping tool built with the Python BeautifulSoup library. The scripts were used to scrape text and video data from static and dynamic pages of The Open Video Project website.
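
As a rough illustration of the static case, a minimal sketch of parsing page HTML with BeautifulSoup and turning it into CSV rows might look like the following. The sample markup, the `extract_rows` and `rows_to_csv` helpers, and the column names are assumptions made for illustration; they are not taken from `scrape.py`.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup; the real pages in webpages/ will differ.
SAMPLE_HTML = """
<html><body>
  <div class="video"><h2>First clip</h2><p class="desc">A short film.</p></div>
  <div class="video"><h2>Second clip</h2><p class="desc">A lecture.</p></div>
</body></html>
"""


def extract_rows(html):
    """Return one dict per <div class="video"> block (assumed layout)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="video"):
        title = block.find("h2")
        desc = block.find("p", class_="desc")
        rows.append({
            "title": title.get_text(strip=True) if title else "",
            "description": desc.get_text(strip=True) if desc else "",
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "description"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)))
```

The built-in `html.parser` backend is used here so the sketch needs no parser beyond BeautifulSoup itself.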

Directory Structure

The important files and directories in the repository:

├── scrape.py : static web scraper
├── video_scraper.py : dynamic web scraper to collect video URLs
├── video_downloader.py : download video files using web URLs
├── webpages : static web pages
│   └── example.html
├── datasets
│   └── example.csv
└── video_dataset
    ├── video_dataset.csv : downloaded video list
    └── example.mp4
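
The download step listed above can be sketched with the standard library alone. This is a minimal sketch, not the code in `video_downloader.py`: the `download_all` and `filename_from_url` helpers are hypothetical, and the CSV is assumed to have a column named `url`, which the repository does not confirm.

```python
import csv
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    name = os.path.basename(urlparse(url).path)
    return name or "download.bin"


def download_all(csv_path, out_dir, url_column="url"):
    """Download every URL listed in the CSV into out_dir; return saved paths.

    The column name "url" is an assumption about the CSV layout.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            url = row[url_column]
            target = os.path.join(out_dir, filename_from_url(url))
            # Stream to disk so large videos are not held in memory.
            with urllib.request.urlopen(url) as response, open(target, "wb") as out:
                shutil.copyfileobj(response, out)
            saved.append(target)
    return saved
```

A call such as `download_all("video_dataset/video_dataset.csv", "video_dataset")` would then save each listed file next to the CSV, provided the column assumption holds.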