heo-api-parser
is a small repository that aims to provide a way to extract and process the heo.com catalog. The project contains a single spider that goes through the entire website and downloads all of the items found.
git clone https://github.com/MartiONE/heo-api-parser.git
with virtualenv:
virtualenv venv
source venv/bin/activate
pip install -r requirements
scrapy crawl heo -o output.json
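Once the crawl finishes, the exported file can be inspected from Python. The following is a minimal sketch, assuming the crawl was run with `-o output.json` into a file that did not exist before, so that it holds a single JSON array of items. The helper names `load_items` and `field_names` are hypothetical and not part of this repository, and the item fields are discovered from the data rather than assumed.

```python
import json
from pathlib import Path


def load_items(path="output.json"):
    """Load the items written by `scrapy crawl heo -o output.json`."""
    # Assumption: the file holds one JSON array of item objects.
    with Path(path).open(encoding="utf-8") as handle:
        items = json.load(handle)
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array in {path}")
    return items


def field_names(items):
    """Return the sorted set of keys seen across all items."""
    # The item schema is not documented here, so read it off the data.
    return sorted({key for item in items for key in item})
```

Calling `field_names(load_items())` is a quick way to see which fields the spider actually exported before writing any further processing.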
To crawl as an authenticated user on the heo platform, you must provide two environment variables, which the Scrapy spider reads and uses to log in. Those are mail
and password
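As an illustration of how such variables can be read, here is a minimal sketch. It assumes the variables are named exactly `mail` and `password`, as stated above. The helper `get_credentials` is hypothetical and is not necessarily how the spider itself does it.

```python
import os


def get_credentials(environ=os.environ):
    """Return (mail, password) from the environment, or None if either is unset."""
    # Assumption: the variable names are lowercase `mail` and `password`.
    mail = environ.get("mail")
    password = environ.get("password")
    if not mail or not password:
        # Missing credentials: the caller can fall back to an anonymous crawl.
        return None
    return mail, password
```

Returning `None` instead of raising lets an anonymous crawl proceed when no credentials are set. Both variables must be exported in the shell before running `scrapy crawl heo`.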
The project is under active development, so any PR is welcome. I have listed a few TODO items, but feel free to add or modify any.
- support database connection
- add section division via cli
- add support for custom sections
- add login settings for pricing and volume purposes
- include FR and DE to the data schema
- image download and configuration