Introduction

heo-api-parser is a small repository that provides a way to extract and process the heo.com catalog. The project contains a single spider that goes through the entire website and downloads all of the items found.

How to use it

Clone the repo:

git clone https://github.com/MartiONE/heo-api-parser.git

Install the required dependencies with virtualenv:

virtualenv venv
source venv/bin/activate
pip install -r requirements

Run the spider

scrapy crawl heo -o output.json
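
Once the crawl finishes, the exported file can be inspected from Python. The following is a minimal sketch, assuming the default Scrapy JSON feed format (a single JSON array of items); `load_items` is a hypothetical helper and is not part of this repository, and the item fields depend on what the spider actually emits.

```python
import json
from pathlib import Path


def load_items(path="output.json"):
    """Load the items exported by `scrapy crawl heo -o output.json`.

    Assumes the file holds one JSON array, which is what Scrapy's
    JSON feed exporter writes for a single run.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `load_items()` returns a list of dictionaries, one per scraped item. Note that `-o` appends to an existing file, so it is safer to delete a stale output.json before running the spider again.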

Authentication

To crawl as an authenticated user on the heo platform, you must provide two environment variables to the application. The scrapy spider reads them and uses them to log in. The variables are mail and password.
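
As an illustration of how the two variables could be picked up, here is a minimal sketch, assuming they are read with `os.environ` under the exact names mail and password. The helper `read_heo_credentials` is hypothetical; the README does not show how the spider actually reads them.

```python
import os


def read_heo_credentials(environ=os.environ):
    """Return (mail, password) from the environment, or None if either is unset.

    The variable names come from the README; everything else here is
    an assumption made for illustration.
    """
    mail = environ.get("mail")
    password = environ.get("password")
    if not mail or not password:
        return None  # no credentials available: crawl without logging in
    return mail, password
```

Passing the mapping as a parameter keeps the helper easy to test, for example `read_heo_credentials({"mail": "user@example.com", "password": "secret"})`.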

Development

The project is under active development, so any PR is welcome. I have listed a few TODO items below, but feel free to add or modify any.

  • support database connection
  • add section division via cli
  • add support for custom sections
  • add login settings for pricing and volume purposes
  • include FR and DE to the data schema
  • image download and configuration