Wenshu spider

TODO

    Add IP Proxies
    Validation Code
    Control flow refactor

Requirements

pip install -r requirements.txt

You also need to install Node.js and make sure the `node` command is on your PATH.
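A quick way to confirm the Node.js requirement before starting the spider is to look the command up on PATH. The following is a minimal sketch using only the Python standard library; the helper names `find_node` and `require_node` are hypothetical and not part of this repository.

```python
import shutil


def find_node(command="node"):
    """Return the full path of the given executable, or None if it is not on PATH."""
    return shutil.which(command)


def require_node(command="node"):
    """Return the executable path, or raise a clear error if it is missing."""
    path = find_node(command)
    if path is None:
        raise RuntimeError(
            f"'{command}' was not found on PATH; install Node.js first"
        )
    return path


if __name__ == "__main__":
    print("Using Node.js at:", require_node())
```

Running this once before `python runner.py` fails early with a readable message instead of an error deep inside the crawl.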

If the spider encounters a validation code (CAPTCHA), you need to install tesseract and tesseract-ocr plus some Python wrappers. Note that this functionality is still on the to-do list.

Pre Run

Before running the script, adjust the settings in judgement_spider/settings.py.

Run

python runner.py

Deployment

Deployment via Docker is recommended. Specify three host folders and mount them to /home/stack/judgement_logs, /home/stack/judgement_docs, and /home/stack/judgement_mongo_db, respectively.

Also map host port 27017 to container port 27017.
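The mounts and port mapping described above can be sketched as a small Python helper that assembles the corresponding `docker run` arguments. This is only an illustration: the image name `wenshu-spider` and the host paths under `/data` are assumptions, not values defined by this repository, and the helper name `docker_run_args` is hypothetical.

```python
import shlex

# Container-side mount points, in the order given in this README.
CONTAINER_DIRS = (
    "/home/stack/judgement_logs",
    "/home/stack/judgement_docs",
    "/home/stack/judgement_mongo_db",
)


def docker_run_args(host_dirs, image, port=27017):
    """Build a `docker run` argument list with three bind mounts and one port mapping."""
    if len(host_dirs) != len(CONTAINER_DIRS):
        raise ValueError("expected exactly 3 host folders")
    # -d runs detached; -p maps host port to the same container port.
    args = ["docker", "run", "-d", "-p", f"{port}:{port}"]
    for host, container in zip(host_dirs, CONTAINER_DIRS):
        # -v host_path:container_path bind-mounts a host folder.
        args += ["-v", f"{host}:{container}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Hypothetical host folders and image name; replace with your own.
    host_dirs = ["/data/logs", "/data/docs", "/data/mongo_db"]
    print(shlex.join(docker_run_args(host_dirs, "wenshu-spider")))
```

Printing the command rather than executing it makes it easy to review the mounts before starting the container.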

Problems

    502 and 503 responses
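One common way to cope with intermittent 502 and 503 responses is to retry with an increasing delay, which also keeps the load on the website low. The sketch below uses only the standard library and is not how this spider currently handles the problem; the function names, retry count, and delays are all assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request

# Status codes treated as temporary and worth retrying.
RETRY_STATUSES = {502, 503}


def backoff_delays(retries, base=1.0, factor=2.0):
    """Return the wait time in seconds before each retry (exponential backoff)."""
    return [base * factor ** i for i in range(retries)]


def fetch_with_retry(url, retries=3, base=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, retrying on 502/503 with growing pauses between attempts."""
    delays = backoff_delays(retries, base)
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on other status codes or once the retries are used up.
            if err.code not in RETRY_STATUSES or attempt == retries:
                raise
            sleep(delays[attempt])
```

The `opener` and `sleep` parameters exist only so the function can be exercised without real network traffic or real waiting.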

Note

Please be friendly to the website!!!