Helpers and examples for building Scrapy crawlers in a test-driven way.
- The develop-test cycle drops to a few seconds, so you get a properly working scraper up much faster
- When bugs are discovered "in the wild" with real data, new example files, a test, and a fix can be created and verified much faster
- It allows fast refactoring without breaking anything, which results in much cleaner scraper code
- It just feels right when you are used to doing TDD
Scrapy has its own built-in testing feature named Spiders Contracts.
I tried to use them for some time, but decided to write real unit tests in a testing framework like py.test because of these shortcomings:
- its philosophy is geared towards testing against contracts (hence the name), which by nature are broad rather than specific. Testing for exact field contents in items is possible, but difficult and fragile
- its documentation and basic feature set are a bit thin
- it mixes implementation code with contract descriptions, which is only workable when contracts are few and simple (see the sketch after this list)
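
For context, a Spiders Contract lives in the callback's docstring, right next to the parsing code. A minimal example in the style of the Scrapy docs (spider name, URL, and field names are placeholders):

```python
import scrapy

class ExampleSpider(scrapy.Spider):
    name = "example"

    def parse(self, response):
        """Parse a product listing page.

        @url http://www.example.com/products
        @returns items 1 25
        @returns requests 0 0
        @scrapes title price
        """
```

Running `scrapy check example` asserts that the callback yields between 1 and 25 items, no further requests, and that the scraped items have `title` and `price` set.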
```
pip install scrapy_tdd
```
With the helpers installed, a spider test can look like this (the nested `describe_`/`should_` style comes from the pytest-describe plugin; `MySpider` stands in for your own spider):

```python
from scrapy.utils.test import get_crawler
from scrapy_tdd import response_from  # helper from this package

from my_project.spiders import MySpider  # your spider under test

def describe_fancy_spider():

    to_test = MySpider.from_crawler(get_crawler())

    def describe_parse_suggested_terms():

        resp = response_from("Result_JSON_Widget.txt")
        results = list(to_test.parse(resp))  # parse() returns a generator

        def should_get_item():
            item = results[0]
            assert item["lorem"] == "ipsum"
            assert item["iterem"] == "ipsem"
```
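
For reference, a helper like `response_from` can be approximated in a few lines of plain Scrapy. This is a minimal sketch, assuming example files are saved in a `responses/` directory next to the tests (the directory layout and default URL are assumptions, not necessarily this package's exact behaviour):

```python
import os

from scrapy.http import HtmlResponse

def response_from(file_name, url="http://test.example.com"):
    """Wrap a saved example file in an HtmlResponse for offline tests."""
    path = os.path.join(os.path.dirname(__file__), "responses", file_name)
    with open(path, "rb") as f:
        body = f.read()
    return HtmlResponse(url=url, body=body, encoding="utf-8")
```

Keeping real responses on disk like this is what makes the develop-test cycle fast: no network round trips, and a bug found in the wild just becomes another saved file plus a failing test.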
... coming soon ...
- Mocking Request-Response pairs
... coming soon ...
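
Until then, the general technique in plain Scrapy looks like this (a sketch, not this package's forthcoming API): build the `Request` yourself and attach it to the response, so spider code that reads `response.request` or `response.meta` behaves as it would in a real crawl.

```python
from scrapy.http import HtmlResponse, Request

def fake_response_pair(url, body, meta=None):
    """Build an HtmlResponse bound to a Request so that response.meta
    and response.request work exactly as they do during a crawl."""
    request = Request(url=url, meta=meta or {})
    return HtmlResponse(url=url, request=request, body=body, encoding="utf-8")

# response.meta is proxied from the attached request:
resp = fake_response_pair("http://test.example.com", b"<html></html>", meta={"page": 1})
assert resp.meta["page"] == 1
```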