Skip to content

davechallis/archive-twitter-search

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

Copyright 2010 Dave Challis <[email protected]>

1 Summary
=========

archive_twitter_search.py is a short script which performs a twitter search
using twitter's search API, and saves each tweet (including all metadata) to a
SQLite database.

It's designed to require almost no configuration (apart from specifying a
search term!).

It *only* works using twitter's search API, i.e. it will not get/archive an
individual's statuses (getting these uses twitter's REST API, which is
tightly rate limited).


2 Getting Started
=================

2.1 Archiving tweets from twitter search terms
==============================================

1. Edit 'archive_twitter_search.py' and change the 'query' variable
2. Run the script!

Assuming all goes well, you should see a message about the number of tweets
added, and a SQLite database file named 'tweets.db' will have been created.

Ideally you should add a cron job to run the script every N minutes (where N
depends on how often new tweets for your search term(s) are likely to turn
up).

Only tweets added since your last search will be searched for, so queries
after the inital one should be a lot faster.


2.2 Viewing saved tweets
========================

1. Open the database in SQLite:
    sqlite3 tweets.db

2. View the database schema by entering:
    .schema

3. Get any/all tweets using an SQL query, e.g.:
    SELECT * FROM tweets LIMIT 5;

Pretty much every language has SQLite bindings these days, so you should be
able to open and access the archived tweets from there.


3 System Requirements
=====================

Software:
* python 2.6+ (hasn't been tested with earlier releases, but might work)
* sqlite3

Non-standard Python libraries:
* dateutil
* sqlalchemy


4 About twitter's search API
============================

Full details on the twitter search API at:
    http://apiwiki.twitter.com/Twitter-API-Documentation

Twitter's search API allows the last 1500 results for a search term to be
queried for.  In order to capture *all* tweets for a given search term, you'll
need to experiment with running it often enough that you get <1500 new tweets
each time (this may not be possible for trending items!).

While the search API is rate limited, limits for the search API are apparently
much more generous than the 150 requests per hour imposed on the REST API.

Rate limiting details:
    http://apiwiki.twitter.com/Rate-limiting

About

Short script to archive twitter searches

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages