Simple Python script for storing tweets from the Twitter stream directly in a MongoDB database, based on a list of terms or users.
The script runs forever and refreshes the terms list periodically, so the terms list can be modified while the script runs.
A collection is created for each term in the MongoDB database.
Improvements appreciated.
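As an illustration of the periodic refresh described above, here is a minimal sketch that re-reads the terms file only when its modification time changes. The `TermsFile` class and its `refresh` method are hypothetical names chosen for this example; the script itself may implement the refresh differently.

```python
import os


class TermsFile:
    """Hypothetical helper: re-read a terms file only when it has changed."""

    def __init__(self, path):
        self.path = path
        self.mtime = None  # modification time seen on the last read
        self.terms = []

    def refresh(self):
        """Reload the file if its mtime changed; return True if the terms differ."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self.mtime:
            return False
        self.mtime = mtime
        with open(self.path, encoding="utf-8") as handle:
            # One term per line; blank lines are ignored.
            new_terms = [line.strip() for line in handle if line.strip()]
        changed = new_terms != self.terms
        self.terms = new_terms
        return changed
```

A long-running loop could call `refresh()` every few seconds and reconnect the stream with the new terms whenever it returns `True`.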
git clone git://github.com/gdelfresno/twitterstream-to-mongodb.git
cd twitterstream-to-mongodb/src
python twitterstreamtomongodb.py --oauth=oauth-example.json --server=localhost --port=23717 --database=TwitterStream --dbauth=dbauth.json --track=terms-example.txt --retweets=False
:arg oauth: JSON file containing OAuth credentials for a Twitter developer account
:arg server: MongoDB host; the default is localhost for basic/local MongoDB instances
:arg port: optional port of the MongoDB instance
:arg database: the name you would like the database to have
:arg dbauth: JSON file with database credentials
:arg track: plain text file listing search terms such as #trending or @user_name (one entry per line)
:arg follow: list of users to stream (without @)
:arg retweets: specify whether or not retweets are collected and stored in the database
--track and --follow can't be used at the same time.
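The options above can be sketched with `argparse`. This is an illustrative reconstruction and not the script's actual parser: the function names, the default port of 27017 (MongoDB's standard default), and the default of `True` for `--retweets` are assumptions. It shows two points: a mutually exclusive group for `--track` and `--follow`, and explicit parsing of `--retweets=False`, since `bool("False")` evaluates to `True` in Python.

```python
import argparse


def str_to_bool(value):
    """Parse 'True'/'False' style strings into a real boolean."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Store tweets in MongoDB")
    parser.add_argument("--oauth")
    parser.add_argument("--server", default="localhost")
    parser.add_argument("--port", type=int, default=27017)  # assumed default
    parser.add_argument("--database")
    parser.add_argument("--dbauth")
    parser.add_argument("--retweets", type=str_to_bool, default=True)  # assumed default
    # --track and --follow cannot be combined; exactly one is required here.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--track")
    source.add_argument("--follow")
    return parser
```

With this sketch, passing both `--track` and `--follow` makes `argparse` exit with a usage error before any connection is attempted.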
Database Authentication:
{
"user" : "your_user",
"password" : "your_password"
}
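One way to use such a credentials file is to build a standard MongoDB connection URI from it. The sketch below is an assumption about how the pieces could fit together, not a description of what the script does; `mongo_uri` is a hypothetical helper. The user name and password are percent-escaped so that characters such as `@` or `/` do not break the URI.

```python
import json
from urllib.parse import quote_plus


def mongo_uri(server, port, database, dbauth_path=None):
    """Build a mongodb:// URI, adding credentials when a dbauth file is given."""
    if dbauth_path is None:
        return "mongodb://%s:%d/%s" % (server, port, database)
    with open(dbauth_path, encoding="utf-8") as handle:
        creds = json.load(handle)
    # Keys follow the credentials file format: "user" and "password".
    user = quote_plus(creds["user"])
    password = quote_plus(creds["password"])
    return "mongodb://%s:%s@%s:%d/%s" % (user, password, server, port, database)
```

The resulting string can be passed to `pymongo.MongoClient`, which accepts a connection URI as its first argument.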
OAuth Authentication:
{
"consumer_key" : "ThIsIsJuStAnExAmPlE",
"consumer_secret" : "ThIsIsJuStAnExAmPlE",
"access_token" : "ThIsIsJuStAnExAmPlE",
"access_token_secret" : "ThIsIsJuStAnExAmPlE"
}
Basic Authentication:
{
"username" : "twitter_username",
"password" : "password"
}
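Since the Twitter credentials file can take either of the two shapes above, a loader can detect which one it was given by checking the keys. This is a minimal sketch; `load_twitter_credentials` and the returned `"oauth"` / `"basic"` labels are hypothetical and only the key names come from the examples.

```python
import json

OAUTH_KEYS = {"consumer_key", "consumer_secret", "access_token", "access_token_secret"}
BASIC_KEYS = {"username", "password"}


def load_twitter_credentials(path):
    """Return ("oauth", data) or ("basic", data) depending on the keys present."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    keys = set(data)
    if OAUTH_KEYS <= keys:
        return "oauth", data
    if BASIC_KEYS <= keys:
        return "basic", data
    # Neither shape matched: report which OAuth keys are absent.
    missing = sorted(OAUTH_KEYS - keys)
    raise ValueError("credentials file is missing keys: " + ", ".join(missing))
```

Failing early with the list of missing keys is easier to diagnose than an authentication error from the stream later on.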
SomeWord
@user_name
#hashtag
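The three lines above show the kinds of entries a terms file can contain: a plain word, a user mention, and a hashtag. As a rough sketch, the entries could be grouped by their prefix as follows. `classify_terms` is a hypothetical helper, and stripping the `@` and `#` prefixes is an assumption made for the example; the script may pass the terms to the stream unchanged.

```python
def classify_terms(lines):
    """Group terms-file entries into plain words, user mentions, and hashtags."""
    groups = {"words": [], "users": [], "hashtags": []}
    for raw in lines:
        term = raw.strip()
        if not term:
            continue  # skip blank lines
        if term.startswith("@"):
            groups["users"].append(term[1:])
        elif term.startswith("#"):
            groups["hashtags"].append(term[1:])
        else:
            groups["words"].append(term)
    return groups
```

For example, `classify_terms(["SomeWord", "@user_name", "#hashtag"])` puts one entry in each group.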
pip install -r requirements.txt
https://github.com/mongodb/mongo-python-driver
pip install pymongo
git clone git://github.com/mongodb/mongo-python-driver.git pymongo
cd pymongo/
python setup.py install
https://github.com/tweepy/tweepy
pip install tweepy
Twitter Stream To MongoDB (c) by gdelfresno
Twitter Stream To MongoDB is licensed under
the terms of the GNU General Public License
as published by the Free Software Foundation.