This repository has been archived by the owner on Nov 10, 2024. It is now read-only.

compatibility with twitter premium API: make tweets per requests dynamic? #317

Closed
TomasZwinkels opened this issue Mar 4, 2019 · 11 comments

Comments

@TomasZwinkels

TomasZwinkels commented Mar 4, 2019

Wish / problem
For a new research project at the University of Basel we would like to use the rtweet package. We need the historical search functionality that comes with the new premium Twitter API. We also need the premium API because of the volume of tweets we need to download.

When investigating the compatibility of rtweet with the additional functionality offered as part of the Twitter premium API, I ran into the following issue (I think):

Sub-issue one: the limit of 200 tweets per request is currently hard-coded

expected issue
The limit of 200 tweets per request is hard-coded into the package at a couple of points. To use the premium API, it should be possible, for example, to make a 'get_timeline' request where the function 'get_timeline_call' divides the call into blocks of either 100 or 500 tweets; see https://developer.twitter.com/en/pricing/search-fullarchive. I am not sure, but maybe the API has an option that returns the maximum tweets per request for the current user (I have looked for this but have not managed to find it).

A user with a paid premium API will burn through her requests unnecessarily fast, because tweets are downloaded 200 per request while this could be 500.
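To make the cost concrete, here is a rough sketch of the arithmetic, assuming every request returns a fixed maximum number of tweets. The helper `requests_needed` is made up for illustration and is not part of rtweet; 100 and 500 are the per-request sizes mentioned above for the premium tiers, and 200 is the value reported as hard-coded.

```r
# Hypothetical helper (not part of rtweet): number of API requests needed
# to fetch n tweets when each request returns at most per_request tweets.
requests_needed <- function(n, per_request) {
  stopifnot(n >= 0, per_request > 0)
  ceiling(n / per_request)
}

n <- 10000
# Labels are illustrative only.
page_sizes <- c(sandbox = 100, hard_coded = 200, premium = 500)

# Requests spent on the same n tweets for each page size.
sapply(page_sizes, requests_needed, n = n)
```

Under these assumptions, fetching the same number of tweets at 200 per request spends two and a half times as many requests as at 500 per request.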

Desired behavior
Ideally: rtweet detects what the tweets-per-request limit is and adjusts its division of requests accordingly.
Other option: allow for a user-wide setting (an optional argument for the 'create_token' function?) that lets the user specify how many tweets per request she has.
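The second option could look roughly like the sketch below. It uses a global R option rather than a token argument, purely to keep the example self-contained. The option name `rtweet.tweets_per_request` and the helper `chunk_sizes` are hypothetical; rtweet does not define either of them.

```r
# Hypothetical option name; rtweet does not define this option.
options(rtweet.tweets_per_request = 500)

# Split a total of n tweets into per-request chunk sizes. The page size is
# read from the option and falls back to 200, the value this issue reports
# as hard-coded.
chunk_sizes <- function(n,
                        per_request = getOption("rtweet.tweets_per_request", 200)) {
  stopifnot(n >= 0, per_request > 0)
  full <- n %/% per_request   # number of completely filled requests
  rest <- n %% per_request    # tweets left over for a final, smaller request
  c(rep(per_request, full), if (rest > 0) rest)
}

chunk_sizes(1200)                     # uses the option set above
chunk_sizes(1200, per_request = 200)  # explicit override
```

Reading the value at call time means a user could change the setting mid-session without recreating anything.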

Reproduce the problem
I have not yet requested access to the premium API (budget considerations). Still, I think that when a get_timeline request for more than 100 tweets is run on a premium API sandbox, it won't run, because the limit of tweets per request is then exceeded.

Context
user / twitter consumer has premium API sandbox

Code

library(rtweet)
get_timeline("realDonaldTrump", n = 100)  # probably works
get_timeline("realDonaldTrump", n = 101)  # probably does not work

related issues (in both cases premium API access was not (yet) an option)
https://github.com/mkearney/rtweet/issues/228
https://github.com/mkearney/rtweet/issues/278

There might be other issues regarding compatibility with the premium Twitter API that I am not aware of.

@NataliaUC

Hi!
Have you been able to test this with the premium API already? I am also thinking about upgrading my API and I'm worried that due to rtweet's settings I won't get the desired results.

@TomasZwinkels
Author

Hi there! We have unfortunately not managed to make any progress on this ourselves yet. I have tried a bit but am running into my own limited experience with both the Twitter API and R package editing.

@NataliaUC

Hi! thanks for your response!
I learned over the past few days that it is possible to iterate between multiple REST APIs with the rtweet package, which would reduce the waiting time and wouldn't involve any costs. However, I am still trying to figure out how to get the APIs to iterate.
Maybe this helps: https://github.com/mkearney/rtweet/issues/51

@sentenza3000

sentenza3000 commented Aug 1, 2019

I am using the premium API and it is as described above by @TomasZwinkels: rtweet unnecessarily "burns" requests. When using the search_fullarchive() function, each single request with n=500 counts against my paid total as if it were 5 requests, although the tweet limit per request is 500 in the paid version. Thus, when using the existing function in rtweet, purchased requests are used up far too quickly. Unfortunately, I do not have the skill to fix this myself. I am not sure which part of the code is causing search_fullarchive() to behave like this. I think it has to do with what is called "Pagination" within the Twitter API (LINK).
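The effect described here can be reproduced offline with a small simulation of paginated fetching. This is only a sketch under stated assumptions: `fake_fetch_page` is a made-up stand-in for one API call and `collect` is a made-up loop. Neither is rtweet code, and no network access is involved.

```r
# Hypothetical stand-in for one API call: returns up to max_results tweet
# ids starting at position `start`, plus the start of the next page (NULL
# when the simulated archive is exhausted).
fake_fetch_page <- function(start, max_results, total_available = 5000) {
  end <- min(start + max_results - 1, total_available)
  list(
    ids = if (start <= total_available) seq(start, end) else integer(0),
    next_start = if (end < total_available) end + 1 else NULL
  )
}

# Collect n tweets page by page and count how many requests were spent.
collect <- function(n, max_results) {
  ids <- integer(0)
  start <- 1
  requests <- 0
  while (length(ids) < n && !is.null(start)) {
    # Never ask for more than is still missing.
    page <- fake_fetch_page(start, min(max_results, n - length(ids)))
    requests <- requests + 1
    ids <- c(ids, page$ids)
    start <- page$next_start
  }
  list(n_tweets = length(ids), requests = requests)
}

collect(500, max_results = 100)$requests  # 5 requests
collect(500, max_results = 500)$requests  # 1 request
```

If the page size sent with each call is capped at 100, a request for 500 tweets is paginated into five billable calls, which matches the behavior reported above.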

@sentenza3000

@kevintaylor solved the issue here:
#368 (comment)

It is not implemented in the rtweet package so far. Perhaps @mkearney as author would want to update it. Unfortunately this is beyond my skill level.

@kevintaylor
Contributor

The fork with the fix is here, if someone needs to use the fix and doesn't want to clone and edit the search_tweets.R code themselves:

https://github.com/kevintaylor/rtweet

@jeremy-allen

> The fork with the fix is here, if someone needs to use the fix and doesn't want to clone and edit the search_tweets.R code themselves:
>
> https://github.com/kevintaylor/rtweet

@kevintaylor If I install your fork how do I distinguish your rtweet package from Mike's when using library()? And thanks for the fix!

@kevintaylor
Contributor

When you install it from your local machine you'll see both packages and should choose which one you want to use. This was a one-line change so hopefully it can get pulled into the main branch.

@jeremy-allen

> When you install it from your local machine you'll see both packages and should choose which one you want to use. This was a one-line change so hopefully it can get pulled into the main branch.

@kevintaylor Installing your fork was successful. Now when I list all files in my library there is only one rtweet folder, so I assume yours overwrote Mike's. And when I cat the DESCRIPTION file in that folder it has your name in it. So I guess the original install of Mike's is gone. Was there a way I could have installed yours without overwriting his?
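One general way to keep two copies of a package with the same name side by side is to install one of them into a separate library directory and load it with `lib.loc`. The sketch below assumes the remotes and withr packages are installed. The function name `install_fork_separately` and the directory names are arbitrary choices for illustration, and this has not been tested against the fork itself.

```r
# Sketch: install a GitHub fork into its own library directory so that it
# does not overwrite the copy of the package in the default library.
install_fork_separately <- function(repo = "kevintaylor/rtweet",
                                    lib_dir = file.path(tempdir(), "rtweet-fork-lib")) {
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  # Temporarily put lib_dir first on the library search path, so the
  # install lands there instead of in the default library.
  withr::with_libpaths(lib_dir, remotes::install_github(repo), action = "prefix")
  invisible(lib_dir)
}

# Usage (needs network access):
# fork_lib <- install_fork_separately(lib_dir = "~/R/rtweet-fork")
# library(rtweet, lib.loc = fork_lib)  # loads the fork
# In a fresh session, plain library(rtweet) loads the original again.
```

Only one of the two copies can be attached in a given R session, so switching between them means restarting R.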

@HohnerJulian

Hi all,

first of all, thank you Kevin for your fix!

I just tried your fork and, at least for me, it seems to have changed the number of tweets per request from 50 to 200: I did a test scrape with n=500 and it sent 3 requests to the API. Compared to the 50 tweets per request beforehand, this is far better. But since the package I bought would allow me to scrape 500 per request, I don't get the amount of data I could. Any idea why?
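As a sanity check on that inference, the per-request sizes consistent with an observed request count can be enumerated. This sketch assumes the number of requests is simply the total divided by the page size, rounded up; `consistent_page_sizes` is a made-up helper, not part of rtweet.

```r
# Which per-request sizes p would turn a request for n tweets into exactly
# k API requests, assuming requests = ceiling(n / p)?
consistent_page_sizes <- function(n, k) {
  p <- seq_len(n)
  p[ceiling(n / p) == k]
}

range(consistent_page_sizes(n = 500, k = 3))  # 167 249
200 %in% consistent_page_sizes(500, 3)        # TRUE
500 %in% consistent_page_sizes(500, 3)        # FALSE
```

Under that assumption, 3 requests for n=500 fits a page size of 200 but not 500, which agrees with the observation that the full 500 per request is not being used.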

To explain what I did: I deleted my current version of rtweet, installed your fork via remotes::install_github("kevintaylor/rtweet") and ran library(rtweet). Also, I'm searching for a specific hashtag, so there's no specific user_timeline that would limit my tweets per request.

Thank you in advance!! :)

Julian

@llrs
Collaborator

llrs commented Feb 15, 2021

I am closing this, hoping to merge #375 soon. Perhaps the API rate limit changed?
