This repository has been archived by the owner on Nov 10, 2024. It is now read-only.

Support for Academic Product Track? #468

Closed
nsingh23 opened this issue Jan 30, 2021 · 15 comments

Comments

@nsingh23

Hi,

I'm just wondering if there are any plans to support the "Academic Product Track" that Twitter has recently released. I've been trying to use the current rtweet version, but it throws an error. Note that the Academic Track does not have a dev environment, so I've set mine to NULL.

https://dev.to/twitterdev/getting-historical-tweets-using-the-full-archive-search-endpoint-1agp

"v2 full-archive search endpoint" - [image]

@llrs
Collaborator

llrs commented Jan 30, 2021

There has been a long hiatus in the maintenance of the package, but I hope it can support API v2 in the near term. A PR would speed up the process.

@schochastics

I have written a function that supports the "Academic Product Track" (gist). It is not very opinionated but does the job for now.

@nsingh23
Author

nsingh23 commented Feb 5, 2021

Thank you so much, David! The pagination is a lifesaver!

Much appreciated!

@nsingh23 nsingh23 closed this as completed Feb 5, 2021
@JBGruber

JBGruber commented Feb 6, 2021

Thanks @schochastics, this is a fantastic start and I can confirm it works for me!

source("https://gist.githubusercontent.com/schochastics/1ff42c0211916d73fc98ba8ad0dcb261/raw/040422b5e1378ef4c30150d4927a3991f53bc922/get_tweets.R")

df <- get_tweets(
  q = "#rstats", 
  n = 10, 
  start_time = "2010-01-01T00:00:00Z", 
  end_time = "2010-02-01T00:00:00Z",
  token = "redacted"
)

tibble::as_tibble(df$data)
#> # A tibble: 10 x 12
#>    entities$mentio… $annotations $hashtags conversation_id possibly_sensit…
#>    <list>           <list>       <list>    <chr>           <lgl>           
#>  1 <df[,3] [1 × 3]> <df[,5] [2 … <df[,3] … 8467623082      FALSE           
#>  2 <NULL>           <df[,5] [2 … <df[,3] … 8467441206      FALSE           
#>  3 <df[,3] [1 × 3]> <NULL>       <df[,3] … 8462375244      FALSE           
#>  4 <df[,3] [1 × 3]> <NULL>       <df[,3] … 8453279722      FALSE           
#>  5 <df[,3] [1 × 3]> <NULL>       <df[,3] … 8431308826      FALSE           
#>  6 <NULL>           <NULL>       <df[,3] … 8444140683      FALSE           
#>  7 <df[,3] [1 × 3]> <NULL>       <df[,3] … 8434524585      FALSE           
#>  8 <NULL>           <df[,5] [1 … <df[,3] … 8434205736      FALSE           
#>  9 <NULL>           <NULL>       <df[,3] … 8431308826      FALSE           
#> 10 <NULL>           <df[,5] [1 … <df[,3] … 8419458557      FALSE           
#> # … with 12 more variables: public_metrics$retweet_count <int>,
#> #   $reply_count <int>, $like_count <int>, $quote_count <int>, source <chr>,
#> #   text <chr>, lang <chr>, author_id <chr>, id <chr>, created_at <chr>,
#> #   in_reply_to_user_id <chr>, referenced_tweets <list>

Created on 2021-02-06 by the reprex package (v1.0.0)

However, @nsingh23, I don't think you should close this issue just yet. It would be nice to get proper support in rtweet.

@schochastics

Agreed, this issue should stay open, also for visibility of the workaround.

@nsingh23 nsingh23 reopened this Feb 6, 2021
@nsingh23
Author

nsingh23 commented Feb 6, 2021

Reopened for long-term support of the Academic Track and v2 in rtweet.

@llrs
Collaborator

llrs commented Feb 18, 2021

Seems like there's another improvement based on the gist here, thanks @cjbarrie

@patrickturlough

Hi, I'm new to R and am currently doing a research project for which I need Twitter data. I already have access to the Academic Product Track but am struggling with the code provided by @schochastics. Could somebody provide an example with the values filled in? It would be really appreciated!

@JBGruber

I thought that's what I did in #468 (comment). But I'm happy to give further guidance if you tell me what you need.

Also check out the small package by @cjbarrie here. It comes with examples.

@patrickturlough

patrickturlough commented Feb 24, 2021

I am sorry, I completely misunderstood the whole thing. The function works perfectly fine!
What I don't quite understand is how the pagination works, where it is inserted, and therefore what value of n I have to choose in order to get more than 500 tweets.

@JBGruber

JBGruber commented Feb 25, 2021

The easiest way at the moment is:

# Install the package from GitHub only if it is not available yet
if (!requireNamespace("twitterv2r", quietly = TRUE)) {
  remotes::install_github("cjbarrie/twitterv2r")
}

tweets <- twitterv2r::get_hashtag_tweets(
  q = "#rstats", 
  start_tweets = "2010-01-01T00:00:00Z", 
  end_tweets = "2010-02-01T00:00:00Z",
  data_path = "data/",
  bearer_token = "redacted"
)

The function loops through pages until nothing new is retrieved. So you don't need n. Check out the source here.
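To illustrate the looping idea, here is a minimal sketch in base R. It is not the package's actual source: `fetch_page()`, `collect_pages()`, and the fake pages are hypothetical stand-ins so the loop can run without a token. It assumes each response carries the tweets in `data` and, while more pages exist, a `meta$next_token` that is passed to the next request.

```r
# Hypothetical stand-in for the API: three fake pages. The last page has
# no next_token, which is how the loop knows it has reached the end.
fake_pages <- list(
  list(data = data.frame(id = c("1", "2"), stringsAsFactors = FALSE),
       meta = list(next_token = "a")),
  list(data = data.frame(id = c("3", "4"), stringsAsFactors = FALSE),
       meta = list(next_token = "b")),
  list(data = data.frame(id = "5", stringsAsFactors = FALSE),
       meta = list(next_token = NULL))
)

# Hypothetical single request: returns the page that follows `next_token`
# (or the first page when no token is given).
fetch_page <- function(next_token = NULL) {
  index <- if (is.null(next_token)) 1L else match(next_token, c("a", "b")) + 1L
  fake_pages[[index]]
}

# Keep requesting pages until a response comes back without a next_token,
# then row-bind everything that was collected.
collect_pages <- function(fetch, max_pages = Inf) {
  pages <- list()
  next_token <- NULL
  repeat {
    page <- fetch(next_token)
    pages[[length(pages) + 1L]] <- page$data
    next_token <- page$meta$next_token
    if (is.null(next_token) || length(pages) >= max_pages) break
  }
  do.call(rbind, pages)
}

tweets <- collect_pages(fetch_page)
nrow(tweets)
```

In a loop like this the total number of tweets is bounded by the query and the time window (or by `max_pages`), not by a per-request n.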

@ChristophBFH

ChristophBFH commented Mar 3, 2021

Thank you very much for the information and the code. When I run the code with my bearer_token:

tweets <- academictwitteR::get_hashtag_tweets(
  q = "pflege -lang:de",
  start_tweets = "2018-01-01T00:00:00Z",
  end_tweets = "2021-03-02T00:00:00Z",
  data_path = "data/",
  bearer_token = ""
)

It works perfectly, and each request downloads a .json file into the defined folder. Unfortunately, at the end I get the error "Error: Argument 4 can't be a list containing data frames", so the data frame "tweets" is never created. Before I manually merge the individual JSON files into one with the jsonlite package, I wondered whether this is a known issue and whether there is a fix for the function. Kind regards, and once again thank you for the great work!

@JBGruber

JBGruber commented Mar 3, 2021

Probably makes sense to report this here as you are referring to another package. I haven't experienced this. If you want to read in the json files you can try this:

files <- list.files("data/", full.names = TRUE)
df <- purrr::map_df(files, ~ {
  jsonlite::stream_in(file(.x))
  close(.x)
})

@ChristophBFH

files <- list.files("data/", full.names = TRUE)
df <- purrr::map_df(files, ~ {
  jsonlite::stream_in(file(.x))
  close(.x)
})

Thank you very much, and sorry for posting in the wrong place; I am new to GitHub. Your suggested code worked perfectly, but I had to delete close(.x) for a successful run.
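A likely reason the `close(.x)` line had to go: inside the lambda, `.x` is the file path (a character string), not the connection created by `file(.x)`, so there is nothing for `close()` to act on. As the last expression it would also become the return value in place of the parsed data. The sketch below is one possible variant that keeps the connection in a variable; `read_json_pages()` is a hypothetical helper name, and it assumes the files can be read with `jsonlite::stream_in()` as in the snippet above.

```r
# Hypothetical helper (not part of rtweet or academictwitteR): read every
# .json file in a folder and row-bind the results into one data frame.
read_json_pages <- function(data_path) {
  files <- list.files(data_path, pattern = "\\.json$", full.names = TRUE)
  purrr::map_df(files, function(path) {
    con <- file(path)  # keep a handle on the connection itself, not the path
    # Assumption: stream_in() opens the unopened connection and closes it
    # again when it is done, so no explicit close() is needed here.
    # It is the last expression, so the parsed data frame is returned.
    jsonlite::stream_in(con, verbose = FALSE)
  })
}

df <- read_json_pages("data/")
```

Restricting `list.files()` to `.json` files also avoids trying to parse anything else that happens to sit in the folder.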

@hadley
Collaborator

hadley commented Mar 4, 2021

Closing as duplicate of #445 — the academic product track is just a special case of v2.

@hadley hadley closed this as completed Mar 4, 2021