This repository has been archived by the owner on Nov 10, 2024. It is now read-only.
Problem
rtweet does not provide media_url information for most of the tweets that do contain media.
Expected behavior
I want to pull some tweets from Twitter, but I'm only interested in their images, so I add the has:media operator to the query. This returns 460 tweets. Some of these may be retweets, so I filter them out with filter(data, is_retweet == FALSE). This should leave only the original tweets that contain some kind of media. I would then get each media URL from the media_url column. However, media_url is NA for most of these tweets. Why is that? I have checked the tweets on Twitter, and they do contain images.
Reproduce the problem
Example of code.
library(rtweet)
library(dplyr)  # filter(), select(), %>%
library(tidyr)  # unnest()

## I search for tweets with has:media in the query
seagrasstweet30day_media_only <- search_30day(
  q = 'has:media(posidonia OR poseidonia OR #posidonia OR cymodocea OR cymo OR seagrass) (Gloria OR #Gloria OR temporal OR storm OR llevantada)',
  n = 5000,
  env_name = "research")
## Which results in 460 tweets

## I filter the retweets out
tweets_with_media <- seagrasstweet30day_media_only %>%
  filter(is_retweet == FALSE)
## Which results in 24 tweets

## I want to get the urls of the media from all tweets
tweets_with_media %>%
  filter(!is.na(media_url)) %>%
  select(media_url) %>%
  unnest(cols = media_url)
## And I only get media_url information for 3 tweets instead of 24... why?!
rtweet version
Session info
## copy/paste output sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.15.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtweet_0.7.0
loaded via a namespace (and not attached):
[1] httr_1.4.1 compiler_3.6.0 magrittr_1.5 R6_2.4.0 tools_3.6.0 jsonlite_1.6
Token