This repository has been archived by the owner on Nov 10, 2024. It is now read-only.

Tweets with media do not show their media_url #392

Closed
jordipages-repo opened this issue Feb 10, 2020 · 2 comments

jordipages-repo commented Feb 10, 2020

Problem

rtweet does not provide media_url information for most of the tweets that do contain media.

Expected behavior

I want to pull some tweets from Twitter, but I'm only interested in their images, so I add the has:media operator to the query. This returns 460 tweets. Some of these may be retweets, so I filter them out with filter(data, is_retweet == FALSE), which should leave only the original tweets that contain some kind of media. I would then read each media URL from the media_url column. However, media_url is NA for most of these tweets. Why is that? I have checked the tweets on Twitter, and they do contain images.

Reproduce the problem

Example of code.

## I search for tweets with has:media in the query
seagrasstweet30day_media_only <- search_30day(q = 'has:media(posidonia OR poseidonia OR #posidonia OR cymodocea OR cymo OR seagrass) (Gloria OR #Gloria OR temporal OR storm OR llevantada)',
                                    n = 5000,
                                    env_name = "research")
## Which results in 460 tweets
## I filter the retweets out
tweets_with_media <- seagrasstweet30day_media_only %>% 
  filter(is_retweet == F)
## Which results in 24 tweets

## I want to get the urls of the media from all tweets
tweets_with_media %>%
  filter(!is.na(media_url)) %>% 
  select(media_url) %>% 
  unnest()
## And I only get media_url information for 3 tweets instead of 24... why?!

rtweet version

## copy/paste output
packageVersion("rtweet")
## ‘0.7.0’

Session info

## copy/paste output
sessionInfo()

R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.15.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rtweet_0.7.0

loaded via a namespace (and not attached):
[1] httr_1.4.1 compiler_3.6.0 magrittr_1.5 R6_2.4.0 tools_3.6.0 jsonlite_1.6

Token

## copy/paste output
rtweet::get_token()
request: https://api.twitter.com/oauth/request_token
authorize: https://api.twitter.com/oauth/authenticate
access: https://api.twitter.com/oauth/access_token
RtweetAppJordi
key: wfzD9SwRkJD2EqdSmKUl3vv1l
secret: oauth_token, oauth_token_secret, user_id, screen_name
jordipages-repo changed the title from "Tweets with media, do not show their media_url" to "Tweets with media do not show their media_url" on Feb 10, 2020
llrs mentioned this issue Feb 15, 2021
Collaborator

hadley commented Mar 4, 2021

The closest free equivalent of your search looks ok:

library(rtweet)
library(dplyr, warn.conflicts = FALSE)

tw <- search_tweets("filter:links")
tw %>% 
  filter(!is_retweet) %>% 
  select(media_url)
#> # A tibble: 19 x 1
#>    media_url
#>    <list>   
#>  1 <chr [1]>
#>  2 <chr [1]>
#>  3 <chr [1]>
#>  4 <chr [1]>
#>  5 <chr [1]>
#>  6 <chr [1]>
#>  7 <chr [1]>
#>  8 <chr [1]>
#>  9 <chr [1]>
#> 10 <chr [1]>
#> 11 <chr [1]>
#> 12 <chr [1]>
#> 13 <chr [1]>
#> 14 <chr [1]>
#> 15 <chr [1]>
#> 16 <chr [1]>
#> 17 <chr [1]>
#> 18 <chr [1]>
#> 19 <chr [1]>

Created on 2021-03-04 by the reprex package (v1.0.0)
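
The `<chr [1]>` summaries in the output above do not show whether an element holds a URL or a single NA. A minimal sketch of one way to check, on made-up data: it assumes media_url is a list column in which tweets without media hold a length-one NA (which is what the `!is.na(media_url)` filter in the original report implies). The `tweets` data frame and its example.com URLs are hypothetical stand-ins, not real rtweet output.

```r
# Toy stand-in for an rtweet result; the URLs are invented for illustration.
tweets <- data.frame(status_id = c("1", "2", "3"), stringsAsFactors = FALSE)
tweets$media_url <- list(
  "http://example.com/a.jpg",                                # one image
  NA_character_,                                             # no media recorded
  c("http://example.com/b.jpg", "http://example.com/c.jpg")  # two images
)

# TRUE for rows whose media_url element contains at least one non-NA value
has_media <- !vapply(tweets$media_url, function(u) all(is.na(u)), logical(1))

# Count rows with and without media URLs
print(table(has_media))

# Flatten the URLs of the rows that have media into a character vector
urls <- unlist(tweets$media_url[has_media])
print(urls)
```

Running the same `has_media` check on a real result would show how many rows are NA before any unnesting, which separates "the API returned no media entity" from "the unnest step dropped rows".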

I suspect the problem is with your search term. I couldn't find much documentation for has:media, but your query should probably look more like this:

has:media (posidonia OR poseidonia OR #posidonia OR cymodocea OR cymo OR seagrass OR Gloria OR #Gloria OR temporal OR storm OR llevantada)
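
A query of that shape can also be assembled programmatically, so the space between the operator and the keyword group is not lost by accident. This is only a sketch: `build_media_query()` is a hypothetical helper, not an rtweet function, and whether has:media is accepted by a given search endpoint is not confirmed in this thread.

```r
# Hypothetical helper (not part of rtweet): joins keywords with OR inside one
# parenthesised group and puts a space between the operator and the group.
build_media_query <- function(terms, operator = "has:media") {
  group <- paste0("(", paste(terms, collapse = " OR "), ")")
  paste(operator, group)  # paste() separates the two parts with a space
}

terms <- c("posidonia", "poseidonia", "#posidonia", "cymodocea", "cymo",
           "seagrass", "Gloria", "#Gloria", "temporal", "storm", "llevantada")

query <- build_media_query(terms)
print(query)
```

The resulting string could then be passed as the `q` argument of the search call used in the original report.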

hadley closed this as completed Mar 4, 2021