get_friends / get_follows consistency #308

alexpghayes · 2019-01-30T18:40:33Z

get_friends() returns a tibble with two columns, but get_followers() only returns a tibble with a single column. It would be nice if they both returned the same columns. Always returning the two columns user and user_id would make it easy to build graphs from edge lists.


alexpghayes · 2021-03-14T21:46:28Z

Side note: if there's going to be a breaking change here, I would love if we could get explicit from and to column names rather than user and user_id.

Arf9999 · 2021-03-15T06:10:29Z

Not to be difficult, but to avoid some breaking changes, it would be better not to change any column names. If some reference is required, rather add columns such as "follower_of" or "friend_of" which provide reference to a particular user_id for network mapping if required. Converting to edgelists is then fairly simple and there is no breaking change in terms of referencing the user_id column for other functions.

(Speaking very selfishly... I fear that this would break a few of my existing analysis functions... and I'd prefer not to try to figure out what I was doing two years ago to get them to work again.)

alexpghayes · 2021-07-15T13:43:40Z

On top of this it seems like get_friends() only returns a single column when the requested user doesn't follow anyone.

rtweet::get_friends("hadleywickham")
#> # A tibble: 274 x 2
#>    user          ids                
#>    <chr>         <chr>              
#>  1 hadleywickham 1344499345773752321
#>  2 hadleywickham 3959153969         
#>  3 hadleywickham 43875304           
#>  4 hadleywickham 2423861950         
#>  5 hadleywickham 793171723772395521 
#>  6 hadleywickham 1218534623959044096
#>  7 hadleywickham 2798914670         
#>  8 hadleywickham 319822761          
#>  9 hadleywickham 43470348           
#> 10 hadleywickham 326511843          
#> # … with 264 more rows

rtweet::get_friends("SmallBuStudio")
#> # A tibble: 1 x 1
#>   user         
#>   <chr>        
#> 1 SmallBuStudio

Created on 2021-07-15 by the reprex package (v2.0.0)

llrs · 2021-11-20T09:38:56Z

I like the idea of making them more consistent and it is an easy change, but I don't like to add new columns with the same information.

It might be worth adding a first column with the user to get_followers, as in get_friends; that way the second column will always be the "to" user and the first one the "from" user. The column names can be confusing, and people who are not paying attention might think that user_id is the ID of the user in the first column, so a name change might be useful.

For the moment, when a user has no friends the output will be a 0 x 2 tibble, which makes rbind easy.
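A minimal sketch of why a 0 x 2 result helps, using plain data frames as hypothetical stand-ins for the tibbles (no API calls are made, and the from_id and to_id column names are used only for illustration):

```r
# Hypothetical stand-in for a get_friends() result with some friends.
friends_some <- data.frame(
  from_id = c("hadleywickham", "hadleywickham"),
  to_id   = c("34677653", "235261861"),
  stringsAsFactors = FALSE
)

# A user who follows nobody: zero rows, but the same two columns.
friends_none <- data.frame(
  from_id = character(0),
  to_id   = character(0),
  stringsAsFactors = FALSE
)

# Row-binding works because both inputs have the same columns.
edges <- rbind(friends_some, friends_none)

# A one-column result (like the 1 x 1 output reported earlier) cannot be
# row-bound to a two-column result; rbind() signals an error instead.
one_col <- data.frame(user = "SmallBuStudio", stringsAsFactors = FALSE)
mismatch <- tryCatch(rbind(friends_some, one_col), error = function(e) e)
```

With this shape, results for many users can be combined in a loop without special-casing accounts that follow nobody.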

For reference, on 0.7.0 they returned this:

> rtweet::get_friends("SmallBuStudio")
## A tibble: 0 × 0
> rtweet::get_friends("hadleywickham")
## A tibble: 279 × 2
#    user          user_id            
#    <chr>         <chr>              
#  1 hadleywickham 1215516763024003074
#  2 hadleywickham 34677653           
#  3 hadleywickham 911618422483640320 
#  4 hadleywickham 877452117581213696 
#  5 hadleywickham 235261861          
#  6 hadleywickham 935427373620658176 
#  7 hadleywickham 13202482           
#  8 hadleywickham 1344499345773752321
#  9 hadleywickham 3959153969         
# 10 hadleywickham 43875304           
## … with 269 more rows
> rtweet::get_followers("hadleywickham")
## A tibble: 5,000 × 1
#   user_id            
#   <chr>              
# 1 1461812126457253895
# 2 1460159073039618048
# 3 1253866294358585348
# 4 552589054          
# 5 1302715713052794880
# 6 1268043412239835138
# 7 452215809          
# 8 831911196840300547 
# 9 1461893465885741063
#10 3181675553         
## … with 4,990 more rows

alexpghayes · 2021-11-21T20:11:38Z

Column-name-wise, we could also go with something like from_id and to_id for more clarity.

alexpghayes · 2021-11-23T23:40:46Z

So happy to see this happen 🎉 !

llrs · 2021-11-23T23:48:42Z

I went with your suggestion because it made it clear which account is following which. Glad to close an old issue 😃 .

alexpghayes · 2021-11-24T02:18:20Z

Upon further thought, I don't think the _id suffix is a good idea. Sometimes you get a screen name and sometimes you get a user ID, and the _id suffix makes it seem like you always get a user ID. For example, this is confusing here, and uses the _id suffix in a way that is not consistent with the v2 API.

rtweet::get_friends("hadleywickham")
#> # A tibble: 279 × 2
#>    from_id       to_id              
#>    <chr>         <chr>              
#>  1 hadleywickham 1215516763024003074
#>  2 hadleywickham 34677653           
#>  3 hadleywickham 911618422483640320 
#>  4 hadleywickham 877452117581213696 
#>  5 hadleywickham 235261861          
#>  6 hadleywickham 935427373620658176 
#>  7 hadleywickham 13202482           
#>  8 hadleywickham 1344499345773752321
#>  9 hadleywickham 3959153969         
#> 10 hadleywickham 43875304           
#> # … with 269 more rows

Created on 2021-11-23 by the reprex package (v2.0.1.9000)

llrs · 2021-11-24T08:26:43Z

Yes, I briefly considered changing to from_user and to_user, which is neither screen_name nor id.

I see that v2 returns id, name and screen name. I don't think there is any way to make the result consistent between API versions, but if you have an idea you can send a PR and I will review it to get it merged.

Arf9999 · 2021-11-24T08:37:41Z

For the tibble returned for this request, the ideal (for me) would be some kind of naming consistency for the IDs returned as followers/friends, and a column indicating the account friended/followed. So I'd prefer the queried-account column to be called 'follower_of' or 'friend_of' and the returned friends/followers to be called 'user_id' (or whatever the account ID is named in other rtweet tibbles returned by searches or lookups). That way we only add one new naming convention per query rather than two. I hope my explanation is clear.
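To make the proposal concrete, here is a rough sketch of how such tibbles could still be turned into a directed edge list. The data frames are hypothetical stand-ins (rtweet does not return these columns), and the edge direction is my reading: the queried account follows its friends, and followers follow the queried account.

```r
# Hypothetical get_friends()-style result under the proposed naming.
friends <- data.frame(
  friend_of = c("alice", "alice"),
  user_id   = c("101", "102"),
  stringsAsFactors = FALSE
)

# Hypothetical get_followers()-style result under the proposed naming.
followers <- data.frame(
  follower_of = "alice",
  user_id     = "103",
  stringsAsFactors = FALSE
)

# Directed edge list: "from" follows "to".
edges <- rbind(
  # The queried account follows each of its friends.
  data.frame(from = friends$friend_of, to = friends$user_id,
             stringsAsFactors = FALSE),
  # Each follower follows the queried account.
  data.frame(from = followers$user_id, to = followers$follower_of,
             stringsAsFactors = FALSE)
)
```

The key column keeps the name user_id in both results, and only the context column differs.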

llrs · 2021-11-24T08:49:22Z

@Arf9999 I am not sure I understand what kind of consistency you want or why. Currently the output is consistent between followers and friends: they have the same column names. Do you want different names in get_followers and get_friends so that you can tell where the data comes from?

I don't follow your comment about one naming convention per query. As I wrote, API v2 will return different fields (column names) that cannot be obtained now without more API calls, and the current API v1 response for friends and followers is not consistent with the other endpoints, so we might as well choose which names are used in get_friends and get_followers.

Arf9999 · 2021-11-24T09:13:17Z

Current (0.7) rtweet returns two columns for get_friends, 'user' and 'user_id', and a single column for get_followers, 'user_id'. So, although the structure between the two is inconsistent, the key column is similarly named. I'd prefer to keep that consistency. An additional column for context can be named accordingly, but I believe the key column should be named consistently across tibbles returned by the package, as it allows for simpler joins between different query responses. What that consistent name should be is negotiable based on the API return from V1 or V2 queries. I'm happy with the current (0.7) naming because I have existing functions that use it, and changing it will break those functions. But that's just my selfish view.

llrs · 2021-11-24T16:48:29Z

To summarize, you want to have a column that is equally named on both get_friends and get_followers even if the "direction" of the relationship is not the same, while Alex wants both to have the same column names.

@Arf9999 What are those simpler joins you want to keep, and how hard are they currently as of 0.7.0.9008?
I don't see a simple way to reconcile these two ideas; perhaps you could send a PR with what you have in mind?

Arf9999 · 2021-11-25T09:32:26Z

> To summarize, you want to have a column that is equally named on both get_friends and get_followers even if the "direction" of the relationship is not the same, while Alex wants both to have the same column names.
>
> @Arf9999 What are those simpler joins you want to keep and how hard are they currently as on 0.7.0.9008? I don't see a simple way to reconcile these two ideas, perhaps you could send a PR with what you have in mind?

I think Alex wants a simple way to build networks, while I want to use these tibbles to examine things like the Jaccard Index between accounts and also to easily join these tibbles to the responses from a (for example) search_tweets or lookup_user query.

The joins are very easy in 0.7.0 because the key column is the same in both, i.e. 'user_id', so a dplyr join simply requires by = "user_id" consistently across almost all rtweet tibbles.
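A rough sketch of both uses, with made-up data standing in for rtweet results. It uses base R's merge() so that it runs without extra packages; a dplyr join with by = "user_id" would do the same job. The jaccard() helper and all data below are hypothetical.

```r
# Hypothetical follower lists for two accounts, keyed by user_id.
followers_a <- data.frame(user_id = c("1", "2", "3", "4"),
                          stringsAsFactors = FALSE)
followers_b <- data.frame(user_id = c("3", "4", "5"),
                          stringsAsFactors = FALSE)

# Jaccard index: size of the intersection over size of the union.
# Returns NaN when both inputs are empty.
jaccard <- function(x, y) {
  length(intersect(x, y)) / length(union(x, y))
}
j <- jaccard(followers_a$user_id, followers_b$user_id)

# Hypothetical lookup result that shares the user_id key.
users <- data.frame(
  user_id     = c("3", "4", "9"),
  screen_name = c("carol", "dave", "erin"),
  stringsAsFactors = FALSE
)

# Followers shared by both accounts, then enriched with user details.
shared       <- merge(followers_a, followers_b, by = "user_id")
shared_named <- merge(shared, users, by = "user_id")
```

Because every table uses the same key name, the by argument never changes from one join to the next.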

briatte mentioned this issue Jun 26, 2019

Submission: rtweet ropensci/software-review#302

Closed

25 tasks

llrs mentioned this issue Feb 15, 2021

Update roadmap #471

Closed

llrs added documentation enhancement labels Feb 15, 2021

llrs closed this as completed in 1c6025b Nov 23, 2021

llrs mentioned this issue Nov 27, 2021

Adjust naming convention for follower and friend columns #636

Closed

llrs mentioned this issue May 6, 2022

get_friends() and get_followers() return objects with different column names. #594

Closed
