The fct_lump_ functions are all about deciding which levels to keep or lump together based on their frequency, so I think that they should also have an option (e.g. sort = c("no", "asc", "desc")) to return those levels in an order that is based on their frequency.
Edit: Or perhaps add an example to the man page for these functions showing that you can use fct_infreq() and its friends to do the ordering. However, doing this requires a fct_relevel("Other", after = Inf) at the end to ensure "Other" is the last level.
Thank you for a great package! <3
library(forcats)
set.seed(12345)
repeated_states <- rep.int(x = state.name, times = runif(n = length(state.name), min = 1, max = 300))
sort(table(repeated_states), decreasing = TRUE)
#> repeated_states
#>        Georgia      Minnesota       Maryland          Texas   Pennsylvania
#>            296            289            285            278            271
#>       Arkansas         Alaska         Oregon     New Mexico   South Dakota
#>            265            262            260            238            234
#>           Utah        Arizona       Illinois        Florida        Alabama
#>            232            228            220            218            216
#>    Mississippi       Nebraska   North Dakota       Missouri        Wyoming
#>            212            209            204            193            188
#>   Rhode Island         Nevada       Delaware     New Jersey         Kansas
#>            185            163            153            145            139
#>     California  Massachusetts      Tennessee      Louisiana           Iowa
#>            137            136            129            121            117
#>       Kentucky        Montana           Ohio       Oklahoma    Connecticut
#>            117            117            111            109             98
#>       Michigan       Virginia        Vermont  New Hampshire North Carolina
#>             98             97             78             68             57
#>          Maine       Colorado          Idaho South Carolina     Washington
#>             54             50             46             41             18
#>      Wisconsin  West Virginia         Hawaii       New York        Indiana
#>             17             13             11              2              1

as_fct <- fct_lump_n(repeated_states, 10)
levels(as_fct)
#>  [1] "Alaska"       "Arkansas"     "Georgia"      "Maryland"     "Minnesota"
#>  [6] "New Mexico"   "Oregon"       "Pennsylvania" "South Dakota" "Texas"
#> [11] "Other"

as_ordered_fct <- fct_lump_n(repeated_states, 10) |>
  fct_infreq() |>
  fct_relevel("Other", after = Inf)
levels(as_ordered_fct)
#>  [1] "Georgia"      "Minnesota"    "Maryland"     "Texas"        "Pennsylvania"
#>  [6] "Arkansas"     "Alaska"       "Oregon"       "New Mexico"   "South Dakota"
#> [11] "Other"
Created on 2024-12-17 with reprex v2.1.1
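To make the request concrete, here is a rough sketch of how the proposed sort argument might behave, written as a wrapper around existing forcats functions. The name fct_lump_n_sorted() and its sort argument are hypothetical and not part of forcats; the sketch only chains fct_lump_n(), fct_infreq(), fct_rev() and fct_relevel(), the same workaround as in the reprex above.

```r
library(forcats)

# Hypothetical helper, NOT part of forcats: lump with fct_lump_n(), then
# optionally reorder the kept levels by frequency, keeping "Other" last.
fct_lump_n_sorted <- function(f, n, sort = c("no", "asc", "desc"),
                              other_level = "Other") {
  sort <- match.arg(sort)
  lumped <- fct_lump_n(f, n, other_level = other_level)
  if (sort == "no") {
    return(lumped)
  }
  ordered <- fct_infreq(lumped)    # most frequent level first
  if (sort == "asc") {
    ordered <- fct_rev(ordered)    # least frequent level first
  }
  # Only move the "Other" level if lumping actually created one.
  if (other_level %in% levels(ordered)) {
    ordered <- fct_relevel(ordered, other_level, after = Inf)
  }
  ordered
}

# Small made-up input: a = 5, b = 3, c = 8, d = 1, e = 1
x <- c(rep("a", 5), rep("b", 3), rep("c", 8), "d", "e")
levels(fct_lump_n_sorted(x, 3, sort = "desc"))
#> [1] "c"     "a"     "b"     "Other"
```

With sort = "no" the wrapper returns exactly what fct_lump_n() returns today, so a built-in option along these lines would presumably be backwards compatible.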