case_when should return an error with zero length RHS values #4170

Closed
BenjaminLouis opened this issue Feb 8, 2019 · 6 comments

Comments

@BenjaminLouis

With these three examples of case_when usage, I expected the behavior of examples 1 and 3, but not that of the second one:

library(dplyr, warn.conflicts = FALSE)
library(stringr)

xx <- c("a", "b")
zz <- c("a", "b", "Autre")
# Example 1
case_when(
  length(setdiff(xx, zz)) == 0 ~ NA_character_,
  length(setdiff(xx, zz)) != 0 ~ str_c(xx, collapse = ";")
)
#> [1] NA

# Example 2
case_when(
  length(setdiff(xx, zz)) == 0 ~ NA_character_,
  length(setdiff(xx, zz)) != 0 ~ str_c(setdiff(xx, zz), collapse = ";")
)
#> character(0)

# Example 3
case_when(
  length(setdiff(xx, zz)) == 0 ~ NA_character_,
  length(setdiff(xx, zz)) != 0 ~ paste(setdiff(xx, zz), collapse = ";")
)
#> [1] NA

I think the problem is related to issue #3246, since str_c(character(0), collapse = ";") returns character(0) whereas paste(character(0), collapse = ";") returns "". It might be useful for users of case_when if the function returned an error when RHS values have zero length.
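To make the paste() side of that comparison concrete, here is a small base R sketch. It only uses base R calls, so the str_c() side is not reproduced here; the variable names are illustrative and not taken from the reprex above.

```r
# An empty character vector, like setdiff(xx, zz) in the examples above
empty <- character(0)

# With `collapse`, paste() always returns a single string, even for empty input
collapsed <- paste(empty, collapse = ";")
length(collapsed)
#> [1] 1
identical(collapsed, "")
#> [1] TRUE

# Without `collapse`, paste() keeps the zero length
length(paste(empty))
#> [1] 0
```

Because the collapsed value has length 1, it can be recycled like any other scalar RHS, which is consistent with example 3 returning NA.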

@krlmlr
Member

krlmlr commented Feb 15, 2019

The current behavior of case_when() seems consistent with the tidyverse recycling rules: length 1 can be recycled, everything else (including length 0) remains unchanged.

Can you please simplify the reprex to use constants instead of complicated expressions?

@cderv
Contributor

cderv commented Feb 17, 2019

Here is a minimal reprex of what I believe is the issue:

library(dplyr, warn.conflicts = FALSE)
packageVersion("dplyr")
#> [1] '0.8.0.9000'
xx <- c("a")
dplyr::case_when(
  xx  == "a" ~ "A",
  xx  == "b" ~ character(0)
)
#> character(0)

Created on 2019-02-17 by the reprex package (v0.2.1)

Should this be an error, or "A"?
I don't think it should return character(0).

@krlmlr
Member

krlmlr commented Feb 17, 2019

Thanks, Christophe. I still think this is consistent with the recycling rules, but I agree it's confusing. We accept vectors on each LHS and RHS, and will recycle length-1 vectors to a common length, if possible. In your example, 0 is the only length different from 1, and this is what is used for the result:

library(dplyr, warn.conflicts = FALSE)

my_recode <- function(xx, yy) {
  dplyr::case_when(
    xx == "a" ~ "A",
    xx == "b" ~ yy
  )
}

my_recode("a", "B")
#> [1] "A"
my_recode("a", LETTERS)
#>  [1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A"
#> [18] "A" "A" "A" "A" "A" "A" "A" "A" "A"
my_recode("a", character(0))
#> character(0)
my_recode(letters, character(0))
#> Error: `xx == "b" ~ yy` must be length 26 or one, not 0
#> Backtrace:
#>     █
#>  1. └─global::my_recode(letters, character(0))
#>  2.   └─dplyr::case_when(xx == "a" ~ "A", xx == "b" ~ yy)
#>  3.     └─dplyr:::validate_case_when_length(query, value, fs) /home/kirill/git/R/dplyr/R/case_when.R:160:2
#>  4.       └─dplyr:::bad_calls(...) /home/kirill/git/R/dplyr/R/case_when.R:230:2
#>  5.         └─dplyr:::glubort(fmt_calls(calls), ..., .envir = .envir) /home/kirill/git/R/dplyr/R/error.R:29:2

Created on 2019-02-17 by the reprex package (v0.2.1.9000)
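The rule described above can be sketched in base R. This is only an illustration of "length 1 can be recycled, everything else stays as-is", not dplyr's actual implementation; `common_length()` is a hypothetical helper that does not exist in dplyr.

```r
# Hypothetical helper (not part of dplyr): compute the common length of
# several vectors when only length-1 inputs may be recycled.
common_length <- function(...) {
  lens <- lengths(list(...))
  # Every length other than 1 is kept unchanged, including 0
  other <- unique(lens[lens != 1L])
  if (length(other) == 0L) {
    return(1L)
  }
  if (length(other) > 1L) {
    stop("incompatible lengths: ", paste(other, collapse = ", "))
  }
  other
}

# Mirrors my_recode("a", "B"): all inputs have length 1
common_length(TRUE, "A", FALSE, "B")
#> [1] 1
# Mirrors my_recode("a", LETTERS)
common_length(TRUE, "A", FALSE, LETTERS)
#> [1] 26
# Mirrors my_recode("a", character(0)): 0 is the only length different from 1
common_length(TRUE, "A", FALSE, character(0))
#> [1] 0
```

Under the same sketch, mixing lengths 26 and 0 (as in my_recode(letters, character(0))) raises an error, because two different lengths other than 1 are present.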

@cderv
Contributor

cderv commented Feb 17, 2019

OK, that is clearer. I believe this is the part of the documentation that explains it:

The case of n == 0 is treated as a variant of n != 1
Considering the tidyverse recycling rule, case_when behavior is ok.

But yes, it is a bit confusing to get character(0) as the result: since the first condition is met, we expect something, not nothing.

Maybe the documentation could be improved with some examples and your explanation:

the tidyverse recycling rules: length 1 can be recycled, everything else (including length 0) remains unchanged.
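For users who would rather get an error than a silent character(0), a minimal sketch of a guard in user code could look like this. `check_not_empty()` is a hypothetical helper, not a dplyr or tidyverse function; it only assumes base R.

```r
# Hypothetical helper (not part of dplyr): fail early when a value that is
# about to be used as a RHS has length 0, otherwise return it unchanged.
check_not_empty <- function(x, arg = deparse(substitute(x))) {
  if (length(x) == 0L) {
    stop("`", arg, "` has length 0", call. = FALSE)
  }
  x
}

check_not_empty("B")
#> [1] "B"
```

The idea would be to wrap a suspicious RHS, for example `xx == "b" ~ check_not_empty(yy)`, so that an empty `yy` stops with an error instead of shrinking the whole result to length 0.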

@hadley
Member

hadley commented May 27, 2019

We'll handle this centrally for the tidyverse: tidyverse/design#13

@hadley closed this as completed May 27, 2019
@lock

lock bot commented Nov 23, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock bot locked and limited conversation to collaborators Nov 23, 2019