[SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0 #27359
Conversation
Test build #117385 has finished for PR 27359 at commit
Test build #117389 has finished for PR 27359 at commit
Test build #117392 has finished for PR 27359 at commit
Test build #117395 has finished for PR 27359 at commit
Test build #117397 has finished for PR 27359 at commit
Test build #117399 has finished for PR 27359 at commit
Test build #117402 has finished for PR 27359 at commit
Assuming that the Windows test passes (it did previously), this is ready for review. Unfortunately, I couldn't figure out how to correctly install testthat 2.0 on Jenkins from inside the test runner ‒ some configurations that worked locally (on the official R-base Docker image with 3.1.3) still failed here, and I don't want to throttle Jenkins any more with my lousy attempts (sorry for that). So I added a fallback case, so that for now it runs on both testthat 1.x and 2.x. That means AppVeyor runs with 2.x and Jenkins with 1.x ‒ I guess it might be considered a good thing for now (@shaneknapp?). The only caveat is that we'll have to pay more attention to AppVeyor builds until Jenkins is upgraded, as it has become clear that the current setup can silence some errors.
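A minimal sketch of what such a fallback in `R/pkg/tests/run-all.R` can look like. Hedged: `testthat:::run_tests` and `testthat:::test_package_dir` are internal, non-stable APIs, and the arguments shown here are illustrative, not the exact change in this PR.

```
# Illustrative fallback: dispatch on the installed testthat version.
# testthat:::run_tests was removed in 2.x; test_package_dir is its
# closest counterpart there.
if (packageVersion("testthat") >= "2.0.0") {
  test_runner <- testthat:::test_package_dir
  reporter <- testthat::default_reporter()
} else {
  test_runner <- testthat:::run_tests
  reporter <- "summary"
}

test_dir <- file.path(Sys.getenv("SPARK_HOME"), "R", "pkg", "tests", "fulltests")
test_runner("SparkR", test_dir, NULL, reporter)
```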
I am all good as long as we can use the latest `testthat`.
Following @dongjoon-hyun's advice, I moved the commits fixing test issues discovered here out of this PR. Could you take a look at #27362 and see if it makes sense to you? As for the other one, I'll try to debug it over the next few days and see if I can find a fix; if not, the question remains whether it should be a blocker for the testthat upgrade. It feels a bit like sweeping things under the rug.
@zero323, so basically you couldn't get the CRAN check to work with R 3.1.3? If that's the case, does it work with the latest R 3.4.x (at least locally or on any Linux-based OS)? Spark will deprecate R versions lower than 3.4 in Spark 3.0, and we will drop them right after the Spark 3.0 release. So, the R version in Jenkins will ideally increase to the latest 3.4.x.
As long as the fix for SPARK-30645 is applied and the test for cleanClosure on recursive calls (SPARK-30629) is disabled, everything looks green, i.e.
About "I couldn't figure out" ‒ I tried to
Test build #117406 has finished for PR 27359 at commit
…nal file

### What changes were proposed in this pull request?

Reference data for "collect() support Unicode characters" has been moved to an external file, to make the test OS- and locale-independent.

### Why are the changes needed?

As-is, the embedded data is not properly encoded on Windows:

```
library(SparkR)
SparkR::sparkR.session()

Sys.info()
#          sysname          release          version
#        "Windows"     "Server x64"    "build 17763"
#         nodename          machine            login
# "WIN-5BLT6Q610KH"         "x86-64"  "Administrator"
#             user   effective_user
#  "Administrator"  "Administrator"

Sys.getlocale()
# [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

lines <- c("{\"name\":\"안녕하세요\"}",
           "{\"name\":\"您好\", \"age\":30}",
           "{\"name\":\"こんにちは\", \"age\":19}",
           "{\"name\":\"Xin chào\"}")

jsonPath <- tempfile(pattern = "sparkr-test", fileext = ".tmp")
writeLines(lines, jsonPath)

system(paste0("cat ", jsonPath))
# {"name":"<U+C548><U+B155><U+D558><U+C138><U+C694>"}
# {"name":"<U+60A8><U+597D>", "age":30}
# {"name":"<U+3053><U+3093><U+306B><U+3061><U+306F>", "age":19}
# {"name":"Xin chào"}
# [1] 0

df <- read.df(jsonPath, "json")

printSchema(df)
# root
#  |-- _corrupt_record: string (nullable = true)
#  |-- age: long (nullable = true)
#  |-- name: string (nullable = true)

head(df)
#             _corrupt_record age                                      name
# 1                      <NA>  NA <U+C548><U+B155><U+D558><U+C138><U+C694>
# 2                      <NA>  30                          <U+60A8><U+597D>
# 3                      <NA>  19  <U+3053><U+3093><U+306B><U+3061><U+306F>
# 4 {"name":"Xin ch<U+FFFD>o"}  NA                                      <NA>
```

This can be reproduced outside tests (Windows Server 2019, English locale), and causes failures when `testthat` is updated to 2.x (#27359). Somehow the problem is not picked up when the test is executed on `testthat` 1.0.2.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Running the modified test; manual testing.

### Note

An alternative seems to be to use bytes, but it hasn't been properly tested.

```
test_that("collect() support Unicode characters", {
  lines <- markUtf8(c(
    '{"name": "안녕하세요"}',
    '{"name": "您好", "age": 30}',
    '{"name": "こんにちは", "age": 19}',
    '{"name": "Xin ch\xc3\xa0o"}'
  ))

  jsonPath <- tempfile(pattern = "sparkr-test", fileext = ".tmp")
  writeLines(lines, jsonPath, useBytes = TRUE)

  expected <- regmatches(lines, regexec('(?<="name": ").*?(?=")', lines, perl = TRUE))

  df <- read.df(jsonPath, "json")
  rdf <- collect(df)
  expect_true(is.data.frame(rdf))
  rdf$name <- markUtf8(rdf$name)

  expect_equal(rdf$name[1], expected[[1]])
  expect_equal(rdf$name[2], expected[[2]])
  expect_equal(rdf$name[3], expected[[3]])
  expect_equal(rdf$name[4], expected[[4]])

  df1 <- createDataFrame(rdf)
  expect_equal(
    collect(
      where(df1, df1$name == expected[[2]])
    )$name,
    expected[[2]]
  )
})
```

Closes #27362 from zero323/SPARK-30645.

Authored-by: zero323 <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
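To illustrate the direction of the fix, a minimal sketch, run within an active SparkR session (the file path is hypothetical): keeping the reference strings in an external UTF-8 file and marking the encoding on read sidesteps the locale-dependent re-encoding that mangles embedded literals on Windows.

```
# Hypothetical layout: reference data lives in an external UTF-8 file
# instead of being embedded in the test source.
jsonPath <- file.path("data", "test_utils_utf.json")

# An explicit encoding marks the strings as UTF-8 rather than letting R
# reinterpret the bytes in the native locale (e.g. CP1252 on Windows).
expected <- readLines(jsonPath, encoding = "UTF-8")
Encoding(expected)  # "UTF-8" for the non-ASCII entries

# The file is fed to read.df() directly, with no writeLines() round trip
# through the native encoding.
df <- read.df(jsonPath, "json")
```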
@zero323, do you mind if I ask you to check the latest R 3.4.x and
Not at all, unless you ask for a Windows build :) Seriously though, I've had enough problems in the "works for me" category with this thing to be careful. Once #27363 is merged, I'll rebase and make another local build in Docker to be sure.
…rsive calls

### What changes were proposed in this pull request?

Disabling the test for cleaning the closure of a recursive function.

### Why are the changes needed?

As of 9514b82 this test is no longer valid, and recursive calls, even simple ones:

```
f <- function(x) {
  if (x > 0) {
    f(x - 1)
  } else {
    x
  }
}
```

lead to

```
Error: node stack overflow
```

This issue is silenced when tested with `testthat` 1.x (reason unknown), but causes failures when using `testthat` 2.x (the issue can be reproduced outside the test context). The problem is known and tracked by [SPARK-30629](https://issues.apache.org/jira/browse/SPARK-30629).

Therefore, keeping this test active doesn't make sense, as it will lead to continuous test failures when `testthat` is updated (#27359 / SPARK-23435).

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests.

CC falaki

Closes #27363 from zero323/SPARK-29777-FOLLOWUP.

Authored-by: zero323 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Unfortunately, R Arrow is not a standalone package (like the Python one), and it requires system packages with C++ bindings (installing
I don't think that really affects the results though, as the primary concern was CRAN tests and the overall process, and Arrow-related code hasn't been touched.
We cannot say
Especially, for the following.
Thank you, I appreciate that. In general, full reproducibility is defined by the Dockerfile shown at the beginning, but to put it here for reference:
It can be re-run to confirm that it reflects the current state of things. As shown in the cast, builds are done directly from the head of this branch (the signature is verified), with no changes to the codebase beyond what is proposed in this PR (and we don't touch any Arrow-related components here at all). As for skipping Arrow tests ‒ that's the default behavior defined in the respective tests, for example here
and the following lines, so it is neither a failure nor the result of any source modification (a sketch of this pattern follows below). Can we make Arrow tests run? Possibly, but:
At this point, Spark has no coverage for any intermediate R version (Jenkins runs 3.1, and then we have a gap of almost eight years' worth of releases to 3.6 on AppVeyor), not to mention version-OS combinations. That's troubling and, as the work related to this PR has shown, can miss obvious errors, but it is not something that can really be addressed by running ad-hoc tests outside the project infrastructure. Anyway... If you have specific concerns about the process used here, and you suspect that the proposed changes can lead to problems in the future, I'll do my best to address them.
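For reference, the default skip behavior described above follows the standard testthat pattern; a minimal sketch (the test name is illustrative, not the actual SparkR test):

```
library(testthat)

test_that("Arrow optimization in collect()", {
  # When the arrow package is not installed, the body is skipped rather
  # than failed, so no source modification is needed to pass without Arrow.
  skip_if_not_installed("arrow")

  # ... Arrow-dependent assertions would go here ...
  expect_true(TRUE)
})
```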
@zero323, it seems that you missed my point. I advised the following.
Let me rephrase my words: "In the PR description, write that you didn't run the full test. Especially, Arrow tests are skipped." That was my request.
LGTM.
Arrow is a valid point, although AppVeyor (with the latest R and CRAN checks disabled) tests it. It would be best just to test it out if that's feasible, considering the importance of this PR and that changing the env in Jenkins is difficult to revert if anything goes wrong.
If there are some technical difficulties, I am okay without testing it out, because the required code change (`expect_true(TRUE)` in the tests) seems not to be required for Arrow tests, logically at least.
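For context, a minimal sketch of why those trivial assertions exist: under testthat 2.x a `test_that()` block that finishes without registering a single expectation is reported as skipped, so smoke tests whose only contract is "this code does not throw" gain a no-op expectation (the test name and body below are illustrative):

```
library(testthat)

test_that("session helper runs without error", {
  # The real tests call SparkR helpers whose only contract is not throwing;
  # a stand-in computation is used here.
  result <- tryCatch(sqrt(4), error = function(e) e)

  # Without at least one expectation, testthat 2.x reports this test as
  # skipped; the trivial assertion marks it as run and passing.
  expect_true(TRUE)
})
```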
Okay, given #27359 (comment), I am okay with not trying the Arrow installation out.
@shaneknapp managed to install Arrow on R 3.1 (surprisingly..). Although it was a while ago and Arrow R was at an early stage, I remember I badly failed to install Arrow on some low R versions.
I think we can verify it later when we actually upgrade the R version. It seems it should be easy to fix if there's anything wrong with the Arrow tests.
Also, as a reference, I found a test case that has a similar structure to the Arrow test case:
spark/R/pkg/tests/fulltests/test_sparkSQL.R
Lines 3327 to 3376 in d69ed9a
test_that("repartition by columns on DataFrame", { | |
# The tasks here launch R workers with shuffles. So, we decrease the number of shuffle | |
# partitions to reduce the number of the tasks to speed up the test. This is particularly | |
# slow on Windows because the R workers are unable to be forked. See also SPARK-21693. | |
conf <- callJMethod(sparkSession, "conf") | |
shufflepartitionsvalue <- callJMethod(conf, "get", "spark.sql.shuffle.partitions") | |
callJMethod(conf, "set", "spark.sql.shuffle.partitions", "5") | |
tryCatch({ | |
df <- createDataFrame( | |
list(list(1L, 1, "1", 0.1), list(1L, 2, "2", 0.2), list(3L, 3, "3", 0.3)), | |
c("a", "b", "c", "d")) | |
# no column and number of partitions specified | |
retError <- tryCatch(repartition(df), error = function(e) e) | |
expect_equal(grepl | |
("Please, specify the number of partitions and/or a column\\(s\\)", retError), TRUE) | |
# repartition by column and number of partitions | |
actual <- repartition(df, 3, col = df$"a") | |
# Checking that at least the dimensions are identical | |
expect_identical(dim(df), dim(actual)) | |
expect_equal(getNumPartitions(actual), 3L) | |
# repartition by number of partitions | |
actual <- repartition(df, 13L) | |
expect_identical(dim(df), dim(actual)) | |
expect_equal(getNumPartitions(actual), 13L) | |
expect_equal(getNumPartitions(coalesce(actual, 1L)), 1L) | |
# a test case with a column and dapply | |
schema <- structType(structField("a", "integer"), structField("avg", "double")) | |
df <- repartition(df, col = df$"a") | |
df1 <- dapply( | |
df, | |
function(x) { | |
y <- (data.frame(x$a[1], mean(x$b))) | |
}, | |
schema) | |
# Number of partitions is equal to 2 | |
expect_equal(nrow(df1), 2) | |
}, | |
finally = { | |
# Resetting the conf back to default value | |
callJMethod(conf, "set", "spark.sql.shuffle.partitions", shufflepartitionsvalue) | |
}) | |
}) |
So I suspect it's good enough to go (except one minor comment #27359 (comment))
Signed-off-by: zero323 <[email protected]>
Seems like it. I guess there is something lost in translation here.
Especially what we consider "wrong" in this context, hence my elaborate response, which I think is still valid ‒ the overall test matrix is tiny compared to what is officially supported. Sadly, Arrow makes things worse, as
It would be great if we could either isolate that (I played a bit with Anaconda for that, and the results are quite promising, but I am not there yet, not to mention it won't help with all the tech debt caused by supporting long-outdated R and Python versions), or find a way to bundle it within Maven builds. For now, I have published zero323/sparkr-build-sandbox (docker.com/r/zero323/sparkr-build-sandbox) - not exactly what is needed, but it does the trick for now. Please check the updated description and let me know if that addresses your concerns.
Test build #117484 has finished for PR 27359 at commit
Sorry, but it doesn't address my concerns.
I asked you the following, and in the above I cannot see that content. You even completely hid the word
@HyukjinKwon, sorry, but you know that I'm not against @zero323 skipping some tests due to the difficulty. I fully understand that it's sometimes unfair to ask that of contributors. However, the author of a PR should correctly mention what they've done in the PR description.
I'm not sure why @zero323 adds many comments about this difficulty, but he didn't mention any of those difficulties in the PR description, which will become a permanent commit log. Could you advise him about that? I have already added the same comments repeatedly, so it's too much for me.
Or, @HyukjinKwon, you can merge this PR as is. It's up to you. I'll not be against your decision.
At this point, all tests did run across multiple additional R versions. This is documented in a persistent and verifiable way, and it is reproducible.
@zero323, mind adding a couple of words to describe
It's fine to describe what happened during the manual tests. No one can be flawless during such tests; it's better to describe them thoroughly for people who follow up later.
I was actually able to put all the dependencies together, including Arrow.
Indeed. I guess I should have skipped the explanations. Anyway, I am happy to close this and leave it for someone more articulate to continue. Everything should be in good shape here from a technical perspective, and the linked images should provide a good starting point for anyone willing to continue. Let's just see if the AppVeyor failure is a false positive ‒ it seems the last run failed without a clear indication why. Thanks for your work @HyukjinKwon @dongjoon-hyun, and sorry for any misunderstandings on my side.
Alright, then I am going to merge this. LGTM.
Test build #117498 has finished for PR 27359 at commit
Merged to master.
### What changes were proposed in this pull request?

- Update `testthat` to >= 2.0.0
- Replace `testthat:::run_tests` with `testthat:::test_package_dir`
- Add trivial assertions for tests without any expectations, to avoid skipping.
- Update related docs.

### Why are the changes needed?

`testthat` version has been frozen by [SPARK-22817](https://issues.apache.org/jira/browse/SPARK-22817) / apache#20003, but 1.0.2 is pretty old, and we shouldn't keep things in this state forever.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

- Existing CI pipeline:
  - Windows build on AppVeyor, R 3.6.2, testthat 2.3.1
  - Linux build on Jenkins, R 3.1.x, testthat 1.0.2
- Additional builds with testthat 2.3.1 using [sparkr-build-sandbox](https://github.com/zero323/sparkr-build-sandbox) on c7ed64a:

R 3.4.4 (image digest ec9032f8cf98)

```
docker pull zero323/sparkr-build-sandbox:3.4.4
docker run zero323/sparkr-build-sandbox:3.4.4 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

3.5.3 (image digest 0b1759ee4d1d)

```
docker pull zero323/sparkr-build-sandbox:3.5.3
docker run zero323/sparkr-build-sandbox:3.5.3 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

and 3.6.2 (image digest 6594c8ceb72f)

```
docker pull zero323/sparkr-build-sandbox:3.6.2
docker run zero323/sparkr-build-sandbox:3.6.2 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

Corresponding [asciicasts](https://asciinema.org/) are available as [10.5281/zenodo.3629431](https://doi.org/10.5281/zenodo.3629431) (a bit too large to burden asciinema.org, but they can be run locally via `asciinema play`).

----------------------------

Continued from apache#27328

Closes apache#27359 from zero323/SPARK-23435.

Authored-by: zero323 <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
@shaneknapp TL;DR: SparkR now supports both testthat 1.0.2 and 2.0.0. I will backport this to branch-2.4 at #27379 as well. So, as a reminder:
@HyukjinKwon will do. The upgrade won't happen until next week.
### What changes were proposed in this pull request?

This PR backports #27359:

- Update `testthat` to >= 2.0.0
- Replace `testthat:::run_tests` with `testthat:::test_package_dir`
- Add trivial assertions for tests without any expectations, to avoid skipping.
- Update related docs.

### Why are the changes needed?

`testthat` version has been frozen by [SPARK-22817](https://issues.apache.org/jira/browse/SPARK-22817) / #20003, but 1.0.2 is pretty old, and we shouldn't keep things in this state forever.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

- Existing CI pipeline:
  - Windows build on AppVeyor, R 3.6.2, testthat 2.3.1
  - Linux build on Jenkins, R 3.1.x, testthat 1.0.2
- Additional builds with testthat 2.3.1 using [sparkr-build-sandbox](https://github.com/zero323/sparkr-build-sandbox) on c7ed64a:

R 3.4.4 (image digest ec9032f8cf98)

```
docker pull zero323/sparkr-build-sandbox:3.4.4
docker run zero323/sparkr-build-sandbox:3.4.4 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

3.5.3 (image digest 0b1759ee4d1d)

```
docker pull zero323/sparkr-build-sandbox:3.5.3
docker run zero323/sparkr-build-sandbox:3.5.3 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

and 3.6.2 (image digest 6594c8ceb72f)

```
docker pull zero323/sparkr-build-sandbox:3.6.2
docker run zero323/sparkr-build-sandbox:3.6.2 zero323 --branch SPARK-23435 --commit c7ed64a --public-key https://keybase.io/zero323/pgp_keys.asc
```

Corresponding [asciicasts](https://asciinema.org/) are available as [10.5281/zenodo.3629431](https://doi.org/10.5281/zenodo.3629431) (a bit too large to burden asciinema.org, but they can be run locally via `asciinema play`).

----------------------------

Continued from #27328

Closes #27379 from HyukjinKwon/testthat-2.0.

Authored-by: zero323 <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
should be easy.
won't be easy. will be disruptive.
What changes were proposed in this pull request?

- Update `testthat` to >= 2.0.0
- Replace `testthat:::run_tests` with `testthat:::test_package_dir`
- Add trivial assertions for tests without any expectations, to avoid skipping.
- Update related docs.

Why are the changes needed?

`testthat` version has been frozen by SPARK-22817 / #20003, but 1.0.2 is pretty old, and we shouldn't keep things in this state forever.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

- Existing CI pipeline:
  - Windows build on AppVeyor, R 3.6.2, testthat 2.3.1
  - Linux build on Jenkins, R 3.1.x, testthat 1.0.2
- Additional builds with testthat 2.3.1 using sparkr-build-sandbox on c7ed64a:
  - R 3.4.4 (image digest ec9032f8cf98)
  - 3.5.3 (image digest 0b1759ee4d1d)
  - 3.6.2 (image digest 6594c8ceb72f)

Corresponding asciicasts are available as 10.5281/zenodo.3629431 (a bit too large to burden asciinema.org, but they can be run locally via `asciinema play`).

Continued from #27328