-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-8603][SPARKR] Use shell() instead of system2() for SparkR on Windows #13165
Conversation
Please let me cc @sun-rui and @JoshRosen |
Could you add a test case for Windows specific test? @shivaram |
Test build #58758 has finished for PR 13165 at commit
|
Test build #58760 has finished for PR 13165 at commit
|
# of the operating system. So, this should be manaully changed. | ||
determinefileSeperator <- function() { | ||
if (.Platform$OS.type == "windows") { | ||
fileSeperator <- "\\" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: separator not seperator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much.
Test build #58763 has finished for PR 13165 at commit
|
Does this apply to other cases: spark/R/pkg/inst/worker/daemon.R Line 22 in d6dc12e
spark/R/pkg/inst/profile/shell.R Line 20 in d6dc12e
spark/examples/src/main/r/dataframe.R Line 37 in 6ab4d9e
(last one is an example) |
@sun-rui @felixcheung Let me try to build and run all tests for R first in Windows and then will try to correct and add each test one by one. This will take a bit of time and I might have to ask a lot of questions (I am not familiar with R and Windows as I said in the stale PR above, #7025) but anyway I will try. Meanwhile, If anybody wants to take over this, please do so. |
@felixcheung, this issue seems to relate to system2() only. However, let's wait for HyukjinKwon's test result. @HyukjinKwon, great, go ahead please. |
@sun-rui @felixcheung Right. It seems finally I made it. I made gists and uploaded a PDF file of Spark UI to my google drive. Let me tell you the test results first. The test was proceeded against master branch. Here is the stdout output for the tests on Windwos 7 32bit, output.msg. Here are the steps I took:
|
This raises some questions to me.
|
|
I checked the source code of system2. Since a batch file with .cmd extension is to be launched on Windows, I think it is more formal to use shell() instead of system2(). Refer to http://www.astrostatistics.psu.edu/datasets/R/html/base/html/shell.html. The proposed fix is like follows:
|
@sun-rui Thank you so much. I will try to add a test. |
Related to my comment on #13217 -- I will test this out on windows using the new instructions. |
@shivaram Thank you so much. Let me try to add a test meanwhile. |
@shivaram and @sun-rui I addressed the comment, added a test for this PR and fixed some tests not working on Windows due to path separator. This PR also fixes the test failures below.
Here is full stdout of my testing, output.msg on Windows. |
Test build #59275 has finished for PR 13165 at commit
|
Test build #59276 has finished for PR 13165 at commit
|
@@ -60,6 +60,15 @@ generateSparkSubmitArgs <- function(args, sparkHome, jars, sparkSubmitOpts, pack | |||
combinedArgs | |||
} | |||
|
|||
determineLauncher <- function(sparkSubmitBin, combinedArgs, capture = FALSE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be a common utility function and be re-used in test_includeJAR.R?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will.
BTW, could you inform me if I can disable style checking for some blocks in R like Scala? It seems it fails due to the usages of shell
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use nolint
- See context.R
for an example. BTW what is the lint error here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivaram For some reasons, it seems there is no mothod shell()
on Linux (I tested this on my Mac) but on Windows. I am using this twice within this PR. Here is a thread for this, http://r.789695.n4.nabble.com/Shell-Function-not-on-Linux-td4672569.html.
I could not find the official documentation though. Let me address the comments first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the common function could be like:
launchScript(script) {
if (windowsOS) use shell()
else (system2()
}
This makes it more general, not binding to launching SparkSubmit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sun-rui Thank you so much. I was trying to come up with a good name.
Test build #59339 has finished for PR 13165 at commit
|
res <- system2(command = submitPath, | ||
args = c(jarPath, scriptPath), | ||
stdout = TRUE) | ||
if (.Platform$OS.type == "windows") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can call determineSparkSubmitBin() here
Test build #59363 has finished for PR 13165 at commit
|
LGTM |
could you rebase it to latest master. It is hopeful other test failures will gone |
Sure, Thanks! |
Test build #59367 has finished for PR 13165 at commit
|
Test build #59383 has finished for PR 13165 at commit
|
@HyukjinKwon Thanks for the update. I verified this today using a Windows VM and followed the instructions you have written up as well. All the tests passed and the stdout I got is at https://gist.github.com/shivaram/bd3c8bf7466d6cd515f4877f2a9611b3 One minor thing - there is a typo in the last line of https://github.com/apache/spark/blob/branch-2.0/R/WINDOWS.md Could you make that change as a part of this PR ? |
@shivaram Thank you so much! I will keep this machine as it is anyway. Please cc me if there are some PRs dealing with Windows in the future. I can run tests and leave some comments. |
Test build #59439 has finished for PR 13165 at commit
|
Merging this to master and branch-2.0 |
…indows ## What changes were proposed in this pull request? This PR corrects SparkR to use `shell()` instead of `system2()` on Windows. Using `system2(...)` on Windows does not process windows file separator `\`. `shell(tralsate = TRUE, ...)` can treat this problem. So, this was changed to be chosen according to OS. Existing tests were failed on Windows due to this problem. For example, those were failed. ``` 8. Failure: sparkJars tag in SparkContext (test_includeJAR.R#34) 9. Failure: sparkJars tag in SparkContext (test_includeJAR.R#36) ``` The cases above were due to using of `system2`. In addition, this PR also fixes some tests failed on Windows. ``` 5. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#128) 6. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#131) 7. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#134) ``` The cases above were due to a weird behaviour of `normalizePath()`. On Linux, if the path does not exist, it just prints out the input but it prints out including the current path on Windows. ```r # On Linus path <- normalizePath("aa") print(path) [1] "aa" # On Windows path <- normalizePath("aa") print(path) [1] "C:\\Users\\aa" ``` ## How was this patch tested? Jenkins tests and manually tested in a Window machine as below: Here is the [stdout](https://gist.github.com/HyukjinKwon/4bf35184f3a30f3bce987a58ec2bbbab) of testing. Closes #7025 Author: hyukjinkwon <[email protected]> Author: Hyukjin Kwon <[email protected]> Author: Prakash PC <[email protected]> Closes #13165 from HyukjinKwon/pr/7025. (cherry picked from commit 1c40373) Signed-off-by: Shivaram Venkataraman <[email protected]>
…ting on Windows (currently SparkR only) ## What changes were proposed in this pull request? This PR adds the build automation on Windows with [AppVeyor](https://www.appveyor.com/) CI tool. Currently, this only runs the tests for SparkR as we have been having some issues with testing Windows-specific PRs (e.g. apache#14743 and apache#13165) and hard time to verify this. One concern is, this build is dependent on [steveloughran/winutils](https://github.com/steveloughran/winutils) for pre-built Hadoop bin package (who is a Hadoop PMC member). ## How was this patch tested? Manually, https://ci.appveyor.com/project/HyukjinKwon/spark/build/88-SPARK-17200-build-profile This takes roughly 40 mins. Some tests are already being failed and this was found in apache#14743 (comment). Author: hyukjinkwon <[email protected]> Closes apache#14859 from HyukjinKwon/SPARK-17200-build.
What changes were proposed in this pull request?
This PR corrects SparkR to use
shell()
instead ofsystem2()
on Windows.Using
system2(...)
on Windows does not process windows file separator\
.shell(tralsate = TRUE, ...)
can treat this problem. So, this was changed to be chosen according to OS.Existing tests were failed on Windows due to this problem. For example, those were failed.
The cases above were due to using of
system2
.In addition, this PR also fixes some tests failed on Windows.
The cases above were due to a weird behaviour of
normalizePath()
. On Linux, if the path does not exist, it just prints out the input but it prints out including the current path on Windows.How was this patch tested?
Jenkins tests and manually tested in a Window machine as below:
Here is the stdout of testing.
Closes #7025