[SPARK-19324][SPARKR] Spark JVM stdout output is getting dropped in SparkR #16670
Closed
Can we add a similar test with something getting printed on `stdout` from the JVM?
We could, but unfortunately we don't actually call launchScript with wait/capture = TRUE.
We call it with wait/capture = FALSE, expect console/stdout output to leak through, and return NULL.
I'll try to add a test for that.
Hmm, I've tried, and I don't think it would work.
When calling `system2(.., wait = FALSE, capture = "")`, the output to stdout actually comes from the child process, so I don't think we would be able to see it from the R process.
We could redirect it, but then it would be the same as `system2(..., wait = FALSE, capture = TRUE)`, which again wouldn't be what we normally call.
I think we would need to dig deeper on this.
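The difference described above can be reproduced outside SparkR with a small sketch. It assumes a Unix-like shell and uses `Rscript` as a stand-in child process (SparkR itself launches the JVM through its own script); it also uses `system2`'s documented `stdout` argument rather than the `capture` shorthand from the comment. With `stdout = ""` the child writes directly to the inherited stdout, which R-level capture via `capture.output()` is not expected to see; with `stdout = TRUE` R reads the output and returns it.

```r
# Stand-in child process: the Rscript binary of the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")
# The child prints a single line to its own stdout.
child_args <- c("--vanilla", "-e", shQuote('cat("hello from child\\n")'))

# stdout = "": the child writes straight to the inherited stdout.
# capture.output() only diverts R-level output (through sink), so it is
# not expected to pick up anything the child prints.
seen_by_r <- capture.output(
  status <- system2(rscript, child_args, stdout = "", wait = TRUE)
)

# stdout = TRUE: R reads the child's stdout and returns it as a
# character vector (R always waits for the child in this mode).
captured <- system2(rscript, child_args, stdout = TRUE)

print(length(seen_by_r))
print(captured)
```

This is why a test that inspects R-side output cannot easily assert on what the child prints in the non-capturing mode.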
I see. I expected `stdout = ""` to pipe it to the R process's stdout. We could also explicitly pass an `fd` to do this pipe?
From what I observed running SparkR as a package, I'm not sure we should always pipe/redirect stdout; it could get very noisy when running SparkR from an IDE. On the other hand, oftentimes the result (error) is not enough to debug the issue.
I'd propose we don't redirect stdout by default, to keep the IDE experience cleaner; however, we should have an API to turn this on and off programmatically on demand. It is not clear that `system2` supports that, though: `stdout` is either TRUE (capture and return as a character vector), NULL/FALSE (drop), "" (to the console of that child process), or "name" (the name of a file to write into).
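As a rough illustration of the "on demand" idea, the sketch below maps a mode name onto the four `stdout` values listed above. `run_child` and its `mode` and `log_file` parameters are hypothetical names for this example only and are not part of SparkR; `Rscript` again stands in for the JVM launcher, and a Unix-like shell is assumed.

```r
# Hypothetical helper (not part of SparkR): choose where a child process's
# stdout goes by mapping a mode name onto system2()'s `stdout` argument.
run_child <- function(command, args = character(),
                      mode = c("console", "capture", "drop", "file"),
                      log_file = tempfile(fileext = ".log")) {
  mode <- match.arg(mode)
  stdout_arg <- switch(mode,
    console = "",        # child's own console/stdout
    capture = TRUE,      # returned as a character vector
    drop    = FALSE,     # discarded
    file    = log_file   # written to the named file
  )
  result <- system2(command, args, stdout = stdout_arg)
  list(mode = mode, result = result,
       log_file = if (mode == "file") log_file else NULL)
}

# Stand-in child process that prints one line to stdout.
rscript <- file.path(R.home("bin"), "Rscript")
child_args <- c("--vanilla", "-e", shQuote('cat("hello from child\\n")'))

captured <- run_child(rscript, child_args, mode = "capture")
logged <- run_child(rscript, child_args, mode = "file")
print(captured$result)
print(readLines(logged$log_file))
```

Note that the destination is fixed when the child is launched, so a wrapper like this does not by itself allow switching redirection for a process that is already running.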
This is an example of what users see in an R IDE:
Is this what they see before this change or after it?
This is from #16670 (comment); it illustrates the point that "oftentimes the result (error) is not enough to debug the issue."
Our options are:
I think the second option is fine, i.e. not redirecting it by default but happening to share stdout with the R process. That way, if the R IDE has some way to save or view logs from R's stdout, users can use that. (Does RStudio have something like this?)
The API to redirect on demand might be useful (it would be something like setLogLevel?), but I'm not sure we can change it for an already running JVM.
Anyway, let me review this PR one more time. I think we can discuss the new API in a separate JIRA.