[SPARK-13509][SPARK-13507][SQL] Support for writing CSV with a single function call #11389
Test build #52042 has finished for PR 11389 at commit
@@ -464,6 +464,12 @@ final class DataFrameWriter private[sql](df: DataFrame) {
 * format("parquet").save(path)
 * }}}
 *
 * You can set the following JSON-specific options for writing JSON files:
This looks like it's in the wrong place?
Test build #52047 has finished for PR 11389 at commit
@@ -453,6 +453,12 @@ final class DataFrameWriter private[sql](df: DataFrame) {
 * format("json").save(path)
 * }}}
 *
 * You can set the following JSON-specific options for writing JSON files:
 * <li>`compression` or `codec` (default `null`): compression codec to use when saving to file.
just say compression, and don't mention codec.
actually i'd remove codec support from the underlying source code, and only keep it for csv as an undocumented option for backward compatibility.
@rxin Actually, do you think we need the
It'd be great to fix in a future pr. for this one, let's also fix python?
@rxin Sure.
Test build #52151 has finished for PR 11389 at commit
LGTM pending tests
Test build #52158 has finished for PR 11389 at commit
retest this please
Test build #52154 has finished for PR 11389 at commit
Hm... It looks a bit weird. I was going to submit a hot-fix, but I found it actually works fine locally.
retest this please
@yhuai Do you have any clue about this occasional failure?
Test build #52160 has finished for PR 11389 at commit
As this passes sometimes (e.g. #11016), I will restart.
retest this please
Test build #52161 has finished for PR 11389 at commit
Test build #52163 has finished for PR 11389 at commit
Test build #52167 has finished for PR 11389 at commit
Test build #52175 has finished for PR 11389 at commit
Test build #52176 has finished for PR 11389 at commit
I see, it's a problem in the new vectorized reader. I missed the exception message. Looking deeper.
@rxin Anyway, would you merge this if it looks good?
Thanks - merging this in master.
… function call

https://issues.apache.org/jira/browse/SPARK-13507
https://issues.apache.org/jira/browse/SPARK-13509

## What changes were proposed in this pull request?
This PR adds support for writing CSV data directly to a given path with a single call. Several unit tests were added for each piece of functionality.

## How was this patch tested?
This was tested with unit tests and with `dev/run_tests` for coding style.

Author: hyukjinkwon <[email protected]>
Author: Hyukjin Kwon <[email protected]>

Closes apache#11389 from HyukjinKwon/SPARK-13507-13509.
https://issues.apache.org/jira/browse/SPARK-13507
https://issues.apache.org/jira/browse/SPARK-13509
What changes were proposed in this pull request?
This PR adds support for writing CSV data directly to a given path with a single call.
Several unit tests were added for each piece of functionality.
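For illustration, here is a minimal sketch of what "a single call" looks like from the caller's side, next to the generic `format(...).save(...)` form seen in the diff context above. It assumes the Spark 2.x `SparkSession` entry point (newer than this PR), a local master, and hypothetical output paths under `/tmp`. The sample DataFrame and the `gzip` value for the `compression` option are also assumptions made for the example, not taken from the patch.

```scala
import org.apache.spark.sql.SparkSession

object CsvWriteSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x-style entry point, running locally.
    val spark = SparkSession.builder()
      .appName("csv-write-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data for the example.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "name")

    // Generic form: name the format, then save.
    df.write.mode("overwrite").format("csv").save("/tmp/csv-out-generic")

    // Single-call form that this PR describes.
    df.write.mode("overwrite").csv("/tmp/csv-out-direct")

    // Same call with the `compression` option discussed in the review
    // ("gzip" is an assumed example value).
    df.write.mode("overwrite").option("compression", "gzip").csv("/tmp/csv-out-gzip")

    spark.stop()
  }
}
```

`mode("overwrite")` is used only so the sketch can be re-run without failing on an existing output directory.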
How was this patch tested?
This was tested with unit tests and with `dev/run_tests` for coding style.