[SPARK-3988][SQL] add public API for date type #2901
QA tests have started for PR 2901 at commit
QA tests have finished for PR 2901 at commit
Test FAILed.
QA tests have started for PR 2901 at commit
@@ -76,6 +79,7 @@ public void constructSimpleRow() {
new Boolean(booleanValue),
stringValue, // StringType
binaryValue, // BinaryType
dateValue, // DateType
Nit: indentation is off
LGTM, thanks!
QA tests have finished for PR 2901 at commit
Test FAILed.
QA tests have started for PR 2901 at commit
QA tests have started for PR 2901 at commit
QA tests have finished for PR 2901 at commit
Test PASSed.
QA tests have finished for PR 2901 at commit
Test PASSed.
@@ -1065,7 +1074,9 @@ def applySchema(self, rdd, schema):
[Row(field1=1, field2=u'row1'),..., Row(field1=3, field2=u'row3')]

>>> from datetime import datetime
>>> from datetime import date
minor: these two lines can be combined.
Thanks for fixing so many typos! It would be great to recognize all Date/Timestamp values in JsonRDD. If that's not easy to do in this PR, we could do it in another one.
... lambda x: (x.byte1, x.byte2, x.short1, x.short2, x.int, x.float, x.date,
... x.time, x.map["a"], x.struct.b, x.list, x.null))
>>> results.collect()[0] # doctest: +NORMALIZE_WHITESPACE
(127, -128, -32768, 32767, 2147483647, 1.0, datetime.datetime(2010, 1, 1, 0, 0),
@davies Because we use Pyrolite, java.sql.Date is serialized the same way as java.sql.Timestamp, since both are subtypes of java.util.Date. This makes the dumps() function generate datetime instead of date for java.util.Date. I think this is related to your comments in JIRA SPARK-2674.
Oh, I see. After the data is deserialized in Python, we need to do some data conversions, so we can convert datetime to date if the DataType is DateType.
So should the conversion happen on the Python side or the Scala side? Which one would you prefer?
I think we can only do it on the Python side.
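For illustration, a minimal sketch of what a Python-side conversion could look like, assuming the deserialized value arrives as a datetime.datetime and the schema says the column is a date. The names DATE_TYPE, TIMESTAMP_TYPE, convert_value, and convert_row are hypothetical stand-ins and are not taken from this PR or from PySpark.

```python
from datetime import date, datetime

# Hypothetical stand-ins for the schema's type names; not the PySpark classes.
DATE_TYPE = "DateType"
TIMESTAMP_TYPE = "TimestampType"


def convert_value(value, type_name):
    """Narrow a datetime to a date when the declared type is a date."""
    if type_name == DATE_TYPE and isinstance(value, datetime):
        return value.date()
    # Every other value is passed through unchanged.
    return value


def convert_row(row, type_names):
    """Apply convert_value to each field of a row, pairing it with its type."""
    return tuple(convert_value(v, t) for v, t in zip(row, type_names))


# Both fields come back from deserialization as datetime objects.
row = (datetime(2010, 1, 1, 0, 0), datetime(2010, 1, 1, 1, 1, 1))
converted = convert_row(row, [DATE_TYPE, TIMESTAMP_TYPE])
print(converted)
```

Only the field declared as a date is narrowed; the timestamp field keeps its time component.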
Test build #22133 has started for PR 2901 at commit
Test build #22133 has finished for PR 2901 at commit
Test PASSed.
Test build #440 has started for PR 2901 at commit
Test build #440 has finished for PR 2901 at commit
retest this please.
Test build #22215 has started for PR 2901 at commit
Test build #22215 has finished for PR 2901 at commit
Test PASSed.
@davies does anything else remain to be done here?
@marmbrus There is a bug: a DateType object will be a datetime in Python. We could also improve DateType/TimestampType support in jsonRDD, but that could be done separately. @adrian-wang You could convert
@adrian-wang You can do it like this:
@davies
Test build #22274 has started for PR 2901 at commit
LGTM, waiting for the tests.
Test build #22274 has finished for PR 2901 at commit
Test PASSed.
Thanks! Merged to master.
This implements the feature davies mentioned in #2901 (diff) Author: Daoyuan Wang <[email protected]> Closes #3012 from adrian-wang/iso8601 and squashes the following commits: 50df6e7 [Daoyuan Wang] json data timestamp ISO8601 support
This implements the feature davies mentioned in #2901 (diff) Author: Daoyuan Wang <[email protected]> Closes #3012 from adrian-wang/iso8601 and squashes the following commits: 50df6e7 [Daoyuan Wang] json data timestamp ISO8601 support (cherry picked from commit a1fc059) Signed-off-by: Michael Armbrust <[email protected]>
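As a rough illustration of the idea behind recognizing date and timestamp strings in JSON data, the sketch below tries two common ISO 8601 layouts and falls back to the original string. It is an assumption-laden Python example, not the code from #3012: the function name parse_json_temporal is hypothetical, and only the plain YYYY-MM-DD and YYYY-MM-DDTHH:MM:SS forms are handled (no time zones or fractional seconds).

```python
from datetime import date, datetime

# (format, narrow_to_date) pairs, tried in order. Hypothetical and incomplete:
# real ISO 8601 allows many more layouts than these two.
_FORMATS = (
    ("%Y-%m-%d", True),
    ("%Y-%m-%dT%H:%M:%S", False),
)


def parse_json_temporal(text):
    """Return a date or datetime if text matches a known layout, else text."""
    for fmt, narrow_to_date in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # Layout did not match; try the next one.
        return parsed.date() if narrow_to_date else parsed
    return text


print(parse_json_temporal("2014-10-15"))
print(parse_json_temporal("2014-10-15T12:30:45"))
print(parse_json_temporal("row1"))
```

Strings that match neither layout are returned untouched, so ordinary string fields are not affected.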
Add JSON and Python API for date type.
By using Pickle, java.sql.Date was serialized as a calendar and recognized in Python as datetime.datetime.