[SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression #5570

Closed
wants to merge 6 commits into from

@davies
Contributor

davies commented Apr 18, 2015

This PR enables auto_convert in JavaGateway so that we can register a converter for a given type, for example date and datetime.

There are two bugs related to auto_convert (see [1] and [2]); this PR works around them.

[1] py4j/py4j#160
[2] py4j/py4j#161

cc @rxin @JoshRosen
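
To illustrate the converter idea described above, here is a minimal, self-contained sketch. It is not the code from this PR. The class names `DateConverter` and `DatetimeConverter` come from the SparkQA reports in this thread, but their bodies, the `register_converter` and `auto_convert` helpers, and the tuple return values are assumptions made so the example runs without a JVM. In py4j itself, converters with this `can_convert` / `convert` shape are, to my knowledge, registered through `py4j.protocol.register_input_converter`, and `convert` would build a real Java object through the gateway client.

```python
import datetime
import time


class DatetimeConverter(object):
    """Illustrative stand-in for a converter targeting java.sql.Timestamp."""

    def can_convert(self, obj):
        return isinstance(obj, datetime.datetime)

    def convert(self, obj, gateway_client=None):
        # Epoch milliseconds based on local time. A real converter would pass
        # this value to a Java constructor via gateway_client (assumption).
        millis = int(time.mktime(obj.timetuple())) * 1000 + obj.microsecond // 1000
        return ("java.sql.Timestamp", millis)


class DateConverter(object):
    """Illustrative stand-in for a converter targeting java.sql.Date."""

    def can_convert(self, obj):
        return isinstance(obj, datetime.date)

    def convert(self, obj, gateway_client=None):
        # ISO date string, the format java.sql.Date.valueOf accepts.
        return ("java.sql.Date", obj.strftime("%Y-%m-%d"))


# Hypothetical local registry that mimics first-match dispatch.
_converters = []


def register_converter(converter):
    _converters.append(converter)


def auto_convert(obj):
    """Return the first matching converter's result, or obj unchanged."""
    for converter in _converters:
        if converter.can_convert(obj):
            return converter.convert(obj)
    return obj


# datetime.datetime is a subclass of datetime.date, so the more specific
# converter has to be registered first or it would never be reached.
register_converter(DatetimeConverter())
register_converter(DateConverter())
```

The registration order matters in this sketch because the registry stops at the first converter whose `can_convert` returns true. Whether py4j dispatches in exactly the same way is not confirmed by this thread.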

@@ -2267,6 +2267,8 @@ def _prepare_for_python_RDD(sc, command, obj=None):
# The broadcast will have same life cycle as created PythonRDD
broadcast = sc.broadcast(pickled_command)
pickled_command = ser.dumps(broadcast)
# There is a bug in py4j.java_gateway.JavaClass with auto_convert
Contributor

can you document what bug it is?

Contributor Author

Added a link here.

@rxin
Contributor

rxin commented Apr 18, 2015

@JoshRosen can you take a look at this? I don't really know the py4j stuff.

@SparkQA

SparkQA commented Apr 18, 2015

Test build #30514 has finished for PR 5570 at commit 3c373f3.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DateConverter(object):
    • class DatetimeConverter(object):
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 18, 2015

Test build #30515 has finished for PR 5570 at commit ceb3779.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DateConverter(object):
    • class DatetimeConverter(object):
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 18, 2015

Test build #30525 has finished for PR 5570 at commit 2e7566d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DateConverter(object):
    • class DatetimeConverter(object):
  • This patch does not change any dependencies.

@rxin
Contributor

rxin commented Apr 19, 2015

Looks like the change broke something in MLlib.

@SparkQA

SparkQA commented Apr 19, 2015

Test build #30541 has finished for PR 5570 at commit d17d634.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DateConverter(object):
    • class DatetimeConverter(object):
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 19, 2015

Test build #30545 has finished for PR 5570 at commit eb4fa53.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class DateConverter(object):
    • class DatetimeConverter(object):
  • This patch does not change any dependencies.

@rxin
Contributor

rxin commented Apr 21, 2015

Thanks. I'm going to merge this in master.

@asfgit asfgit closed this in ab9128f Apr 21, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
This PR enables auto_convert in JavaGateway so that we can register a converter for a given type, for example date and datetime.

There are two bugs related to auto_convert (see [1] and [2]); this PR works around them.

[1] py4j/py4j#160
[2] py4j/py4j#161

cc rxin JoshRosen

Author: Davies Liu

Closes apache#5570 from davies/py4j_date and squashes the following commits:

eb4fa53 [Davies Liu] fix tests in python 3
d17d634 [Davies Liu] rollback changes in mllib
2e7566d [Davies Liu] convert tuple into ArrayList
ceb3779 [Davies Liu] Update rdd.py
3c373f3 [Davies Liu] support date and datetime by auto_convert
cb094ff [Davies Liu] enable auto convert