[SPARK-3114] [PySpark] bugfix: disable compression of command #2026
Compressed commands break Python UDFs.
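A minimal sketch of how this kind of breakage can arise, assuming the writer compresses the pickled command while one reader still expects a plain pickle. The helpers `serialize_command` and `load_command_plain` are hypothetical names for illustration, not PySpark functions, and the exact failure path inside Spark SQL is not shown here.

```python
import pickle
import zlib


def serialize_command(payload, compress):
    """Pickle a command payload, optionally zlib-compressing the bytes."""
    data = pickle.dumps(payload, protocol=2)
    return zlib.compress(data) if compress else data


def load_command_plain(data):
    """Hypothetical reader that expects an uncompressed pickle."""
    return pickle.loads(data)


payload = {"name": "my_udf", "args": [1, 2, 3]}

# Writer and reader agree on the format: the round trip succeeds.
plain_ok = load_command_plain(serialize_command(payload, compress=False)) == payload

# Writer compresses but the reader does not decompress: unpickling fails.
try:
    load_command_plain(serialize_command(payload, compress=True))
    mismatch_failed = False
except Exception:
    mismatch_failed = True

# A reader that decompresses first recovers the payload.
recovered = pickle.loads(zlib.decompress(serialize_command(payload, compress=True)))
```

Either side can be changed to restore agreement: stop compressing on the writer, or decompress on every reader.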
QA tests have started for PR 2026 at commit
QA tests have finished for PR 2026 at commit
Do you think it's better to disable compression of the command or to update the Spark SQL code to use the compressing serializer? The latter might buy us a small benefit if we expect task closures to be large; on the other hand, maybe the compression wouldn't be worth it for small closures (will it add a lot of time overhead)?
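To make the size side of that trade-off concrete, here is a rough sketch comparing pickled and zlib-compressed sizes for a tiny payload and a large, repetitive one. The payloads and the `compressed_sizes` helper are made up for illustration; real task closures may behave differently, and this does not measure time overhead.

```python
import pickle
import zlib


def compressed_sizes(obj):
    """Return (pickled size, zlib-compressed pickled size) in bytes."""
    raw = pickle.dumps(obj, protocol=2)
    return len(raw), len(zlib.compress(raw, 1))


# A tiny stand-in for a small closure.
small_raw, small_packed = compressed_sizes(("f", 1))

# A large, repetitive stand-in for a closure that captures a lot of data.
large_obj = ["row-%d" % (i % 10) for i in range(10000)]
large_raw, large_packed = compressed_sizes(large_obj)

print("small:", small_raw, "->", small_packed)
print("large:", large_raw, "->", large_packed)
```

For the tiny payload the zlib header and checksum outweigh any savings, so the compressed form is larger; for the repetitive payload compression shrinks it substantially.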
Opened a new PR that includes this fix, plus a commit to re-enable the sql.py tests.
Some test cases in Scala need to be fixed before compression can be enabled. We can do that later.
This fixes SPARK-3114, an issue where we inadvertently broke Python UDFs in Spark SQL.

This PR modifies the test runner script to always run the PySpark SQL tests, irrespective of whether Spark SQL itself has been modified. It also includes Davies' fix for the bug.

Closes #2026.

Author: Josh Rosen <[email protected]>
Author: Davies Liu <[email protected]>

Closes #2027 from JoshRosen/pyspark-sql-fix and squashes the following commits:

9af2708 [Davies Liu] bugfix: disable compression of command
0d8d3a4 [Josh Rosen] Always run Python Spark SQL tests.

(cherry picked from commit 1f1819b)
Signed-off-by: Josh Rosen <[email protected]>