[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.5 #35907

Closed

HyukjinKwon
Member

What changes were proposed in this pull request?

This PR retries #35871, this time bumping the Py4J version to 0.10.9.5.
#35871 was reverted because it broke Python 3.10, which Py4J did not officially support at the time.

Py4J 0.10.9.5 fixes the issue (py4j/py4j#475) and officially adds Python 3.10 support, with CI set up for it (py4j/py4j#477).

Why are the changes needed?

See #35871

Does this PR introduce any user-facing change?

See #35871

How was this patch tested?

Py4J now runs CI on Python 3.10, and I manually tested PySpark on Python 3.10 with this patch:

./bin/pyspark

import py4j
py4j.__version__
spark.range(10).show()

Using Python version 3.10.0 (default, Mar  3 2022 03:57:21)
Spark context Web UI available at http://172.30.5.50:4040
Spark context available as 'sc' (master = local[*], app id = local-1647571387534).
SparkSession available as 'spark'.
>>> import py4j
>>> py4j.__version__
'0.10.9.5'
>>> spark.range(10).show()
+---+
| id|
+---+
...
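
The manual check above can also be scripted. The following is only a minimal sketch, not part of this patch: `parse_version`, `officially_supports_python_310`, and `PY310_MIN_PY4J` are hypothetical names introduced for illustration, and the parser assumes a purely numeric dotted version string such as the `'0.10.9.5'` shown in the session.

```python
# Hypothetical helper, not part of this patch: checks whether a Py4J version
# is at least 0.10.9.5, the release this PR upgrades to.


def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '0.10.9.5' into a tuple of ints.

    Assumes every component is a plain integer (no 'rc' or 'dev' suffix).
    """
    return tuple(int(part) for part in version.split("."))


# First Py4J release that officially supports Python 3.10, per this PR.
PY310_MIN_PY4J = (0, 10, 9, 5)


def officially_supports_python_310(py4j_version: str) -> bool:
    """Return True if the given Py4J version is 0.10.9.5 or newer."""
    return parse_version(py4j_version) >= PY310_MIN_PY4J


if __name__ == "__main__":
    try:
        import py4j  # only available where PySpark/Py4J is installed
    except ImportError:
        print("py4j is not installed in this environment")
    else:
        print(py4j.__version__, officially_supports_python_310(py4j.__version__))
```

A `False` result only means the version predates official Python 3.10 support in Py4J; it does not by itself prove that PySpark is broken on that interpreter.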

@HyukjinKwon
Member Author

cc @dongjoon-hyun @wangyum FYI

Member

@dongjoon-hyun dongjoon-hyun left a comment

Actually, we cannot use the same JIRA ID because branch-3.2 has Py4J 0.10.9.4 already with SPARK-38563. Could you use a new JIRA ID for Py4J 0.10.9.5? You can still land it to branch-3.2 too.

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 18, 2022

SPARK-38563 solved the resource leakage in branch-3.2, and the new JIRA adds Python 3.10 support on top of it.

@HyukjinKwon
Member Author

Oh no. That's not released yet. I reverted it from branch-3.2 too.

@HyukjinKwon
Member Author

HyukjinKwon commented Mar 18, 2022

BTW, Python 3.10 already works with Spark 3.2 too - Py4J upgrade broke that (unofficial) support.

@HyukjinKwon
Member Author

Technically what you said is correct, because branch-3.2 only officially supports Python up to 3.9. But, if you don't mind, I will port this to branch-3.2 as well, just to reduce the breakage (although that support is not official). The risk here is very small.

@dongjoon-hyun
Member

Oh, got it. If you reverted cleanly from all branches, there is no problem.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@HyukjinKwon
Member Author

Merged to master, branch-3.3 and branch-3.2.

HyukjinKwon added a commit that referenced this pull request Mar 18, 2022
Closes #35907 from HyukjinKwon/SPARK-38563-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 97335ea)
Signed-off-by: Hyukjin Kwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Mar 18, 2022
kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
@HyukjinKwon HyukjinKwon deleted the SPARK-38563-followup branch January 15, 2024 00:50