Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-35498][PYTHON] Add thread target wrapper API for pyspark pin thread mode #32644

Closed

Conversation

WeichenXu123
Copy link
Contributor

What changes were proposed in this pull request?

Add thread target wrapper API for pyspark pin thread mode.

Why are the changes needed?

A helper method which make user easier to write threading code under pin thread mode.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manual.

@SparkQA
Copy link

SparkQA commented May 24, 2021

Test build #138856 has finished for PR 32644 at commit af94332.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43378/

@HyukjinKwon
Copy link
Member

Shall we file a JIRA?

@SparkQA
Copy link

SparkQA commented May 24, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43378/

@WeichenXu123 WeichenXu123 changed the title [MINOR] Add thread target wrapper API for pyspark pin thread mode. [SPARK-35498] Add thread target wrapper API for pyspark pin thread mode. May 24, 2021
@HyukjinKwon HyukjinKwon changed the title [SPARK-35498] Add thread target wrapper API for pyspark pin thread mode. [SPARK-35498][PYTHON] Add thread target wrapper API for pyspark pin thread mode May 25, 2021
@HyukjinKwon
Copy link
Member

Merged to master.

HyukjinKwon added a commit that referenced this pull request Jun 20, 2021
…hen starting the thread, and use inheritable thread in the current codebase

### What changes were proposed in this pull request?

This PR is a followup of #32429 and #32644.
I was thinking about creating separate PRs but decided to include all in this PR because it shares the same context, and should be easier to review together.

This PR includes:
- Use `InheritableThread` and `inheritable_thread_target` in the current code base to prevent potential resource leak (since we enabled pinned thread mode by default now at #32429)
- Copy local properties when `start` at `InheritableThread` is called to mimic JVM behaviour. Previously it was copied when `InheritableThread` instance was created (related to #32644).
- #32429 missed one place at `inheritable_thread_target` (https://github.com/apache/spark/blob/master/python/pyspark/util.py#L308). More specifically, I missed one place that should enable pinned thread mode by default.

### Why are the changes needed?

To mimic the JVM behaviour about thread lifecycle

### Does this PR introduce _any_ user-facing change?

Ideally no. One possible case is that users use `InheritableThread` with pinned thread mode enabled.
In this case, the local properties will be copied when starting the thread instead of defining the `InheritableThread` object.
This is a small difference that wouldn't likely affect end users.

### How was this patch tested?

Existing tests should cover this.

Closes #32962 from HyukjinKwon/SPARK-35498-SPARK-35303.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants