Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Distributed] make finalizer messages threadsafe #42240

Merged
merged 2 commits into from
Sep 23, 2021

Conversation

vchuravy
Copy link
Member

Partial revert of #41722

Fixes #42126

@Sacha0
Copy link
Member

Sacha0 commented Sep 13, 2021

Tentatively marking backport-1.7, given the issue this patch targets has been observed on 1.7 previously.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CI, this hangs

@vtjnash
Copy link
Member

vtjnash commented Sep 14, 2021

Looks like this would be vulnerable to a lock priority inversion deadlock situation

@vchuravy vchuravy force-pushed the vc/distributed_ts_again branch from 5f562fd to f8b7064 Compare September 15, 2021 17:02
@vchuravy vchuravy dismissed vtjnash’s stale review September 16, 2021 02:01

Addressed review

@tkf
Copy link
Member

tkf commented Sep 16, 2021

Would using a lock-free queue, stack, or "channel" for the messages help here? It sounds nice to not block and do minimal work inside a finalizer. (You can always schedule a task and so maybe this is not the issue/solution though.)

@vchuravy
Copy link
Member Author

Would using a lock-free queue, stack, or "channel" for the messages help here? It sounds nice to not block and do minimal work inside a finalizer. (You can always schedule a task and so maybe this is not the issue/solution though.)

Yeah that would be a good optimization, but I don't think we have that in base right now?

But even with a lock-free queue, notifying the consumer is a problem.

@tkf
Copy link
Member

tkf commented Sep 16, 2021

For a queue with notification, we can use a lock-free unbounded "channel" (aka dual queue). It's possible to implement lock-free(-ish) push! even when popfirst! can wait. One catch is that push! needs to be able to run schedule 1 but it's already usable in the finalizer. I think I can create a PR for adding the queue in this: Scherer & Scott (2004) Nonblocking Concurrent Data Structures with Condition Synchronization https://link.springer.com/chapter/10.1007/978-3-540-30186-8_13

Footnotes

  1. It's also conceivable to use lock-free(-ish) queues inside the Julia runtime. It'd then make push! "more" lock-free.

@vchuravy vchuravy merged commit eb1d6b3 into master Sep 23, 2021
@vchuravy vchuravy deleted the vc/distributed_ts_again branch September 23, 2021 20:22
@vchuravy
Copy link
Member Author

@KristofferC I would like to let this bake on master for a while before backporting. Should I remove the tag until a week or two from now?

@KristofferC
Copy link
Member

Sounds good.

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
@iamed2
Copy link
Contributor

iamed2 commented Jun 22, 2022

Is this backportable to 1.6? We are experiencing #42126 (with the old error type) there.

vchuravy added a commit to JuliaLang/Distributed.jl that referenced this pull request Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Julia's standard library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Distributed] error in running finalizer: ConcurrencyViolationError(msg="lock must be held") (any_gc_flag)
6 participants