Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify the behavior of wgpuInstanceWaitAny called from a callback #360

Closed
eliemichel opened this issue Oct 1, 2024 · 7 comments
Closed
Assignees
Labels
threading Thread safety and WASM-JS threading problems

Comments

@eliemichel
Copy link
Collaborator

eliemichel commented Oct 1, 2024

Question about WGPUFutures: Is it legal to call wgpuInstanceWaitAny from within a callback, and if so, what completion status is reported for the WGPUFuture of the callback we are currently in?

(EDIT: parent issue #199)


FTR, replies got on matrix:

From @Kangz
IMHO it should work, but it might be difficult to implement

From @kainino0x
I think it should work, it's implementable in theory but YMMV with actual implementations
The (now outdated, but mostly correct) design doc says that the completed field on your first WaitAny should have already been set before the callback is called. However at least at that time, I left the answer to your question undefined: https://docs.google.com/document/d/1qJRTJRY318ZEqhK6ryw4P-91FumYQfOnNP6LpANYy6A/edit#bookmark=id.8zqwnqgij106
Until recently Dawn would actually assert-crash in debug mode if you did that, but we took it out

@eliemichel eliemichel added the needs docs Non-trivial API contract needs to be documented. Orthogonal to open vs closed; remove when doc'd. label Oct 1, 2024
@kainino0x
Copy link
Collaborator

kainino0x commented Nov 14, 2024

Thinking on this further, I think it's necessary to make this safe. Sure, we could say it's unsafe to WaitAny(f1) during f1's own callback inside another WaitAny(f1).

  • But what if you WaitAny(f1) while f1's callback is being called by ProcessEvents? If that's unsafe too, it becomes hazardous to use WaitAny on any AllowProcessEvents-mode future in a multithreaded program.

    • What if you WaitAny(f1) while f1's callback is being called spontaneously? You effectively can't WaitAny on any Spontaneous-mode future at all, even in a single-threaded application, because the implementation can be using background threads. (Also, the application can't just add its own locking in the callback - the critical section is larger than the whole callback.)

If those things have to be safe then nested WaitAny may as well have to be re-entrancy-safe too (for both single-threaded nesting and for multithreading).

@kainino0x kainino0x added the threading Thread safety and WASM-JS threading problems label Nov 14, 2024
@kainino0x
Copy link
Collaborator

If those things have to be safe

OTOH they don't necessarily have to be safe. There just would be no way to do them safely.

@kainino0x
Copy link
Collaborator

I formalized the less safe version in #416. This might be too pessimistic though.

cc @lokokung

@lokokung
Copy link
Collaborator

FWIW, I think WaitAny inside a callback should work in our current implementation... (I haven't explicitly tried it, but I do remember ensuring that we release any Instance related locks before calling callbacks which should in turn make it fine.)

However, we still have issues with re-entrancy for device level stuff because of the default global Device lock... If/when we have a separate thread for doing stuff, I think we can defer certain spontaneous callbacks, i.e. Device lost, and that might allow re-entrancy? Would need to think a little more about whether that could work though.

@kainino0x
Copy link
Collaborator

Ah, I didn't think about the fact that there can be things that are safe in multithreaded code but will cause deadlocks in nested code....

@kainino0x kainino0x added the !discuss Needs discussion (at meeting or online) label Nov 28, 2024
@kainino0x
Copy link
Collaborator

Dec 12 meeting:

  • CF: Think wgpu will just work.
  • LK: Should work for us except due to deadlocks on the device lock. E.g. use the device inside the device loss callback. But should work if we do things correctly.
  • CF: We call all callbacks at the end.
  • LK: How do you know when to do callbacks?
  • CF: Have an internal thing called "maintain". Accumulates all the callbacks, drops all the locks, calls the callbacks, and returns.
  • LK: Issue of re-entrancy to the API. Don't remember the details. Should discuss this more though.
  • Goal should be this just works. Won't always work in Dawn for now.
  • LK: Nested timeouts?
  • KN: Timeouts don't count time spent in callbacks, only event waiting.

@kainino0x kainino0x removed the !discuss Needs discussion (at meeting or online) label Dec 12, 2024
@kainino0x kainino0x self-assigned this Dec 12, 2024
@kainino0x
Copy link
Collaborator

Docs change only.

@kainino0x kainino0x removed the needs docs Non-trivial API contract needs to be documented. Orthogonal to open vs closed; remove when doc'd. label Dec 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
threading Thread safety and WASM-JS threading problems
Projects
None yet
Development

No branches or pull requests

3 participants