Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[egui_web] Prevent event handlers from running if code has panicked #1306

Merged
merged 6 commits into from
Mar 10, 2022
Merged

[egui_web] Prevent event handlers from running if code has panicked #1306

merged 6 commits into from
Mar 10, 2022

Conversation

DusterTheFirst
Copy link
Contributor

@DusterTheFirst DusterTheFirst commented Feb 25, 2022

The way egui_web works involves adding event handlers to a <canvas> element, the <body>, and a helper <input> element. By setting up such callbacks, it creates a position where the JavaScript/WASM runtime will call rust code across a FFI boundary, but that rust code could be in an invalid state, due to a previous panic. More information on the issue can be seen in #1209.

There are 2 ways to prevent this problem from occuring:

  1. Provide a way to de-register all of the callbacks that were setup which the user could call in a panic handler, or std::panic::catch_unwind
  2. Leave callbacks registered, but return early from all of them if the code has encountered a panic.

The first option would be preferable since it would allow for the user to have more control over the de-registering of the callbacks. This was my first attempt but it became hard to near impossible.

  1. Lots of the types used in egui and egui_web are not UnwindSafe or RefUnwindSafe which make using std::panic::catch_unwind impossible.
  2. None of the types from wasm_bindgen, web_sys, or js_sys are Send. This means that there would be no way for a panic handler (std::panic::set_hook) to be passed the js_sys::Function or even the web_sys::EventTarget. both of which are needed to call remove_event_listener.

Both of the above limitations mean that the event listeners can not be de-registered from inside of a panic handler. This leaves only option 2.

The way I have implemented this is using a shared Arc<AtomicBool> between all of the event handlers and the panic handler. Every time the event handlers are called, they perform a AtomicBool::load(SeqCst) and check that the bool is false. If it is, they continue calling the normal rust code. If the bool was true, the event handler returns early. The AtomicBool is set to true by the panic handler as soon as it is hit, so all callbacks after will be skipped.

In making this change, I created a convenience method that will wrap the callbacks in a check against the atomic bool as well as reduce boilerplate with closure creation, event listener adding, and closure forgetting.

There is only one event listener I did not touch, and that is the contextmenu event handler. Since this handler does not call into any egui code, it does not contribute to the problem and can be left alone.

The main culprit of the errors is the locking of the mutex guarding the AppRunner.

Here is a link to a few important changes:

And a before/after comparison:

Before:

before.mp4

After:

after.mp4

As can be seen in the after demo, there is still one unreachable executed error, but this is caused by the builtin rust panic handler calling unsupported code so there is nothing I can do about it.

Closes #1290.

@DusterTheFirst DusterTheFirst marked this pull request as ready for review February 25, 2022 20:32
Copy link
Contributor

@Titaniumtown Titaniumtown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM works fine on my end. Very useful imo!

@Titaniumtown
Copy link
Contributor

Actually. I found an issue:
image

Doesn't occur on the master branch.

@DusterTheFirst
Copy link
Contributor Author

@Titaniumtown can you show the code that caused this panic so I can look into it?

Copy link
Owner

@emilk emilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - thanks!

egui_web/src/lib.rs Show resolved Hide resolved
impl AppRunnerContainer {
/// Convenience function to reduce boilerplate and ensure that all event handlers
/// are dealt with in the same way
pub fn add_event_listener<E: wasm_bindgen::JsCast>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!

@emilk
Copy link
Owner

emilk commented Mar 7, 2022

I'll wait with merging this until @Titaniumtown gets back

@Titaniumtown
Copy link
Contributor

Oh I'll send a code snippet in a bit.

@Titaniumtown
Copy link
Contributor

Ok. So when interacting with buttons and an input text box like so: https://github.com/Titaniumtown/integral_site/blob/d48b94c1d2a43d0fab4ebba630d315f82cd5553b/src/egui_app.rs#L277 then https://github.com/Titaniumtown/integral_site/blob/d48b94c1d2a43d0fab4ebba630d315f82cd5553b/src/egui_app.rs#L298 and interacting with one, then another. I get that error. This error doesn't occur on the master branch. It seems to be related to the entire thing being wrapped in a ui.horizontal as all other widgets work perfectly. Very strange.

@Titaniumtown
Copy link
Contributor

Titaniumtown commented Mar 8, 2022

It works now! (at least for me)
More testing is probably needed though imo.

@DusterTheFirst
Copy link
Contributor Author

DusterTheFirst commented Mar 8, 2022

@Titaniumtown I think I have fixed your panic issue with the new commits I have pushed. Please try them out on your setup to make sure it works not just on my machine.

The root cause is some very weird wasm/js-engine tomfoolery that must be going on, causing an event handler to interrupt the text_agent::update_text_agent call with another event handler, not letting the current event handler to finish and drop its mutex guard. It is weird that this did not happen before my changes as this "issue" should have still existed. Weird. You can see the fix here: https://github.com/DusterTheFirst/egui/blob/3e97e51fc3d9753135478f0a8ac0643c95943b4d/egui_web/src/text_agent.rs#L172-L184

Note: the CI has started failing, but this seems to be caused by a new advisory for the regex crate being added to rustsec's database

@emilk
Copy link
Owner

emilk commented Mar 10, 2022

I've tested it on mobile, and mobile text input still works, so… looks good to me!

@emilk emilk merged commit 5d950e1 into emilk:master Mar 10, 2022
@DusterTheFirst DusterTheFirst deleted the web-panic branch March 19, 2022 17:36
@emilk
Copy link
Owner

emilk commented Apr 15, 2022

I just noticed a bug caused by this PR: resizing the browser window will no longer immediately resize the egui canvas, making it feel VERY unresponsive.

emilk pushed a commit that referenced this pull request Apr 16, 2022
emilk pushed a commit to emilk/egui_plot that referenced this pull request Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

eframe on the web (egui_web) creates a mess on panic
3 participants