Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallible systems #16589

Merged
merged 25 commits into from
Dec 5, 2024
Merged

Fallible systems #16589

merged 25 commits into from
Dec 5, 2024

Conversation

NthTensor
Copy link
Contributor

@NthTensor NthTensor commented Dec 1, 2024

Objective

Error handling in bevy is hard. See for reference #11562, #10874 and #12660. The goal of this PR is to make it better, by allowing users to optionally return Result from systems as outlined by Cart in #14275 (comment).

Solution

This PR introduces a new ScheuleSystem type to represent systems that can be added to schedules. Instances of this type contain either an infallible BoxedSystem<(), ()> or a fallible BoxedSystem<(), Result>. ScheuleSystem implements System<In = (), Out = Result> and replaces all uses of BoxedSystem in schedules. The async executor now receives a result after executing a system, which for infallible systems is always Ok(()). Currently it ignores this result, but more useful error handling could also be implemented.

Aliases for Error and Result have been added to the bevy_ecs prelude, as well as const OK which new users may find more friendly than Ok(()).

Testing

  • Currently there are not actual semantics changes that really require new tests, but I added a basic one just to make sure we don't break stuff in the future.
  • The behavior of existing systems is totally unchanged, including logging.
  • All of the existing systems tests pass, and I have not noticed anything strange while playing with the examples

Showcase

The following minimal example prints "hello world" once, then completes.

use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}

Migration Guide

This change should be pretty much non-breaking, except for users who have implemented their own custom executors. Those users should use ScheduleSystem in place of BoxedSystem<(), ()> and import the System trait where needed. They can choose to do whatever they wish with the result.

Current Work

  • Fix tests & doc comments
  • Write more tests
  • Add examples
  • Draft release notes

Draft Release Notes

As of this release, systems can now return results.

First a bit of background: Bevy has hisotrically expected systems to return the empty type (). While this makes sense in the context of the ecs, it's at odds with how error handling is typically done in rust: returning Result::Error to indicate failure, and using the short-circuiting ? operator to propagate that error up the call stack to where it can be properly handled. Users of functional languages will tell you this is called "monadic error handling".

Not being able to return Results from systems left bevy users with a quandry. They could add custom error handling logic to every system, or manually pipe every system into an error handler, or perhaps sidestep the issue with some combination of fallible assignents, logging, macros, and early returns. Often, users would just litter their systems with unwraps and possible panics.

While any one of these approaches might be fine for a particular user, each of them has their own drawbacks, and none makes good use of the language. Serious issues could also arrise when two different crates used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the application itself. It looks like this:

// Previous, handling internally
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let Ok(window) = query.get_single() else {
       return;
   };
   // ... do something to the window here
}

// Previous, handling externally
app.add_systems(my_system.pipe(my_error_handler))
fn my_system(window: Query<&Window>) -> Result<(), impl Error> {
   let window = query.get_single()?;
   // ... do something to the window here
   Ok(())
}

// Previous, panicking
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let window = query.single();
   // ... do something to the window here
}

// Now 
app.add_systems(my_system)
fn my_system(window: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

There are currently some limitations. Systems must either return () or Result<(), Box<dyn Error + Send + Sync + 'static>>, with no in-between. Results are also ignored by default, and though implementing a custom handler is possible, it involves writing your own custom ecs executor (which is not recomended).

Systems should return errors when they cannot perform their normal behavior. In turn, errors returned to the executor while running the schedule will (eventually) be treated as unexpected. Users and library authors should prefer to return errors for anything that disrupts the normal expected behavior of a system, and should only handle expected cases internally.

We have big plans for improving error handling further:

  • Allowing users to change the error handling logic of the default executors.
  • Adding source tracking and optional backtraces to errors.
  • Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to errors.
  • Generally making the default error logging more helpful and inteligent.
  • Adding monadic system combininators for fallible systems.
  • Possibly removing all panicking variants from our api.

@NthTensor NthTensor added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 1, 2024
@bushrat011899
Copy link
Contributor

Even without the actual error handling benefits this provides, just having a more blessed way to use ? in systems will be really nice. I know we can just pipe the Result with current systems, but this will hide that bit of extra boilerplate. This also pairs nicely with making more APIs return Result instead of Option, and also makes panicking variants less important (possibly even removable TBH).

Copy link
Contributor

@bushrat011899 bushrat011899 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this probably needs an example, but I like the approach. Opens up the possibility of having error handlers in the future, which would resolve the to-panic or not to-panic debate entirely. This also lays the groundwork for how fallibility in Commands could work. Really nice work!

crates/bevy_ecs/src/lib.rs Show resolved Hide resolved
crates/bevy_ecs/src/schedule/executor/mod.rs Outdated Show resolved Hide resolved
crates/bevy_ecs/src/schedule/executor/mod.rs Outdated Show resolved Hide resolved
crates/bevy_ecs/src/schedule/executor/multi_threaded.rs Outdated Show resolved Hide resolved
@bushrat011899
Copy link
Contributor

It was mentioned in Discord, but I'll include it here for posterity: with fallible systems getting first-class treatment, there may be room to consider removing the panicking variants of certain functions (e.g., Query::get_entity and Query::entity), since the choice of behaviour could be controlled by a system error handler. This would be a large DX win, since the "proper" methods would get the shorter names, and it'd reduce the API surface area.

@NthTensor
Copy link
Contributor Author

NthTensor commented Dec 2, 2024

there may be room to consider removing the panicking variants of certain functions

That's in line with the third point Cart proposed in #14275 (comment). He indicated then that it was important to land all the related changes in a single release cycle, and I agree. This PR provides his (1), what (2) and (3) look like is up to @alice-i-cecile and the other designated ecs experts.

Copy link
Contributor

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing to see how straightforward this is, all things considered. Very excited!

crates/bevy_ecs/src/result.rs Show resolved Hide resolved
@teohhanhui
Copy link

teohhanhui commented Dec 2, 2024

as well as const OK which new users may find more friendly than Ok(()).

Why? This just makes the code more jarring compared to the rest of the Rust ecosystem, and more cognitive load to switch between returning nothing vs. returning some value.

It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html

@NthTensor
Copy link
Contributor Author

This just makes the code more jarring compared to the rest of the Rust ecosystem

In this I am trying to defer to my understanding of Cart's preferences. He uses a const in the linked issue, and I believe has expressed that Ok(()) is sort of confusing and cumbersome. No strong preference here from me really.

));

// Create a new sphere mesh:
let mut sphere_mesh = Sphere::new(1.0).mesh().ico(7)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a sight to behold. Once proper (user configurable) handlers are added in a follow-up this will be perfect. Bevy APIs can be simplified and made more reliable without any loss in ergonomics (IMO). Adding Ok(()) at the end of a system is a small price to pay that (hopefully) Rust will solve on its own (since the issue isn't specific to Bevy)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah :) this example is really just a minimal placeholder. Once we have handlers hooked up, I intend to go through and update all the examples to use this style (where it makes sense).

@BenjaminBrienen
Copy link
Contributor

@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).

Just like migration guides, we should keep this label around even after they're written for searchability and tooling. We might wan to rename that to be more clear though 🤔

M-#[require(ReleaseNote)]

@NthTensor NthTensor marked this pull request as ready for review December 4, 2024 02:50
@NthTensor NthTensor added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Dec 4, 2024
@NthTensor
Copy link
Contributor Author

Alright, I added the most basic tests and examples in the world. There will be more to do there when handlers are hooked up. Ready for review.

@alice-i-cecile alice-i-cecile added the M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide label Dec 5, 2024
Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nth this is good to go once it's merge-conflict free. I do prefer the "everything is fallible" approach by WrongShoe, but that's easily left to a follow-up refactor. Let's get the ball rolling here.

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 5, 2024
Merged via the queue into bevyengine:main with commit 0070514 Dec 5, 2024
31 of 32 checks passed
@mockersf mockersf mentioned this pull request Dec 26, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 27, 2024
# Objective

- First step for #16718 
- #16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error
```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```
@Neo-Zhixing
Copy link
Contributor

Might be a bit late on this discussion, but this particular implementation feels a bit intrusive to me. Can we instead implement IntoSystemConfigs on systems that returns Result so that those systems can be added to the schedule like a normal system?

github-merge-queue bot pushed a commit that referenced this pull request Dec 31, 2024
# Objective

- #16589 added an enum to switch between fallible and infallible system.
This branching should be unnecessary if we wrap infallible systems in a
function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the
call to `run` and `run_unsafe`. The wrapper should compile out, but I
haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I
couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either
use `InfallibleSystemWrapper` before boxing or make your system return
`bevy::ecs::prelude::Result`.
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

Error handling in bevy is hard. See for reference
bevyengine#11562,
bevyengine#10874 and
bevyengine#12660. The goal of this PR is
to make it better, by allowing users to optionally return `Result` from
systems as outlined by Cart in
<bevyengine#14275 (comment)>.

## Solution

This PR introduces a new `ScheuleSystem` type to represent systems that
can be added to schedules. Instances of this type contain either an
infallible `BoxedSystem<(), ()>` or a fallible `BoxedSystem<(),
Result>`. `ScheuleSystem` implements `System<In = (), Out = Result>` and
replaces all uses of `BoxedSystem` in schedules. The async executor now
receives a result after executing a system, which for infallible systems
is always `Ok(())`. Currently it ignores this result, but more useful
error handling could also be implemented.

Aliases for `Error` and `Result` have been added to the `bevy_ecs`
prelude, as well as const `OK` which new users may find more friendly
than `Ok(())`.

## Testing

- Currently there are not actual semantics changes that really require
new tests, but I added a basic one just to make sure we don't break
stuff in the future.
- The behavior of existing systems is totally unchanged, including
logging.
- All of the existing systems tests pass, and I have not noticed
anything strange while playing with the examples

## Showcase

The following minimal example prints "hello world" once, then completes.

```rust
use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}
```

## Migration Guide

This change should be pretty much non-breaking, except for users who
have implemented their own custom executors. Those users should use
`ScheduleSystem` in place of `BoxedSystem<(), ()>` and import the
`System` trait where needed. They can choose to do whatever they wish
with the result.

## Current Work

+ [x] Fix tests & doc comments
+ [x] Write more tests
+ [x] Add examples
+ [X] Draft release notes

## Draft Release Notes

As of this release, systems can now return results.

First a bit of background: Bevy has hisotrically expected systems to
return the empty type `()`. While this makes sense in the context of the
ecs, it's at odds with how error handling is typically done in rust:
returning `Result::Error` to indicate failure, and using the
short-circuiting `?` operator to propagate that error up the call stack
to where it can be properly handled. Users of functional languages will
tell you this is called "monadic error handling".

Not being able to return `Results` from systems left bevy users with a
quandry. They could add custom error handling logic to every system, or
manually pipe every system into an error handler, or perhaps sidestep
the issue with some combination of fallible assignents, logging, macros,
and early returns. Often, users would just litter their systems with
unwraps and possible panics.

While any one of these approaches might be fine for a particular user,
each of them has their own drawbacks, and none makes good use of the
language. Serious issues could also arrise when two different crates
used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the
application itself. It looks like this:

```rust
// Previous, handling internally
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let Ok(window) = query.get_single() else {
       return;
   };
   // ... do something to the window here
}

// Previous, handling externally
app.add_systems(my_system.pipe(my_error_handler))
fn my_system(window: Query<&Window>) -> Result<(), impl Error> {
   let window = query.get_single()?;
   // ... do something to the window here
   Ok(())
}

// Previous, panicking
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
   let window = query.single();
   // ... do something to the window here
}

// Now 
app.add_systems(my_system)
fn my_system(window: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}
```

There are currently some limitations. Systems must either return `()` or
`Result<(), Box<dyn Error + Send + Sync + 'static>>`, with no
in-between. Results are also ignored by default, and though implementing
a custom handler is possible, it involves writing your own custom ecs
executor (which is *not* recomended).

Systems should return errors when they cannot perform their normal
behavior. In turn, errors returned to the executor while running the
schedule will (eventually) be treated as unexpected. Users and library
authors should prefer to return errors for anything that disrupts the
normal expected behavior of a system, and should only handle expected
cases internally.

We have big plans for improving error handling further:
+ Allowing users to change the error handling logic of the default
executors.
+ Adding source tracking and optional backtraces to errors.
+ Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to
errors.
+ Generally making the default error logging more helpful and
inteligent.
+ Adding monadic system combininators for fallible systems.
+ Possibly removing all panicking variants from our api.

---------

Co-authored-by: Zachary Harrold <[email protected]>
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

- First step for bevyengine#16718 
- bevyengine#16589 introduced an api that can only ignore errors, which is risky

## Solution

- Panic instead of just ignoring the errors

## Testing

- Changed the `fallible_systems` example to return an error
```
Encountered an error in system `fallible_systems::setup`: TooManyVertices { subdivisions: 300, number_of_resulting_points: 906012 }
Encountered a panic in system `fallible_systems::setup`!
Encountered a panic in system `bevy_app::main_schedule::Main::run_main`!
```
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

- bevyengine#16589 added an enum to switch between fallible and infallible system.
This branching should be unnecessary if we wrap infallible systems in a
function to return `Ok(())`.

## Solution

- Create a wrapper system for `System<(), ()>`s that returns `Ok` on the
call to `run` and `run_unsafe`. The wrapper should compile out, but I
haven't checked.
- I removed the `impl IntoSystemConfigs for BoxedSystem<(), ()>` as I
couldn't figure out a way to keep the impl without double boxing.

## Testing

- ran `many_foxes` example to check if it still runs.

## Migration Guide

- `IntoSystemConfigs` has been removed for `BoxedSystem<(), ()>`. Either
use `InfallibleSystemWrapper` before boxing or make your system return
`bevy::ecs::prelude::Result`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Usability A targeted quality-of-life change that makes Bevy easier to use M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide M-Needs-Release-Note Work that should be called out in the blog due to impact S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Contentious There are nontrivial implications that should be thought through
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants