Decide what to do on task double-failure #910
There's a failing test in run-fail/unwind-resource-fail.rs.
And another in run-fail/unwind-resource-fail2.rs.
The language reference is actually quite explicit about this: "A task may transition to the failing state at any time, due to an un-trapped signal or the evaluation of a fail expression. Once failing, a task unwinds its stack and transitions to the dead state. Unwinding the stack of a task is done by the task itself, on its own control stack. If a value with a destructor is freed during unwinding, the code for the destructor is run, also on the task's control stack. Running the destructor code causes a temporary transition to a running state, and allows the destructor code to cause any subsequent state transitions. The original task of unwinding and failing thereby may suspend temporarily, and may involve (recursive) unwinding of the stack of a failed destructor. Nonetheless, the outermost unwinding activity will continue until the stack is unwound and the task transitions to the dead state. There is no way to “recover” from task failure. Once a task has temporarily suspended its unwinding in the failing state, failure occurring from within this destructor results in hard failure. The unwinding procedure of hard failure frees resources but does not execute destructors. The original (soft) failure is still resumed at the point where it was temporarily suspended."
Yeah. The hard/soft thing has never been implemented, as far as I know, but I think it's the best we can do: retain correctness of our memory model and try to retain correctness of user code, but not at the cost of infinite recursion. If their dtor itself fails, just unwind that sub-failure w/o the opportunity for sub-sub-failures.
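In outline, the hard/soft rule described above amounts to a per-task flag; a minimal sketch in modern syntax (names and structure are illustrative, not the actual runtime code):

```rust
// Sketch of the hard/soft failure rule: a task records whether it is
// already unwinding, and a second failure raised from a destructor
// during that unwinding becomes a "hard" failure that frees resources
// without running any further destructors.
struct TaskFailureState {
    unwinding: bool,    // set when the first (soft) failure begins
    hard_failure: bool, // set if a destructor fails while unwinding
}

impl TaskFailureState {
    fn on_failure(&mut self) {
        if self.unwinding {
            // Failure during unwinding: escalate to hard failure and
            // give no opportunity for further (sub-sub) failures.
            self.hard_failure = true;
        } else {
            // First failure: begin ordinary unwinding, running
            // destructors on the task's own stack.
            self.unwinding = true;
        }
    }
}
```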
It would be good to revise the title of this bug to reflect what the task is. (From the comments thus far, it sounds like the task is to implement the hard/soft failure unwinding protocol; but it might also be useful to clean up the docs here.)
Not critical for 0.6; de-milestoning
confirmed again; current code that can crash is as follows:
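The snippet itself isn't preserved in this copy of the thread; judging from the triage comment further down, it was presumably a variant of the following (a reconstruction, not the original code):

```rust
// Reconstruction: a value whose destructor fails while the task is
// already failing, the same shape as the updated example quoted below.
struct Foo {
    x: ~int,
}

impl Drop for Foo {
    fn drop(&self) { fail!(); }
}

fn main() {
    let _x = Foo { x: ~5 };
    fail!(); // unwinding drops _x, whose destructor fails again
}
```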
If my idea for an effect system goes through, we might make this sound again by simply requiring no failure in destructors, e.g. |
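The example here is cut off. For comparison, the pattern later Rust code settled on at the library level (rather than an effect system) is to have destructors check whether the thread is already unwinding before doing anything that could fail; a minimal modern-syntax sketch, with a made-up Connection type:

```rust
use std::thread;

struct Connection;

impl Drop for Connection {
    fn drop(&mut self) {
        // If the thread is already panicking, skip any cleanup that
        // could itself panic, so we never trigger a double failure.
        if thread::panicking() {
            return;
        }
        // Normal-path cleanup (flushing, closing, etc.) would go here.
    }
}
```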
nominating for well-defined milestone
There is language about this in the manual already, in the task section, so I think it's well defined already; just not implemented. Feature-completeness issue.
accepted for feature-complete milestone
Triage: the updated version of @bblum's example still crashes:

```rust
struct Foo {
    x: ~int,
}

impl Drop for Foo {
    // This destructor fails while the task is already unwinding from
    // the fail!() in main, producing the double failure at issue here.
    fn drop(&self) { fail!(); }
}

fn main() {
    let _x = Foo { x: ~5, };
    fail!();
}
```
1.0 backcompat
This not only applies to destructors failing while failing; it appears to apply to linked failure as well. A program containing only a single fail!() can still hit it:

```rust
use std::rt::io::timer;

struct A {
    b: B,
}

struct B {
    foo: int,
}

impl Drop for A {
    fn drop(&mut self) {
        // Sleep long enough that the main task's fail!() propagates
        // linked failure to this task while the destructor is running.
        timer::sleep(50);
    }
}

impl Drop for B {
    fn drop(&mut self) {
        println!("dropping b\n");
    }
}

fn main() {
    do spawn {
        let _a = A { b: B { foo: 3 } };
    }
    fail!()
}
```

The destructor for A gets interrupted by the linked failure from main, so the destructor for B ("dropping b") never runs. This specific use case may not entirely fall under this issue, but it seems fairly serious.
The scheduler does have logic for not failing from linked failure when a task is already failing. So if, for example, the sleeping destructor here were to happen during unwinding, I think it should be fine. But if it happens normally you could still get failure.
I've updated the title to reflect what we're currently talking about. My understanding is that it is simply impossible to throw from a landing pad, because it will potentially corrupt the unwinder, and our three options are:

- abort the whole process on a double-failure
- abort (hard-fail) just the task whose destructor failed
- skip the offending landing pad and resume the original unwinding
The "hard" failure is made more difficult by tasks that are not simply green threads. For example, we have use cases where we want to run in a 'task context', on the current thread, and then be told whether the task succeeded or failed, all on the same thread. There's no way in that scenario to just terminate the task cleanly. My current opinion is that, for 1.0 we should take the most conservative approach and abort. For future versions we should consider trying to do what Ada does and skip a single landing pad, but still proceed to catch the exception. |
Previously this was an rtabort!, indicating a runtime bug. Promote this to a more intentional abort and print a (slightly) more informative error message. Can't test this since our test suite can't handle an abort exit.
I agree with double-abort for 1.0. Thinking about the other two schemes in terms of how RWARCs (or whatever they're called these days) behave, aborting just a single task strikes me as risky in terms of unexpected behaviour befalling the user. If a double-failure occurs inside an access to a RWARC, it won't be automatically unlocked during unwinding, and other contending tasks will block forever. I think that resuming the unwinding and skipping the one landing pad still makes sense, though. If double-failure occurs in a user-provided destructor (or in the destructor of some buggy library), skipping the rest of the destructor can only "surprise" other code directly associated with the faulty destructor.
I agree too.
Currently there are no landing pads generated for resource destructors, so they are leaky when they fail. We don't have the same tricky situation that C++ has with throwing destructors, but I haven't thought through how it should work yet.
Edit: This issue has changed into what to do after a dtor fails during unwinding. The previous issue of dtor failure not working at all has been resolved.