Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decide what to do when on task double-failure #910

Closed
brson opened this issue Sep 12, 2011 · 17 comments · Fixed by #11283
Closed

Decide what to do when on task double-failure #910

brson opened this issue Sep 12, 2011 · 17 comments · Fixed by #11283
Labels
A-destructors Area: Destructors (`Drop`, …) A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows A-type-system Area: Type system
Milestone

Comments

@brson
Copy link
Contributor

brson commented Sep 12, 2011

Currently there are no landing pads generated for resource destructors so they are leaky when they fail. We don't have the same tricky situation that C++ has with throwing destructors but I haven't thought through how it should work yet.

Edit: This issue has changed into what to do after a dtor fails during unwinding. The previous issue of dtor failure not working at all has been resolved.

@brson
Copy link
Contributor Author

brson commented Sep 12, 2011

There's a failing test in run-fail/unwind-resource-fail.rs

@brson
Copy link
Contributor Author

brson commented Sep 14, 2011

And another in run-fail/unwind-resource-fail2.rs

jruderman added a commit that referenced this issue Sep 21, 2011
@brson
Copy link
Contributor Author

brson commented Nov 15, 2011

The language reference is actually quite explicit about this:

"A task may transition to the failing state at any time, due to an un-trapped signal or the evaluation of a fail expression. Once failing, a task unwinds its stack and transitions to the dead state. Unwinding the stack of a task is done by the task itself, on its own control stack. If a value with a destructor is freed during unwinding, the code for the destructor is run, also on the task's control stack. Running the destructor code causes a temporary transition to a running state, and allows the destructor code to cause any subsequent state transitions. The original task of unwinding and failing thereby may suspend temporarily, and may involve (recursive) unwinding of the stack of a failed destructor. Nonetheless, the outermost unwinding activity will continue until the stack is unwound and the task transitions to the dead state. There is no way to “recover” from task failure. Once a task has temporarily suspended its unwinding in the failing state, failure occurring from within this destructor results in hard failure. The unwinding procedure of hard failure frees resources but does not execute destructors. The original (soft) failure is still resumed at the point where it was temporarily suspended. "

@graydon
Copy link
Contributor

graydon commented Jul 26, 2012

Yeah. The hard/soft thing has never been implemented, as far as I know, but I think it's the best we can do: retain correctness of our memory model and try to retain correctness of user code, but not at the cost of infinite-recursion. If their dtor itself fails, just unwind that sub-failure w/o the opportunity for sub-sub-failures.

@pnkfelix
Copy link
Member

�It would be good to revise the title of this bug to reflect what the task is. (From the comments thus far, it sounds like the task is to implement the hard/sort failure unwindin protocol; but it might also be useful to clean up the docs here.)

@pnkfelix
Copy link
Member

Not critical for 0.6; de-milestoning

@bblum
Copy link
Contributor

bblum commented Jun 5, 2013

confirmed again; current code that can crash is as follows:

struct Foo { x: ~int, }

impl Drop for Foo {
    fn finalize(&self) { fail!(); }
}

fn main() {
    let _x = Foo { x: ~5, };
    fail!();
}

If my idea for an effect system goes through, we might make this sound again by simply requiring no failure in destructors, e.g. trait Drop { #[wont(Fail)] fn finalize(...); }.

@bblum
Copy link
Contributor

bblum commented Jun 11, 2013

nominating for well-defined milestone

@graydon
Copy link
Contributor

graydon commented Jun 13, 2013

there is language about this in the manual already in the task section, so I think it's well defined already. just not implemented. feature completeness issue.

@graydon
Copy link
Contributor

graydon commented Jun 13, 2013

accepted for feature-complete milestone

@huonw
Copy link
Member

huonw commented Sep 10, 2013

Triage: the updated version of @bblum's example still crashes:

struct Foo { x: ~int, }

impl Drop for Foo {
    fn drop(&self) { fail!(); }
}

fn main() {
    let _x = Foo { x: ~5, };
    fail!();
}
task <unnamed> failed at 'explicit failure', 910.rs:9
task <unnamed> failed at 'explicit failure', 910.rs:4


The ocean ate the last of the land and poured into the smoking gulf, thereby
giving up all it had ever conquered. From the new-flooded lands it flowed
again, uncovering death and decay; and from its ancient and immemorial bed it
trickled loathsomely, uncovering nighted secrets of the years when Time was
young and the gods unborn. Above the waves rose weedy remembered spires. The
moon laid pale lilies of light on dead London, and Paris stood up from its damp
grave to be sanctified with star-dust. Then rose spires and monoliths that were
weedy but not remembered; terrible spires and monoliths of lands that men never
knew were lands...

fatal runtime error: unwinding again

@catamorphism
Copy link
Contributor

1.0 backcompat

@alexcrichton
Copy link
Member

This not only applies to destructors failing while failing, but this appears to also apply to linked failure as well. A program may only contain one fail!() statement, but it may still leak due to linked failure. An example of this would be:

use std::rt::io::timer;

struct A {
  b: B,
}

struct B {
  foo: int,
}

impl Drop for A {
  fn drop(&mut self) {
    timer::sleep(50);
  }
}

impl Drop for B {
  fn drop(&mut self) {
    println!("dropping b\n");
  }
}

fn main() {
  do spawn {
    let _a = A { b: B { foo: 3 } };
  }
  fail!()
}

The destructor for B never runs in this program because the call to timer::sleep invokes scheduler business which will realize that it needs to fail due to linked failure. It seems odd to me that if a task only contains one source of failure then we can still leak.

This specific use case may not entirely fall under this issue, but it seems fairly serious.

@bblum
Copy link
Contributor

bblum commented Oct 25, 2013

The scheduler does have logic for not failing from linked failure when a task is already failing. So if, for example, the sleeping destructor here were to happen during unwinding, I think it should be fine. But if it happens normally you could still get failure.

@brson
Copy link
Contributor Author

brson commented Jan 3, 2014

I've updated the title to reflect what we're currently talking about.

My understanding is that it is just impossible to throw from a landing pad because it will potentially corrupt the unwinder and our three options are:

  • abort - this is what C++ does
  • resume unwinding without completing the landing pad - this is what Ada does and we haven't discussed this option
  • abort the task without completing unwinding - this is "hard" failure we've talked about

The "hard" failure is made more difficult by tasks that are not simply green threads. For example, we have use cases where we want to run in a 'task context', on the current thread, and then be told whether the task succeeded or failed, all on the same thread. There's no way in that scenario to just terminate the task cleanly.

My current opinion is that, for 1.0 we should take the most conservative approach and abort. For future versions we should consider trying to do what Ada does and skip a single landing pad, but still proceed to catch the exception.

brson added a commit to brson/rust that referenced this issue Jan 3, 2014
Previously this was an rtabort!, indicating a runtime bug. Promote
this to a more intentional abort and print a (slightly) more
informative error message.

Can't test this sense our test suite can't handle an abort exit.
@bblum
Copy link
Contributor

bblum commented Jan 3, 2014

I agree with double-abort for 1.0.

Thinking about the other two schemes in terms of how RWARCs (or whatever they're called these days) behave, aborting just a single task strikes me as risky in terms of unexpected behaviour befalling the user. If a double-failure occurs inside an access to a RWARC, it won't be automatically unlocked during unwinding, and other contending tasks will block forever.

I think that resuming the unwinding and skipping the one landing pad still makes sense, though. If double-failure occurs in a user-provided destructor (or in the destructor of some buggy library), skipping the rest of the destructor can only "surprise" other code directly associated with the faulty destructor.

@emberian
Copy link
Member

emberian commented Jan 4, 2014

I agree too.

@bors bors closed this as completed in 239fb1f Jan 4, 2014
ZuseZ4 pushed a commit to EnzymeAD/rust that referenced this issue Mar 7, 2023
* Add memref handling for fwd mode

* Add simple llvm dialect

* Update enzyme/Enzyme/MLIR/Implementations/LLVMAutoDiffOpInterfaceImpl.cpp

Co-authored-by: Tim Gymnich <[email protected]>

* fixup

Co-authored-by: Tim Gymnich <[email protected]>
coastalwhite pushed a commit to coastalwhite/rust that referenced this issue Aug 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-destructors Area: Destructors (`Drop`, …) A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows A-type-system Area: Type system
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants