Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interruptible or uninterruptible cleanup #3

Closed
snoyberg opened this issue Jun 22, 2016 · 52 comments
Closed

Interruptible or uninterruptible cleanup #3

snoyberg opened this issue Jun 22, 2016 · 52 comments

Comments

@snoyberg
Copy link
Member

Let's have a dedicated issue for this discussion. Pinging @Peaker and @Yuras

@Yuras
Copy link

Yuras commented Jun 23, 2016

I agree that this issue is orthogonal to #2.

I'll replicate and extend some of my arguments here. Basically we have two issues:

  1. Is it allowed for cleanup action be interruptible or not.
  2. Should bracket, catch, etc use uninterruptibleMask_ or it is a responsibility of cleanup action author.

Re 1. People often claim that it is a bug if cleanup action is interruptible. And I'd probably agree. Unfortunately it was never documented as a requirement. Even worse, hClose, one of the most widely used cleanup actions, is interruptible already.

If we decide that cleanup actions should not be interruptible, than we should convince CLC to accept it as a policy and document it in Control.Exception, otherwise half of the community will continue writing interruptible cleanups, and half of it will write code assuming cleanups are not interruptible (and both halves will reject patches from other half because of different understanding of exception handling). Then we'll be able to raise a bug against hClose and make it uninterruptible. And it means that hClose (or its new variant specially crafted to be used in cleanups) should not flush buffers.

If we allow interruptible cleanups, then uninterruptibleMask_ is bracket is unsafe because it may result in blocking for an unbound amount of time. Though in the first call to after (in case of exception from thing) uninterruptibleMask_ is probably a necessary evil because we have two exceptions of hands without a clue what to do with them. More sound solution exists, but it is a bit more complicated and probably requires support from RTS. Anyway uninterruptibleMask_ is not strictly necessary in the second call to after (in case thing succeeded).

Re 2. Here we speak about uninterruptibleMask_ surrounding after in both case, regardless whether thing failed or not. The argument for uninterruptibleMask_ is the next: we need it in every cleanup anyway, and people often forget to write it, so lets do it in the library and forget about the issue completely.

But as I already showed in the original issue, uninterruptibleMask_ is necessary much rarer then people usually claim. It is used in very specific situations where author should already do a lot of hard work eliminating sync exceptions. In this case typing uninterruptibleMask_ is not a big deal, it is even useful to indicate that "I thought hard about it, and I'm sure this action can't throw sync exception, and I rely on no-throw safety level, so I want to dissable async exceptions too" or that "this interruptible action is not strictly necessary here, but I want it for some reason (e.g. logging); I ensured the resource will be released if this action fails, so I disable async exceptions here only because the policy requires that".

Other common argument is: people are not educated enough to handle async exceptions correctly in cleanups, so lets just disable them at all. But we can't disable them in alloc actions, so we have to educate people anyway or remove async exceptions from the language completely. I don't believe that people will write sync- or async-exception-safe code by incident, without understanding how they work. And uninterruptibleMask_ doesn't help here a lot because it is rarely necessary in practice, and where it is necessary it is much harder to eliminate sync exceptions then disable async onces.

In any case, it is less important whether bracket uses uninterruptibleMask_. It is more important to agree on the policy and state that cleanup authors should write correct code and don't rely on bracket to disable async exceptions.

@ndmitchell
Copy link

I find disabling asynchronous exceptions in cleanup distasteful. Consider all normal code and library routines - they must be async safe and can take unlimited time. Consider exception handlers - should they use the same principles? Or an entirely different set? That convinced me that mask is a better option.

@Peaker
Copy link

Peaker commented Jun 29, 2016

The cleanup can be completely async safe which just means that its side
effects are successfully undone in case of exception.

But for the cleanup context, you do not want the cleanup undone, ever.

On Wednesday, 29 June 2016, Neil Mitchell [email protected] wrote:

I find disabling asynchronous exceptions in cleanup distasteful. Consider
all normal code and library routines - they must be async safe and can take
unlimited time. Consider exception handlers - should they use the same
principles? Or an entirely different set? That convinced me that mask is a
better option.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#3 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAC_A_J7zlewpM_K8jxGUvGajn33nW05ks5qQgtngaJpZM4I8KmA
.

Sent from Gmail Mobile

@snoyberg
Copy link
Member Author

Let me try and put in a summary of what's agreed upon and not agreed upon, then others can correct me:

  • If we have bracket (openFile ...) hClose something, we want a guarantee that, no matter what happens, when control is returned outside of the bracket call the file handle is closed (if in fact the openFile succeeded)
  • By the nature of async exceptions, there is no way to provide this guarantee without masking both the openFile and hClose calls from within bracket
  • Masking interruptible exceptions is a different story: it's absolutely possible for our bracket call above to meet the goals we set out for it with just a mask call (and not an uninterruptibleMask call) in place, since the hClose call has control over which operations may trigger such an exception and can take appropriate actions itself
  • Therefore, the question here comes down to one of responsibility:
    • Do we place responsibility on cleanup handlers like hClose to be safe in the face of an interruptible masking state? In practice, we know that many such handlers are not written this way today, and therefore fixing this situation is error prone
    • Instead, do we place responsibility on the bracket function to instead use maskUninterruptible for cleanup handlers, removing responsibility from these cleanup handlers? The downside is that, if there is a bug in these cleanup handlers, there will be no way to kill these handlers, leading to the possibility of deadlock

Have I accurately explained the goals and different options here?

@Peaker
Copy link

Peaker commented Jun 30, 2016

@snoyberg Here are my comments:

"interruptible exceptions" => "interruptible actions".

I agree with points 1-2.

Points 3-4:.
Let me use a different example than hClose:

Here are 2 valid and sensible brackets for a semaphore:
bracket (waitTokens n) (postTokens n)
bracket (postTokens n) (waitTokens n)

In both of these, we really want the alloc-action to be interruptible (or unnecessarily lose responsiveness and even hit deadlocks), and the cleanup action to be uninterruptible (or our semaphore invariants get broken!).

Neither of these (waitTokens, postTokens) intrinsically require uninterruptibleMask.
Both of these ought to be uninterruptibleMaskd in the context of bracket cleanup.

So the question isn't whether the responsibility for the uninterruptibleMask should lie with bracket or with hClose. But whether it should lie with bracket or with the code that directly invokes bracket (and uses both bracket and hClose).

Since the vast majority of bracket invocations will want to uninterruptibleMask their cleanup (and will generally not want the IO action underlying the cleanup action to be uninterruptible), some bracket combinator that uses uninterruptibleMask on the cleanup must exist. Then the justification for the (very rarely useful) current bracket to exist disappears.

One final point: The word "async-safe" is overloaded here and I think is causing some confusion.

For something to be declared "async-safe", it needs to undo its side-effects successfully (or otherwise guarantee no visible effects were leaked) upon an async exception.
This safety is not what you want from a cleanup handler in bracket, since the side-effects of a cleanup handler must happen for the entire bracket to be async-safe.

uninterruptibleMask on the cleanup is not meant to make the cleanup async-safe, it is already async-safe.
uninterruptibleMask on the cleanup is for the async-safety of the composed bracket, which does not inherit its safety just from the async-safety of its components.

@ndmitchell
Copy link

My current understanding is that the only way to have a "safe" bracket (with the guarantee that all operations will be rolled back upon completion) is to mask all async exceptions on both the acquire and release. My gut feeling is that writing code that blocks async exceptions means we lose many of the benefits of async exceptions - an openFile might talk to a network mounted NFS drive and take 30s or so in the worst case.

However, this library seems to be about making sensible compromises to make the current stuff work nicely, rather than redefining policy. As such, blocking async exceptions might be the right thing to do here, even though it still leaves me feeling icky.

@Peaker
Copy link

Peaker commented Jun 30, 2016

@ndmitchell Uninterruptible-masking acquire is not required for full safety.

The acquire action itself just has to be async-safe, i.e: it will internally roll back as needed upon an async exception. The reason interruptible masking is needed for acquire is that an async exception can happen between the completion of the acquire and the installation of our exception handler for cleanup. Ordinary mask covers that part as it has no interruption points.

Uninterruptible masking is only needed on the release side.

@ndmitchell
Copy link

@Peaker I feel I'm out of my depth here, you may well be right, I think I'll just lurk and observe this thread :)

@Yuras
Copy link

Yuras commented Jun 30, 2016

@Peaker I think you "final note" is just about cleanup actions be different from all other actions. Indeed it is true -- other actions should usually minimize side effects of failure, while cleanups should release resources in any case. But it contradicts with the first part of your comment. If cleanups are different, then you can't use them as acquire actions or vice versa.

For example, postToken could look like the next:

postToken :: Token -> IO ()
postToken t = do
  hPutStrLn "will post token"
  doPostToken t
  hPutStrLn "did post token"

This implementations is invalid both as acquire and cleanup action (but could be OK as a regular action). To be used as an acquire action, it should guarantee that the token will not be posted in case of failure:

postToken :: Token -> IO ()
postToken t = do
  hPutStrLn "will post token"
  doPostToken t
  hPutStrLn "did post token"
    `onException` doWaitToken  -- it is not well defined what we should do in case of exception here...

To be used as a cleanup action, it should guarantee the token will be posted in case of failure:

postToken :: Token -> IO ()
postToken t = do
  hPutStrLn "will post token"
    `finally` doPostToken t
  hPutStrLn "did post token"

So, while sometimes an action could be used both as a acquire and cleanup action, it is not the case in general. And you can't convert acquire action into cleanup action using uninterruptibleMask_.

@Peaker
Copy link

Peaker commented Jun 30, 2016

I don't think (putStrLn "..." >> acquire >> putStrLn "...") with no masking is ever a valid action (even "regular" action), because it is not exception-safe in any way. You need the release in case of exception to make it exception-safe(sync&async), which you want in acquire actions in an "regular" actions.

That reduces us to regular actions vs. cleanup actions.
I see what you're trying to say about sync exception handling in cleanup context.
But you want to make sure your bracket cleanups do NOT have any sync exceptions at all, which kind of makes the point moot. Indeed you cannot freely call putStrLn inside actions used in cleanup context.

In addition and independently of avoiding sync exceptions, you never want to handle async exceptions in the cleanup context, so bracket should guarantee that.

Ideally, Haskell would have some phantom type parameter to IO indicating impossibility of sync exceptions, so you could know your cleanup action is safe from sync exceptions.
Unfortunately, we do not have that.
Fortunately, however, we can be safe from async exceptions at the very least.

@Yuras
Copy link

Yuras commented Jun 30, 2016

I don't think (putStrLn "..." >> acquire >> putStrLn "...") with no masking is ever a valid action (even "regular" action), because it is not exception-safe in any way.

Not necessary. Depending on a context, the same thing may or may not be a "resource". E.g. if you failed to acquire/release connection from/to a pool, then you can just drain the pool itself -- no resource leak. In that case we have "basic" safety level. Other example is QSem -- if you are not sure whether a slot was allocated or not, then just drop the semaphore (the later example we discussed with Simon Marlow -- he pointed me to the fact that waitQSem could be useful outside mask.)

I see what you're trying to say about sync exception handling in cleanup context.

I don't see any difference between sync and async exceptions here. In my postToken examples I handler sync and async exceptions in the same way.

But you want to make sure your bracket cleanups do NOT have any sync exceptions at all

Hmm, we may require cleanups not to throw sync exceptions by swallowing them. But we can't make sure exceptions never happen (except few cases where we can achieve "no-throw" safety level.) There is no way to ensure hPutStr doesn't fail.

@Peaker
Copy link

Peaker commented Jun 30, 2016

I don't see any difference between sync and async exceptions here. In my postToken examples I handler sync and async exceptions in the same way.

The way you handle them is only good for sync exceptions. If putStrLn throws any exception (Sync or async), then doPostToken in your finally block may be hit by another async exception, so it still needs an uninterruptibleMask there.

Hmm, we may require cleanups not to throw sync exceptions by swallowing them. But we can't make sure exceptions never happen (except few cases where we can achieve "no-throw" safety level.)

It is not so few -- you can build a whole world of functionality around MVars, TVars, atomicModifyIORef, and pure code in a deadlock-free way giving quite a bit of useful nothrow code.

But otherwise, I agree that you may often need to swallow exceptions in cleanup handlers.

@Yuras
Copy link

Yuras commented Jun 30, 2016

The way you handle them is only good for sync exceptions. If putStrLn throws any exception (Sync or async), then doPostToken in your finally block may be hit by another async exception, so it still needs an uninterruptibleMask there.

No, doPostToken here is assumed to be a "correct" cleanup -- it releases the resource even if it fails, either with sync or async exception. If it is just a regular action, then I need to ensure it may not fail (and use uninterruptibleMask_ if necessary).

It is not so few -- you can build a whole world of functionality around MVars, TVars, atomicModifyIORef, and pure code in a deadlock-free way giving quite a bit of useful nothrow code.

Right, you can build a lot of no-throw code. But you can't do it incidentally. It is hard to achieve no-throw. And only after you did the hard work, it makes sense to do the last easy step -- add uninterruptibleMask_. But if the code can fail with sync exception, then uninterruplibleMask_ is useless, it hides the issue instead of fixing it. And in my experience, in most cases uninterruplibleMask will not help because most of cleanup actions in haskell code are not sync-safe anyway.

@snoyberg I'm not sure this discussion is useful for you. We discussed it with @Peaker a number of times already, we are just repeating the same arguments. But I think the most fundamental thing is to decide how correct cleanup should behave: is it allowed to it to be interruptible, may it fail at all, should it release resources in case of sync or async failure.

@ndmitchell
Copy link

Might I suggest than @snoyberg does whatever is done in the base library, and should anyone disagree with that approach, leave it for them to convince the base library to fix itself first, and then have this package follow along.

@Peaker
Copy link

Peaker commented Jun 30, 2016

But if the code can fail with sync exception, then uninterruplibleMask_ is useless, it hides the issue instead of fixing it.

This is the crux of the disagreement, I believe.

I believe each different circumstance for which the action leaks a resource is a separate bug.

For example, the above code with putStrLn may have 2 actual bugs in realstic configurations (where stdout write only blocks but never fails).

A) Leaks when stdout was/is closed
B) Leaks upon async exceptions (non-determinstic race)

You claim that fixing B is useless because you've not also fixed A.

I claim fixing B is a huge benefit, even if A was not yet fixed.

More-over, sync exceptions during cleanup are an obvious concern. Virtually any Haskeller would understand the need to do it.
Async exceptions during cleanup are far more subtle (as seen in GHC code, where multiple brackets were buggy due to async exceptions, but were otherwise nothrow).

uninterruptibleMask for cleanup by default means:
A) the subtle issues go away -- leaving synchronous exceptions, a much less subtle issue to resolve

B) even if you're aware of the need to uninterruptibly mask your cleanups -- it saves you from the need of doing it for every single cleanup action in existence

C) It brings the program to a less buggy state. A program is not buggy / not-buggy. It can be more or less buggy.

@Peaker
Copy link

Peaker commented Jun 30, 2016

@ndmitchell In the mailing list, I think besides @Yuras there was a strong consenesus that bracket with uninterruptibleMask in cleanups is far preferable -- with few voicing concern about breaking legacy code that assumes the current broken behavior.

Fixing base is hard and has backwards compatibility concerns.

A new library copying existing broken semantics makes the inevitable fix all the more painful later.

@Yuras
Copy link

Yuras commented Jun 30, 2016

This is the crux of the disagreement, I believe.

Yes.

I believe each different circumstance for which the action leaks a resource is a separate bug.

IMO if the cause is the same, then it is one bug. Otherwise they are separate.

For example, the above code with putStrLn may have 2 actual bugs

Both of them require the same fix. The fix is specific neither to sync no async exceptions. uninterruptibleMask_ is not a fix. And the same fix applies to similar code in C++ where there is no async exceptions. Hmm.

it saves you from the need of doing it for every single cleanup action in existence

It is a huge overestimate.

It brings the program to a less buggy state.

What is less buggy, this #define TRUE (rand() > 0.1 ? TRUE : FALSE) or this #define TRUE (rand() > 0.01 ? TRUE : FALSE)? The later probably is less buggy. But what if we just remove the silly define?

Most of bugs in cleanups are not async-specific IME, so most of bugs about async exceptions indicate a bug with sync exceptions. The "fix" you are proposing (uninterruptibleMask) is similar to changing the number in the define above.

@Peaker
Copy link

Peaker commented Jun 30, 2016

Both of them require the same fix.

One requires slapping on uninterruptibleMask. The other requires catching sync exceptions and swallowing them or retrying.

How are they the same?

What is less buggy, this #define TRUE (rand() > 0.1 ? TRUE : FALSE) or this #define TRUE (rand() > 0.01 ? TRUE : FALSE)? The later probably is less buggy. But what if we just remove the silly define?

That is a strawman argument. Of course if you can "just fix it", then do. But any fix will involve uninterruptibleMask in any case.

There is a lot of software that is correct up to events of 2^-160. Is it just as incorrect as any buggy software?

Most of bugs in cleanups are not async-specific IME

How are they "sync fixes" even related to the async fix which is uninterruptibleMask (either at the bracket or at the use site)?

The "fix" you are proposing (uninterruptibleMask) is similar to changing the number in the define above.

No, it is more akin to:

#define true (A && B)

where both A and B are supposed to be true but may be buggy(false). I convert it to: #define true A
And now only A has to be clean. And in many cases, it is clean (nothrow!).

@Yuras
Copy link

Yuras commented Jun 30, 2016

One requires slapping on uninterruptibleMask. The other requires catching sync exceptions and swallowing them or retrying.

Both require just a finally:

postToken t = do
  hPutStrLn "will post token"
    `finally` doPostToken t
  hPutStrLn "did post token"

@ElastiLotem
Copy link

The finally is only needed because of the silly prints.
Independently you need the uninterruptible masking, which you claim is inside the function. I wouldn't want my print in a cleanup not to happen due to async exception, as it's supposed to be guaranteed to happen. So you would want the bracket to add it even if you had a weird uninterruptible mask inside.

@Yuras
Copy link

Yuras commented Jul 2, 2016

Independently you need the uninterruptible masking, which you claim is inside the function.

I don't understand what you mean here. I didn't say anything about uninterruptibleMask_ here at all.

Let me summarize.

  1. uninterruptibleMask_ around cleanup action in bracket is necessary for correctness, but only when the body failed. Otherwise you simply can't write correct code right now (assuming interruptible cleanups are allowed) because there is no correct way to handle async exceptions from body and cleanup at the same time.
  2. Otherwise uniterruptibleMask_ is not necessary for correctness (i.e. you can write correct code if you put it were necessary by hands.) It could be necessary because people are too careless/uninformed/stupid to write correct cleanups, but this argument is too opinionated and there is no way to prove or disprove it. All arguments are provided, so let @snoyberg make a decision.

@Peaker
Copy link

Peaker commented Jul 2, 2016

  1. I don't understand you either.
    Interruptible cleanups is (virtually?) always a bad idea. I have not yet seen an example where that is desirable. So there should be no issue of handling async exceptions in cleanups at all.
  2. Of course it is necessary for correctness. Up in your finally doPostToken example, you relied on uninterruptibleMask existing inside doPostToken or at the bracket. But something must absolutely use uninterruptibleMask (in addition to general exception primitives to handle exceptions safely) for the cleanup to guarantee the release of the resources.

@Yuras
Copy link

Yuras commented Jul 2, 2016

You are arguing with someone else it seems.

  1. I explicitly wrote "assuming interruptible cleanups are allowed". If you disallow them, then uninterruptibleMask_ in bracket is not necessary for correctness. Again, for correctness. I'm not saying anything about other reasons.
  2. I'm talking about uninterruplibleMask_ in bracket. doPostToken may or may not need uninterruptibleMask_ internally, I don't care. We are talking about bracket here.

But something must absolutely use uninterruptibleMask (in addition to general exception primitives to handle exceptions safely) for the cleanup to guarantee the release of the resources.

It is wrong. withMVar is an example of a correct usage of cleanup action without uninterruptibleMask_.

@Peaker
Copy link

Peaker commented Jul 2, 2016

If you disallow them you do it via uninterruptible mask.

withMVar needs uninterruptible mask to be implemented correctly, so I'm not sure what you mean.

@Yuras
Copy link

Yuras commented Jul 2, 2016

If you disallow them you do it via uninterruptible mask.

I don't care. If you allow cleanup to be interrupted, then you need uninterruptibleMask_ in bracket for correctness. Otherwise you don't. We are talking about bracket here.

withMVar needs uninterruptible mask to be implemented correctly, so I'm not sure what you mean.

You know that I know that you know what I mean. withMVar uses putMVar as a cleanup, and it is uninterruptible already because the MVar is empty (ignoring that someone can putMVar in the middle, but it is not the case for most use cases. You are going to attack this corner case, but we discussed it already multiple times, no need to repeat. This corner case is documented.)

@Peaker
Copy link

Peaker commented Jul 2, 2016

You're implying malintent, but it's not there, I really don't understand
your line of reasoning here and don't remember every detail from previous
arguments.

I'm quite lost about the first point, it's like we speak different
languages.

For the latter point, I disagree that:
A) it's not necessarily a corner case in all scenarios
B) relying on the inherent uninterruptibility of actions is unnecessary
brittleness, so even without the full mvar case you want uninterruptible
mask too.

On Sat, Jul 2, 2016, 16:42 Yuras [email protected] wrote:

If you disallow them you do it via uninterruptible mask.

I don't care. If you allow cleanup to be interrupted, then you need
uninterruptibleMask_ in bracket for correctness. Otherwise you don't. We
are talking about bracket here.

withMVar needs uninterruptible mask to be implemented correctly, so I'm
not sure what you mean.

You know that I know that you know what I mean. withMVar uses putMVar as
a cleanup, and it is uninterruptible already because the MVar is empty
(ignoring that someone can putMVar in the middle, but it is not the case
for most use cases. You are going to attack this corner case, but we
discussed it already multiple times, no need to repeat. This corner case is
documented.)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#3 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAC_A4wXetes2B-xefjiQKGk8g9BbLcaks5qRmrEgaJpZM4I8KmA
.

@snoyberg
Copy link
Member Author

snoyberg commented Jul 3, 2016

Despite my relative quiet, I have been following along, and the discussion has helped me understand things quite a bit better. Let me take a stab at clarifying points again:

  • We have three different types of exceptions:
    • Fully synchronous exceptions, which are generated as a result of an action failing somehow, e.g., throwIO
    • Fully asynchronous exceptions, which are generated when another thread or the runtime system is trying to kill the current thread (via throwTo) or report an unrecoverable[1] situation like a StackOverflow
    • Interrupts, which fall into an in-between category. They are like synchronous exceptions, in that they can only be thrown as the result of an interruptible IO action occurring (as opposed to fully asynchronous exceptions, which can occur anywhere, including pure code evaluation). But they are like asynchronous exceptions too, in that they are generated by something outside of the current thread[2].

So to once again see where we all agree: there is no way to have an cleanup function run reliably if fully async exceptions can be thrown to it, since there is no way for the cleanup function to guarantee it will perform the cleanup before such an async exception is thrown. At the opposite end, with fully synchronous exceptions, there is nothing that bracket can do to help: it is fully the responsibility of the cleanup function itself to ensure that it behaves correctly in the presence of a self-generated synchronous exception. (If there is disagreement up until this point, please say so!)

So we're once again left with the interrupt case falling in the middle. I think we all agree that it's technically possible for a cleanup function to be written correctly with interrupts turn on (i.e., using mask and not uninterruptibleMask). The minimal proof for this is that the cleanup function could always be wrapped itself in uninterruptibleMask_, which gives no weaker guarantees than if bracket used it[3]. But if we're going to require all cleanup functions to just wrap themselves in uninterruptibleMask_ anyway, why not just have bracket do it!

I'm going to take a leap for a moment, and say that the real crux of the matter is a slightly orthogonal question: do we need to assume that all IO action may throw an exception at all times? I believe that @Peaker is arguing that, given the correct masking state and upon careful analysis of the actions being used, we can guarantee that some IO actions will not fail, and therefore it is safe in a cleanup function to simply chain such actions together (e.g., foo >> bar). @Yuras seems to be arguing that we should assume that all IO actions may throw an exception (whether synchronous or an interrupt), and therefore an expression like foo >> bar should never be trusted to guarantee bar will be run. If such a guarantee is needed, then it should always be written as foo finally bar.

I won't weigh in with my opinion just yet, I'd really like to make sure I'm understanding the nuances of this discussion before forming an opinion.

[1] I'm not sure "unrecoverable" is strictly true here to be honest
[2] Though, in the case of deadlock detection, you could argue that misusing an MVar or TVar "caused" the interrupt to get thrown.
[3] This is different from normal masking which is necessary, since in the case of interrupts the cleanup function can set up masking before it performs an interruptible IO action.

@Peaker
Copy link

Peaker commented Jul 4, 2016

I am actually all for using foo finally bar where foo may or may not fail -- it makes the code more resilient, so why not?

I just claim that while foo finally bar may be enough for async-safety, it is not enough for cleanup-context-safety. You also are going to need an uninterruptibleMask, and that this mask only makes sense due to the use of the IO action in bracket, and not due to the intrinsic nature of the IO action.

I think @Yuras is claiming that you have to give thought to exceptions anyway, so it isn't beneficial to remove just one source of exceptions from cleanup handlers (where each and every exception in cleanup context is a bug). I am claiming that reducing exception sources where each exception is a bug helps the situation, even if the result still may throw some exceptions (and obviously so when there are no other exception sources).

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

I'm glad you clarified, I missed this point. To state it slightly differently: we (almost) never want to allow an IO action to be interrupted in a cleanup handler, since that will break the guarantees we're looking for (namely, that cleanup actually happens). It's true that some actions in a cleanup handler may not be interruptible in the first place, but for those actions, using uninterruptibleMask has no impact anyway, so we may as well use it there to. Therefore, there's no case for which uninterruptibleMask does something that doesn't need to be done anyway for a resilient cleanup function, so we should include it in bracket. This doesn't address @Yuras's concern about synchronous exceptions and the need for foo finally bar, as that's orthogonal.

There is the possibility that there is some cleanup function out there which does not need to be interrupt-safe, may block, and we'd prefer that it be interruptible to avoid a long delay in execution. One example of this I've heard mentioned is sending a final "goodbye" message over a TCP connection. But in such a case, it could be argued that this isn't a true cleanup function, and for such a corner case we could provide a separate helper combinator (like finallyInterruptible).

Hope I'm still making sense at this point and not simply rambling :)

@Peaker
Copy link

Peaker commented Jul 4, 2016

Completely agree with this last comment.

On Mon, Jul 4, 2016, 11:39 Michael Snoyman [email protected] wrote:

I'm glad you clarified, I missed this point. To state it slightly
differently: we (almost) never want to allow an IO action to be
interrupted in a cleanup handler, since that will break the guarantees
we're looking for (namely, that cleanup actually happens). It's true that
some actions in a cleanup handler may not be interruptible in the first
place, but for those actions, using uninterruptibleMask has no impact
anyway, so we may as well use it there to. Therefore, there's no case for
which uninterruptibleMask does something that doesn't need to be done
anyway for a resilient cleanup function, so we should include it in
bracket. This doesn't address @Yuras https://github.com/Yuras's
concern about synchronous exceptions and the need for foo finally bar,
as that's orthogonal.

There is the possibility that there is some cleanup function out there
which does not need to be interrupt-safe, may block, and we'd prefer that
it be interruptible to avoid a long delay in execution. One example of this
I've heard mentioned is sending a final "goodbye" message over a TCP
connection. But in such a case, it could be argued that this isn't a true
cleanup function, and for such a corner case we could provide a separate
helper combinator (like finallyInterruptible).

Hope I'm still making sense at this point and not simply rambling :)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#3 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAC_A5BfOhkvnVw5EHV9-zo9FY3W2YYGks5qSMawgaJpZM4I8KmA
.

@Yuras
Copy link

Yuras commented Jul 4, 2016

we (almost) never want to allow an IO action to be interrupted in a cleanup handler, since that will break the guarantees we're looking for (namely, that cleanup actually happens).

I think it is incorrect, you may disallow interruptible cleanups if you want, but it is not necessary to ensure cleanup happens. It is enough to require cleanup to release resources in case of any failure (including async exception.) It is the approach the standard library tries to follow right now, e.g. hClose is interruptible (because it blocks on hFlush) yet it closes the file descriptor even if it is interrupted. It makes cleanups modular (so you can combine them) without uninterruptibleMask_ everywhere.

For example, closing two Handles in a cleanup is as easy as

hClose h1 `finally` hClose h2

We don't need uninterruptibleMask_ here, and we don't need to handle async exceptions separately. (Yes, hClose could leak when the Handle is concurrently accessed from different threads because it could block on takeMVar, but it is just a bug, takeMVar here should be surrounded by uninterruptibleMask_)

Let me state again that is it very important to be explicit what "correct cleanup" means, see the first comment of mine. It is important to make sure cleanups are modular -- you can combine simple cleanups into complex ones preserving correctness. In a nutshell, if your cleanups leak in case of async exception, then you need uninterruptibleMask_ everywhere, otherwise you don't.

@Peaker
Copy link

Peaker commented Jul 4, 2016

e.g. hClose is interruptible (because it blocks on hFlush) yet it closes the file descriptor even if it is interrupted

This is actually bad behavior:

A) The hFlush may well be critical to the user of hClose. (If we're willing to forego hFlush, why do we waste time on it anyway?)

B) hClose may be used outside the context of a cleanup. In that case, we ideally want an exception result to avoid visible side effects. i.e: I expect hClose throwing exceptions (async or not) to not close the file.

We don't need uninterruptibleMask_ here,

We do, unless we don't care about the hFlush, in which case hClose shouldn't call it in the first place, and then uninterruptibleMask is not damaging.

It is important to make sure cleanups are modular

They are modular, combine them via (>>) or finally as sensible. Where the uninterruptibleMask is the responsibility of the bracket.

then you need uninterruptibleMask_ everywhere, otherwise you don't

As I said above, hClose should "leak" when it throws an exception outside bracket context. We all prefer functions to not perform their side-effects when they fail.

Consider this bracket:

bracket closeTheFile reopenTheFile

If the hFlush fails during the acquire/close phase, I expect the file to remain open.

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

I think it's fair to say that the expected behavior for withFile is that the buffer will be flushed before closing. It seems like there's a legitimate argument to be made that there are two valid semantics for this:

  1. When closing the file handle, block until the buffer has flushed and then close it. Do not allow the flushing to be interrupted. Closing the file handle should not be interruptible.
  2. Allow the flushing to be interrupted, but regardless close the file descriptor.

In any event, if the flushing fails due to a synchronous exception, the file descriptor needs to be closed.

Given that hClose is doing something very special (including an arguably non-cleanup action together with a cleanup action), we probably don't want to base our general decision about bracket based on this special case. It's clear that with the Control.Exception.bracket semantics, we can get either of the above desired semantics for hClose (ignoring the fact that hClose currently has an MVar-related bug in it right now). With @Peaker's uninterruptible semantics, semantics (1) above would be the automatic behavior, and (2) would still be expressible, e.g.:

withFile fp mode inner = bracket
    (openFile fp mode)
    hCloseNoFlush
    (\h -> inner h <* hFlush h) -- or `finally` hFlush h

@Yuras
Copy link

Yuras commented Jul 4, 2016

This is actually bad behavior

Probably yes, probably no. How does it changes the fact that "you may disallow interruptible cleanups if you want, but it is not necessary to ensure cleanup happens"?

The hFlush may well be critical to the user of hClose.

It may or may not be critical. If you accept the David Abrahams' exception safety guidelines, then it is not critical, and even undesired.

hClose may be used outside the context of a cleanup

I thought we already agreed that acquire, cleanup and regular actions are not interchangeable.

@Yuras
Copy link

Yuras commented Jul 4, 2016

I think it's fair to say that the expected behavior for withFile is that the buffer will be flushed before closing.

Only if withFile succeeded. If it fails (e.g. with async exception), then we don't care about it's side effects (and it is even desirable to minimize side effects, i.e. don't flush buffers).

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

Only if withFile succeeded. If it fails (e.g. with async exception), then we don't care about it's side effects (and it is even desirable to minimize side effects, i.e. don't flush buffers).

I agree. That seems to argue that the simplistic bracket openFile hClose will never behave correctly.

@Peaker
Copy link

Peaker commented Jul 4, 2016

I thought we already agreed that acquire, cleanup and regular actions are not interchangeable.

In the sense that cleanup actions have to be nothrow (or as close to that as possible), and others may throw freely, yes.

But a nothrow action can be used as an acquire, cleanup or regular action freely (iff bracket does the uninterruptibleMask on the cleanup).

@Yuras
Copy link

Yuras commented Jul 4, 2016

I agree. That seems to argue that the simplistic bracket openFile hClose will never behave correctly.

Right. That is why I care about exception handling policy, not about uninterruptibleMask_ per se. Without the policy you can't say "this code is correct" or "this code is wrong because...", people understand exception handling differently. It is easy to add uninterruptibleMask_, but it will be impossible to remove if we'll find better solution.

If CLC agrees that cleanups should never be interruptible, then I don't care about uninterruptibleMask_ in bracket. But it is irresponsible to add uninterruptibleMask_ without considering all consequences and solving the issue as whole.

(Though bracket openFile hCloseNoFlush (\h -> action h <* hFlush h) will be correct)

But a nothrow action can be used as an acquire, cleanup or regular action freely

Except that you can't make hClose nothrow.

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

So maybe we're not all that far apart from each other on thoughts here. I think of this package as a testing ground for changes that should be moved upstream to base: ideally the sync vs async distinction these functions make would eventually be respected in Control.Exception as well. So by that, I don't believe we should wait for base to change, but should instead establish what base should be. That should include writing up a clear exception handling policy, as you describe.

I'm currently leaning towards the belief that we have the right policy here now, that cleanups should be uninterruptible, and that withFile and hClose are both simply broken. I'd love to see us move ahead with a collaborative document explaining the exception handling policy, and ultimately present it to the CLC as a proposal for what should be part of base itself at some point in the future.

@Yuras
Copy link

Yuras commented Jul 4, 2016

I'm currently leaning towards the belief that we have the right policy here now, that cleanups should be uninterruptible, and that withFile and hClose are both simply broken.

I agree. In fact it is exactly what I proposed in the very first comment here. Though backward compatibility is a big issue here.

I'd love to see us move ahead with a collaborative document explaining the exception handling policy

I'm going to prepare a draft as I see it (will probably take time.)

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

That sounds great, thank you! And I'm glad we had this discussion here, at the very least it helped me understand the issue much better, and hopefully will be a good source for clarifying a document and perhaps educating others.

@ezyang
Copy link

ezyang commented Jul 4, 2016

@Yuras, @ElastiLotem , @Peaker, are any of you coming to HiW? If so, you should submit a talk about asynchronous exceptions to help air this out with the rest of the community. (@snoyberg I'd also include you, but IIRC HiW this year is on your sabbath :( )

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

Unfortunately I'm not going to be able to make it to ICFP at all this year it seems :(. But I definitely endorse this idea, a talk on the subject would be great! I'd be happy to review any notes if it would be helpful to anyone.

@Yuras
Copy link

Yuras commented Jul 4, 2016

@snoyberg I prepared the draft. It doesn't cover everything, but I'd like to synchronize different POVs.

@ezyang Sorry, I don't know what HiW is (it probably means I'm not going to participate :) ). I agree that we need more resources on (async) exceptions. But I think I'm wrong person to give a talk -- I have no experience in that, and my English is pretty limited (especially spoken).

@ezyang
Copy link

ezyang commented Jul 4, 2016

It's the Haskell Implementor's Workshop (https://wiki.haskell.org/HaskellImplementorsWorkshop). This year it is in Nara, which might be a bit difficult to get to, depending on where you are. But you should at least take a look!

@snoyberg
Copy link
Member Author

snoyberg commented Jul 4, 2016

@Yuras I'll review it now. Could I make a recommendation on review method though? If you open up a pull request against this repo adding a new .md file with this content, it will be really easy for people to provide inline feedback, or to even send you PRs against your branch. I'm also happy to give you write access to this repo if you'd like to do the draft on a branch on the main repo.

@Yuras
Copy link

Yuras commented Jul 4, 2016

@snoyberg Sure, see #12

@snoyberg
Copy link
Member Author

Since #12 was closed, I've started a new tutorial to cover this, with a PR started at: haskell-lang/haskell-lang#62

@snoyberg
Copy link
Member Author

That tutorial has been merged in and is now live. We seem to have come to a conclusion on best practices for now, so closing.

@chessai
Copy link

chessai commented Oct 28, 2020

Has anyone opened up a GHC ticket about this?

This conversation is great

@snoyberg
Copy link
Member Author

I've created an issue here: https://gitlab.haskell.org/ghc/ghc/-/issues/18899

Thank you for the suggestion @chessai

@avieth
Copy link

avieth commented May 27, 2022

Curious to hear examples of what people are doing in their bracket release continuation that requires an uninterruptible mask. My intuition is that cases in which an uninterruptible mask of a release action is actually required and beneficial are incredibly rare (but I also lack imagination, so please prove me wrong!).

I skimmed this thread and saw hClose came up as an example, arguing that bracket open close on some file should guarantee it's flushed. But does that really buy us much? If your program depends upon the integrity of some file on disk, then a guarantee that it gets flushed doesn't solve your problem. You still have to deal with a power failure, disk corruption, layers and layers of buffering between program and hardware (OS returning from an fsync does not guarantee it's written). So the uninterruptible mask hasn't solved your problem at all. But it has exposed you (and all users of this library) to more deadlock hazards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants