-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interruptible or uninterruptible cleanup #3
Comments
I agree that this issue is orthogonal to #2. I'll replicate and extend some of my arguments here. Basically we have two issues:
Re 1. People often claim that it is a bug if cleanup action is interruptible. And I'd probably agree. Unfortunately it was never documented as a requirement. Even worse, If we decide that cleanup actions should not be interruptible, than we should convince CLC to accept it as a policy and document it in If we allow interruptible cleanups, then Re 2. Here we speak about But as I already showed in the original issue, Other common argument is: people are not educated enough to handle async exceptions correctly in cleanups, so lets just disable them at all. But we can't disable them in alloc actions, so we have to educate people anyway or remove async exceptions from the language completely. I don't believe that people will write sync- or async-exception-safe code by incident, without understanding how they work. And In any case, it is less important whether |
I find disabling asynchronous exceptions in cleanup distasteful. Consider all normal code and library routines - they must be async safe and can take unlimited time. Consider exception handlers - should they use the same principles? Or an entirely different set? That convinced me that mask is a better option. |
The cleanup can be completely async safe which just means that its side But for the cleanup context, you do not want the cleanup undone, ever. On Wednesday, 29 June 2016, Neil Mitchell [email protected] wrote:
Sent from Gmail Mobile |
Let me try and put in a summary of what's agreed upon and not agreed upon, then others can correct me:
Have I accurately explained the goals and different options here? |
@snoyberg Here are my comments: "interruptible exceptions" => "interruptible actions". I agree with points 1-2. Points 3-4:. Here are 2 valid and sensible brackets for a semaphore: In both of these, we really want the alloc-action to be interruptible (or unnecessarily lose responsiveness and even hit deadlocks), and the cleanup action to be uninterruptible (or our semaphore invariants get broken!). Neither of these ( So the question isn't whether the responsibility for the Since the vast majority of bracket invocations will want to One final point: The word "async-safe" is overloaded here and I think is causing some confusion. For something to be declared "async-safe", it needs to undo its side-effects successfully (or otherwise guarantee no visible effects were leaked) upon an async exception.
|
My current understanding is that the only way to have a "safe" bracket (with the guarantee that all operations will be rolled back upon completion) is to mask all async exceptions on both the acquire and release. My gut feeling is that writing code that blocks async exceptions means we lose many of the benefits of async exceptions - an openFile might talk to a network mounted NFS drive and take 30s or so in the worst case. However, this library seems to be about making sensible compromises to make the current stuff work nicely, rather than redefining policy. As such, blocking async exceptions might be the right thing to do here, even though it still leaves me feeling icky. |
@ndmitchell Uninterruptible-masking acquire is not required for full safety. The acquire action itself just has to be async-safe, i.e: it will internally roll back as needed upon an async exception. The reason interruptible masking is needed for acquire is that an async exception can happen between the completion of the acquire and the installation of our exception handler for cleanup. Ordinary mask covers that part as it has no interruption points. Uninterruptible masking is only needed on the release side. |
@Peaker I feel I'm out of my depth here, you may well be right, I think I'll just lurk and observe this thread :) |
@Peaker I think you "final note" is just about cleanup actions be different from all other actions. Indeed it is true -- other actions should usually minimize side effects of failure, while cleanups should release resources in any case. But it contradicts with the first part of your comment. If cleanups are different, then you can't use them as acquire actions or vice versa. For example,
This implementations is invalid both as acquire and cleanup action (but could be OK as a regular action). To be used as an acquire action, it should guarantee that the token will not be posted in case of failure:
To be used as a cleanup action, it should guarantee the token will be posted in case of failure:
So, while sometimes an action could be used both as a acquire and cleanup action, it is not the case in general. And you can't convert acquire action into cleanup action using |
I don't think (putStrLn "..." >> acquire >> putStrLn "...") with no masking is ever a valid action (even "regular" action), because it is not exception-safe in any way. You need the release in case of exception to make it exception-safe(sync&async), which you want in acquire actions in an "regular" actions. That reduces us to regular actions vs. cleanup actions. In addition and independently of avoiding sync exceptions, you never want to handle async exceptions in the cleanup context, so Ideally, Haskell would have some phantom type parameter to |
Not necessary. Depending on a context, the same thing may or may not be a "resource". E.g. if you failed to acquire/release connection from/to a pool, then you can just drain the pool itself -- no resource leak. In that case we have "basic" safety level. Other example is
I don't see any difference between sync and async exceptions here. In my
Hmm, we may require cleanups not to throw sync exceptions by swallowing them. But we can't make sure exceptions never happen (except few cases where we can achieve "no-throw" safety level.) There is no way to ensure |
The way you handle them is only good for sync exceptions. If putStrLn throws any exception (Sync or async), then
It is not so few -- you can build a whole world of functionality around But otherwise, I agree that you may often need to swallow exceptions in cleanup handlers. |
No,
Right, you can build a lot of no-throw code. But you can't do it incidentally. It is hard to achieve no-throw. And only after you did the hard work, it makes sense to do the last easy step -- add @snoyberg I'm not sure this discussion is useful for you. We discussed it with @Peaker a number of times already, we are just repeating the same arguments. But I think the most fundamental thing is to decide how correct cleanup should behave: is it allowed to it to be interruptible, may it fail at all, should it release resources in case of sync or async failure. |
Might I suggest than @snoyberg does whatever is done in the base library, and should anyone disagree with that approach, leave it for them to convince the base library to fix itself first, and then have this package follow along. |
This is the crux of the disagreement, I believe. I believe each different circumstance for which the action leaks a resource is a separate bug. For example, the above code with putStrLn may have 2 actual bugs in realstic configurations (where stdout write only blocks but never fails). A) Leaks when stdout was/is closed You claim that fixing B is useless because you've not also fixed A. I claim fixing B is a huge benefit, even if A was not yet fixed. More-over, sync exceptions during cleanup are an obvious concern. Virtually any Haskeller would understand the need to do it. uninterruptibleMask for cleanup by default means: B) even if you're aware of the need to uninterruptibly mask your cleanups -- it saves you from the need of doing it for every single cleanup action in existence C) It brings the program to a less buggy state. A program is not buggy / not-buggy. It can be more or less buggy. |
@ndmitchell In the mailing list, I think besides @Yuras there was a strong consenesus that bracket with uninterruptibleMask in cleanups is far preferable -- with few voicing concern about breaking legacy code that assumes the current broken behavior. Fixing base is hard and has backwards compatibility concerns. A new library copying existing broken semantics makes the inevitable fix all the more painful later. |
Yes.
IMO if the cause is the same, then it is one bug. Otherwise they are separate.
Both of them require the same fix. The fix is specific neither to sync no async exceptions.
It is a huge overestimate.
What is less buggy, this Most of bugs in cleanups are not async-specific IME, so most of bugs about async exceptions indicate a bug with sync exceptions. The "fix" you are proposing (uninterruptibleMask) is similar to changing the number in the define above. |
One requires slapping on How are they the same?
That is a strawman argument. Of course if you can "just fix it", then do. But any fix will involve There is a lot of software that is correct up to events of 2^-160. Is it just as incorrect as any buggy software?
How are they "sync fixes" even related to the async fix which is
No, it is more akin to:
where both A and B are supposed to be true but may be buggy(false). I convert it to: #define true A |
Both require just a
|
The finally is only needed because of the silly prints. |
I don't understand what you mean here. I didn't say anything about Let me summarize.
|
|
You are arguing with someone else it seems.
It is wrong. |
If you disallow them you do it via uninterruptible mask. withMVar needs uninterruptible mask to be implemented correctly, so I'm not sure what you mean. |
I don't care. If you allow cleanup to be interrupted, then you need
You know that I know that you know what I mean. |
You're implying malintent, but it's not there, I really don't understand I'm quite lost about the first point, it's like we speak different For the latter point, I disagree that: On Sat, Jul 2, 2016, 16:42 Yuras [email protected] wrote:
|
Despite my relative quiet, I have been following along, and the discussion has helped me understand things quite a bit better. Let me take a stab at clarifying points again:
So to once again see where we all agree: there is no way to have an cleanup function run reliably if fully async exceptions can be thrown to it, since there is no way for the cleanup function to guarantee it will perform the cleanup before such an async exception is thrown. At the opposite end, with fully synchronous exceptions, there is nothing that So we're once again left with the interrupt case falling in the middle. I think we all agree that it's technically possible for a cleanup function to be written correctly with interrupts turn on (i.e., using I'm going to take a leap for a moment, and say that the real crux of the matter is a slightly orthogonal question: do we need to assume that all I won't weigh in with my opinion just yet, I'd really like to make sure I'm understanding the nuances of this discussion before forming an opinion. [1] I'm not sure "unrecoverable" is strictly true here to be honest |
I am actually all for using I just claim that while I think @Yuras is claiming that you have to give thought to exceptions anyway, so it isn't beneficial to remove just one source of exceptions from cleanup handlers (where each and every exception in cleanup context is a bug). I am claiming that reducing exception sources where each exception is a bug helps the situation, even if the result still may throw some exceptions (and obviously so when there are no other exception sources). |
I'm glad you clarified, I missed this point. To state it slightly differently: we (almost) never want to allow an There is the possibility that there is some cleanup function out there which does not need to be interrupt-safe, may block, and we'd prefer that it be interruptible to avoid a long delay in execution. One example of this I've heard mentioned is sending a final "goodbye" message over a TCP connection. But in such a case, it could be argued that this isn't a true cleanup function, and for such a corner case we could provide a separate helper combinator (like Hope I'm still making sense at this point and not simply rambling :) |
Completely agree with this last comment. On Mon, Jul 4, 2016, 11:39 Michael Snoyman [email protected] wrote:
|
I think it is incorrect, you may disallow interruptible cleanups if you want, but it is not necessary to ensure cleanup happens. It is enough to require cleanup to release resources in case of any failure (including async exception.) It is the approach the standard library tries to follow right now, e.g. For example, closing two hClose h1 `finally` hClose h2 We don't need Let me state again that is it very important to be explicit what "correct cleanup" means, see the first comment of mine. It is important to make sure cleanups are modular -- you can combine simple cleanups into complex ones preserving correctness. In a nutshell, if your cleanups leak in case of async exception, then you need |
This is actually bad behavior: A) The B)
We do, unless we don't care about the
They are modular, combine them via (>>) or
As I said above, Consider this bracket:
If the |
I think it's fair to say that the expected behavior for
In any event, if the flushing fails due to a synchronous exception, the file descriptor needs to be closed. Given that withFile fp mode inner = bracket
(openFile fp mode)
hCloseNoFlush
(\h -> inner h <* hFlush h) -- or `finally` hFlush h |
Probably yes, probably no. How does it changes the fact that "you may disallow interruptible cleanups if you want, but it is not necessary to ensure cleanup happens"?
It may or may not be critical. If you accept the David Abrahams' exception safety guidelines, then it is not critical, and even undesired.
I thought we already agreed that acquire, cleanup and regular actions are not interchangeable. |
Only if |
I agree. That seems to argue that the simplistic |
In the sense that cleanup actions have to be But a |
Right. That is why I care about exception handling policy, not about If CLC agrees that cleanups should never be interruptible, then I don't care about (Though
Except that you can't make |
So maybe we're not all that far apart from each other on thoughts here. I think of this package as a testing ground for changes that should be moved upstream to I'm currently leaning towards the belief that we have the right policy here now, that cleanups should be uninterruptible, and that |
I agree. In fact it is exactly what I proposed in the very first comment here. Though backward compatibility is a big issue here.
I'm going to prepare a draft as I see it (will probably take time.) |
That sounds great, thank you! And I'm glad we had this discussion here, at the very least it helped me understand the issue much better, and hopefully will be a good source for clarifying a document and perhaps educating others. |
@Yuras, @ElastiLotem , @Peaker, are any of you coming to HiW? If so, you should submit a talk about asynchronous exceptions to help air this out with the rest of the community. (@snoyberg I'd also include you, but IIRC HiW this year is on your sabbath :( ) |
Unfortunately I'm not going to be able to make it to ICFP at all this year it seems :(. But I definitely endorse this idea, a talk on the subject would be great! I'd be happy to review any notes if it would be helpful to anyone. |
@snoyberg I prepared the draft. It doesn't cover everything, but I'd like to synchronize different POVs. @ezyang Sorry, I don't know what HiW is (it probably means I'm not going to participate :) ). I agree that we need more resources on (async) exceptions. But I think I'm wrong person to give a talk -- I have no experience in that, and my English is pretty limited (especially spoken). |
It's the Haskell Implementor's Workshop (https://wiki.haskell.org/HaskellImplementorsWorkshop). This year it is in Nara, which might be a bit difficult to get to, depending on where you are. But you should at least take a look! |
@Yuras I'll review it now. Could I make a recommendation on review method though? If you open up a pull request against this repo adding a new .md file with this content, it will be really easy for people to provide inline feedback, or to even send you PRs against your branch. I'm also happy to give you write access to this repo if you'd like to do the draft on a branch on the main repo. |
Since #12 was closed, I've started a new tutorial to cover this, with a PR started at: haskell-lang/haskell-lang#62 |
That tutorial has been merged in and is now live. We seem to have come to a conclusion on best practices for now, so closing. |
Has anyone opened up a GHC ticket about this? This conversation is great |
I've created an issue here: https://gitlab.haskell.org/ghc/ghc/-/issues/18899 Thank you for the suggestion @chessai |
Curious to hear examples of what people are doing in their bracket release continuation that requires an uninterruptible mask. My intuition is that cases in which an uninterruptible mask of a release action is actually required and beneficial are incredibly rare (but I also lack imagination, so please prove me wrong!). I skimmed this thread and saw |
Let's have a dedicated issue for this discussion. Pinging @Peaker and @Yuras
The text was updated successfully, but these errors were encountered: