Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CE3: ExitCase.Completed #643

Closed
alexandru opened this issue Sep 18, 2019 · 13 comments
Closed

CE3: ExitCase.Completed #643

alexandru opened this issue Sep 18, 2019 · 13 comments

Comments

@alexandru
Copy link
Member

alexandru commented Sep 18, 2019

The ExitCase in the CE 3 proposal is this:

sealed trait ExitCase[+E, +A]

object ExitCase {
  final case class Completed[+A](a: A) extends ExitCase[Nothing, A]
  final case class Errored[+E](e: E) extends ExitCase[E, Nothing]
  case object Canceled extends ExitCase[Nothing, Nothing]
}

And I worry specifically about this case:

final case class Completed[+A](a: A) extends ExitCase[Nothing, A]

The reason for why the current Completed does not have a: A in it is for it to be able to be used:

  1. for data types that don't "terminate" with exactly one result, since there's absolutely no reason to have that restriction — you've got flatMap if you want to trigger stuff in case of normal termination, you don't need bracket to act as a flatMap as well
  2. even for data types that can complete with exactly one result (e.g. EitherT[IO, E, *], the Completed case can be used to signal something other than successful completion ... if Error signals a Throwable, then Completed does NOT necessarily signal success, it could simply be the result of an E error that happened (in the context of EitherT) ... which is why we can implement Sync[EitherT[F, E, *]] for any E

This was already discussed on the PR for Bracket and I admit it's very noisy so hard to spot — but @LukaJCB should be aware of the problem as well:

#113 (comment)

The CE 3 proposal says this:

The first two of the ExitCase cases have corresponding constructors which can build effects that produce to those results:

Completed ==> pure
Errored ==> raiseError

This view is not useful and is where we disagree. I don't see Completed as being the counterpart to pure or Errored as the counterpart to raiseError.

I view Completed as being Unit (aka termination) and Error | Canceled as being Nothing. No more, no less.

@alexandru alexandru changed the title CE3: Concerns about the proposed ExitCase CE3: ExitCase.Completed Sep 18, 2019
@SystemFw
Copy link
Contributor

for data types that don't "terminate" with exactly one result

I used to have this concern as well, but @djspiewak reckons it's not an issue. You have F[ExitCase[Throwable, A]], so for streaming types you will have an Observable[ExitCase[Throwable, A]], with more than one Completed.

you've got flatMap if you want to trigger stuff in case of normal termination, you don't need bracket to act as a flatMap as well

Conversely, if ExitCase has A, then continual can get a much simpler definition in terms of guarantee. The question there is the potential overlap with the cancelable/uncancelable proposal, but that's true for bracket in general (if you have cancelable/uncancelable bracket is not really primitive anymore)

even for data types that can complete with exactly one result (e.g. EitherT[IO, E, *], the Completed case can be used to signal something other than successful completion ... if Error signals a Throwable, then Completed does NOT necessarily signal success, it could simply be the result of an E error that happened (in the context of EitherT) ... which is why we can implement Sync[EitherT[F, E, *]] for any E

This is interesting, I'll have a look at the code for that.

All in all, Completed[A] is a nice to have but personally I'm not wedded to it

@SystemFw
Copy link
Contributor

I've reread the Bracket issue, and according to this #113 (comment) the A in Completed is not a problem for EitherT, it's merely unnecessary. It is (potentially) a problem for streaming types, but at that point the discussion moves into the Region/Bracket territory

@alexandru
Copy link
Member Author

@SystemFw

I've reread the Bracket issue, and according to this #113 (comment) the A in Completed is not a problem for EitherT, it's merely unnecessary. It is (potentially) a problem for streaming types, but at that point the discussion moves into the Region/Bracket territory

It's very much necessary for both EitherT and OptionT, but granted that thread is hard to follow.

The code right now is a little convoluted due to another issue, but I think you can see it:

This is because we are implementing Sync[EitherT[F, L, ?]] with an F: Sync[F] restriction, which means a MonadError[F, Throwable] restriction. This means that:

  1. the Error case is going to be Error(e: Throwable), so in case of Left(error) termination, we can't use it
  2. the A type is still A and not Either[L, A]

This means that in case of a Left(error) result from that EitherT, or a None result from that OptionT, there's nowhere to signal it to.

Having a Completed without an a: A allows for flexibility in what data types we implement, and I've never seen an instance in which this was an insurmountable problem, but I could be wrong.

@alexandru
Copy link
Member Author

@SystemFw Also for streams:

You have F[ExitCase[Throwable, A]], so for streaming types you will have an Observable[ExitCase[Throwable, A]], with more than one Completed.

This is where you might be right, the details are a little fuzzy, but if we implement bracket for streams then indeed each element of the source will be a resource by itself and generate an ExitCase of its own.

@alexandru
Copy link
Member Author

Btw I think OptionT and EitherT qualify as data types that don't terminate in one result. You can view them as data types that on evaluation terminate in zero results. Like how Option can be viewed as a collection of either zero or one elements.

@SystemFw
Copy link
Contributor

I see the problem for OptionT and EitherT. The solution for this problem in Haskell is pretty complex https://lexi-lambda.github.io/blog/2019/09/07/demystifying-monadbasecontrol/

Anyway, my main point for wanting that Completed(A) was implementing continual in library code. So if we manage to address things in other ways, like cancelable/uncancelable or a native continual (which is still a good idea for efficiency), I don't have a strong reason to change Completed, especially not at the expense of sacrificing OptionT/EitherT

@alexandru
Copy link
Member Author

Can you explain what the issue with continual is?

Have you written about it somewhere?

Of course I would be interested in continual. 🙂

@SystemFw
Copy link
Contributor

SystemFw commented Sep 18, 2019

So, this is the polymorphic continual I arrived at (with some help from Oleg)

 def continual[F[_], A, B](fa: F[A])(f: Either[Throwable, A] => F[B])
    (implicit F: Concurrent[F]): F[B] = {
    import cats.effect.implicits._
    import scala.util.control.NoStackTrace

    Deferred.uncancelable[F, Either[Throwable, B]].flatMap { r =>
      fa.start.bracket( fiber =>
        fiber.join.guaranteeCase {
          case ExitCase.Completed | ExitCase.Error(_) =>
            fiber.join.attempt.flatMap(f).attempt.flatMap(r.complete)
          case _ => fiber.cancel >>
            r.complete(Left(new Exception("Continual fiber cancelled") with NoStackTrace))
        }.attempt
      )(_ => r.get.void)
      .flatMap(_ => r.get.rethrow)
    }
  }

As you can see, it's short but pretty complex.
Now suppose you are starting from scratch, and want to implement continual in library code:

  • you cannot use flatMap, because it's interruptible
  • you cannot use bracket, because acquire is uncancelable

it seems like you could use guaranteeCase and some minor work, after all it looks close to uncancelable flatMap (aka continual):

 def guaranteeCase[A](fa: F[A])(finalizer: ExitCase[E] => F[Unit]): F[A]

But then you notice that ExitCase.Completed does not have an A in it, and so you can't.
So at the time the rationale for Completed(a: A) was to enable this in a much easier and performant way than what I ended up having to do.

But as I said, this is not a blocker: we could implement continual directly in the internals, or, alternatively, the whole problem sort of goes away with the cancelable/uncancelable proposal

@djspiewak
Copy link
Member

I see the problem you’re getting at, Alex. OptionT is a great example. Need to think about this more…

@djspiewak
Copy link
Member

djspiewak commented Sep 22, 2019

The first two of the ExitCase cases have corresponding constructors which can build effects that produce to those results:

Completed ==> pure
Errored ==> raiseError

This view is not useful and is where we disagree. I don't see Completed as being the counterpart to pure or Errored as the counterpart to raiseError.

The view in question comes from a somewhat more abstract look at things. Maybe the viewpoint isn't useful, but just to explain the thought process a bit better…

Cats Effect's abstractions form a calculus which defines the basis for functional effectful runtimes in Scala. Every calculus is predicated on a series of axioms, which are nothing more than statements we simply assert to be true. All functors define a map operation which, for some F[A] and A => B, produces a F[B]. Why? Because we say they do. (and no, category theory doesn't make this any less arbitrary… I can make the same statement about endofunctors and arrows) All effects must be capable of sequencing callback-driven effects of type Either[Throwable, A] into their F[A]. Why? Because we say they do.

All of this is arbitrary and, on some fundamental level, meaningless. Cue Plato in 3...2...1...

Anyway, bespoke axioms are a necessary fact of useful philosophical (and by extension, mathematical) systems, because you can't derive your own axioms in a consistent system (see: Gödel). However, they're also profoundly annoying because they are so arbitrary. I'm always a lot more comfortable when I conclude something to be factual if I built that conclusion upon logical inferences derived from other facts. I'm much less confident if I say something is factual "because I say it is". All Functors define map "because I say they should" is a deeply uncomfortable statement, even though it's a very useful one.

Linking this back to Cats Effect… One of my goals with CE3 was to reduce the axiomatic space in the calculus of effects. The more functionality we have which is the way it is "because we say it is", the less confident we can be that we actually made something meaningful. Minimizing the calculus to a smaller set of axioms which compose to produce a system consistent with what we want to express is a major goal of nearly all of mathematics. (we can quibble about why minimization is such a major goal, but… it is, and all of modern systematic thought is built upon it)

Concretely in Scala, this means trying to have more and more of Cats Effect's functionality expressed in terms of fewer and fewer abstract functions on the typeclasses. It means being able to derive more and more laws (e.g. the ones relating bracket and uncancelable), rather than having to come up with them semi-arbitrarily in response to possible edge cases.

Of course, there is a back and forth here. We can't just design in a vacuum sealed at the top of our Ivory Tower. Axioms which cannot be implemented efficiently by concrete datatypes are bad axioms, and their existence would need to be very seriously justified to merit inclusion in the calculus. Axioms which indirectly preclude a swathe of desirable implementations or use-cases are also bad. Conversely, axioms which indirectly force efficient or safe implementations, or which enable desirable use-cases, are very good axioms. A particularly strong axiom is one which applies its implications to numerous orthogonal desirable attributes.

I feel like I'm rambling… The point being that there's some give and take here. The core calculus of Cats Effect cannot be designed without regards to how it will be implemented and how it will be used, but conversely we also should not make changes to that calculus solely based on how implementations or use-cases generally work today. Instead, we should try to take those implementations and use-cases and work backwards to the first principles which give rise to them, and from that kernel of truth see if we can derive a clean axiomatic framework into which it fits. The contrapositive of this is taking undesirable implementations or use-cases that we see today, and work backwards to the first principles which allow us to form a clean axiomatic framework which precludes them.

An example of the latter would be the very common mistake that almost everyone has made at least once with async, where they forget to shift back to their origin thread pool after handling the callback. Another problem in this vein is APIs which take a ContextShift and an ExecutionContext, because they need the EC to interact with legacy APIs (nothing else will suffice), but they also need ContextShift because that's the only way (without writing your own async) of thread shifting. I consider both of these outcomes to be undesirable, so working back to the first principles, we see that introducing a stronger evalOn and associated laws around async resolves the first one. However, a stronger evalOn is isomorphic to Kleisli#local, which in turn implies that we can express this particular axiom (governing async affinity) in terms of another, broader axiom in the FP space: specifically, MonadReader. Now if we're talking about evalOn as local and the effect itself as forming a Reader of an ExecutionContext, then that implies there must be an ask. And that ask neatly resolves the second problem I referenced, which is the fact that APIs often take both a ContextShift and an ExecutionContext. If we provide a way of surfacing the ExecutionContext that we already know must be accessible within the effect (because evalOn and async imply the Reader), then we can solve two problems with a single axiom, and we don't even need to invent that axiom, it's something that already exists and has sane laws around it. Hence, the executionContext: F[ExecutionContext] function.

This is the kind of thing I'm talking about. Now, not every use-case in the CE3 API is quite this clean, but hopefully the thought process and design intent is clear. Or at least, as clear as I can make it. Sorry for the vast prose.


So how does all of this relate to ExitCase? Well, the nice thing about ExitCase in its current ce3 form is it is a single axiom which represents three orthogonal things simultaneously:

  • Signalling to the bracket release (avoiding the need to use a Ref to thread outcomes)
  • Signalling to the join on a Fiber (avoiding undefined behavior in the event of cancel)
  • Forming the core algebra of a minimal Concurrent/Temporal implementation

If you ignore Fiber#cancel and suspension of side-effects, Free[ExitCase[E, ?], A] is effectively what an effect is. This is very compelling to me, because it means that we can talk about what effects are solely in terms of three algebraic cases: Completed, Errored, and Canceled. Of course, Fiber#cancel messes this up a little bit, which bothers me (and I'm still thinking about ways around it), as do the time functions, but the picture as a whole is very compelling. It's compelling because it's simple, and it's simple because it's expressing a large set of things in terms of a small set of axioms.

Unfortunately, you're completely right that OptionT (and analogous) screw up using this as a bracket signalling mechanism. Another way of thinking about this: ExitCase (with the A in Completed) represents the algebra of an Effect, but it doesn't necessarily represent the algebra of an arbitrary Concurrent, or even a Bracket. So that makes me wonder if we need something a bit less constrained.

Technically, Bracket already has the machinery to get around this: the Case constructor is abstract, and can be anything which forms a MonadError (and a Traverse, but I think we don't need that second constraint). Now, Concurrent needs a stronger guarantee, which is that Case can embed the Canceled signal, but it doesn't need to be ExitCase itself! We could resolve this by forcing an InjectK[ExitCase[E, ?], Case] constraint in Concurrent. I don't know if this is particularly elegant, but it's one answer.

As a note, these sorts of type level shenanigans tend to be very annoying for end-users to deal with, due to the vastly more complicated types and awkward syntax, so I don't think we should go down this road without significant justification. It also (analogously) makes the type of Concurrent itself much, much more complex, and more complex types are harder to reason about in general. I'd rather have a simpler framework if we can. So, really this is just a thought, but it does illustrate one way to solve the problem without taking the A away from Completed.

@djspiewak
Copy link
Member

djspiewak commented Dec 22, 2019

I think I've solved this, now that I had time to loop back to things. The new ExitCase algebra:

sealed trait ExitCase[+F[_], +E, +A] extends Product with Serializable

object ExitCase {
  final case class Completed[F[_], A](fa: F[A]) extends ExitCase[F, Nothing, A]
  final case class Errored[E](e: E) extends ExitCase[Nothing, E, Nothing]
  case object Canceled extends ExitCase[Nothing, Nothing, Nothing]
}

I don't like complicating the kind even further, but I think it carries its weight. This should solve the problem with both Fiber#join and bracketCase, since now the result is wrapped within F. Note that we haven't lost (much) parametricity on the eagerness implications, since Errored and Canceled are still outside the effect scope. So the signature of Fiber#join, for example:

def join: F[ExitCase[F, E, A]]

Which implies that if you want the current CE2 semantics, you can do the following:

f.join.flatMap(_.fold(never, raiseError(_), fa => fa))

Laws will guarantee that implementors must represent errors and cancelation within use as the various cases, rather than simply forwarding them along within the F[A]. In fact, in most cases, I would expect implementors would imply use pure[F] to constructor their F[A].

All in all, this is more general than the current state in master and doesn't require users to manually manage closed Refs and such in order to propagate the result into the finalizer.

@alexandru
Copy link
Member Author

Ah, interesting.

Can it work for EitherT[F, ?]? What about streams?

If it does, then it's not bad.

@djspiewak
Copy link
Member

djspiewak commented Dec 23, 2019

It definitely works for EitherT and it should work for streams. It’s a bit bizarre though since you would need to either determine the exit result for the entire stream before running it, or the outer effect would have multiples and you would return multiple cases (as I originally designed for it). I suspect that the laws will force this, if you apply it to a stream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants