This issue was moved to a discussion.

Coroutines #561

Closed
nnsgmsone opened this issue Jun 25, 2019 · 30 comments
Labels
Feature/Enhancement Request This issue is made to request a feature or an enhancement to an existing one.

Comments

@nnsgmsone

I found that V's coroutines directly call pthread_create. I think it is possible to add user-level coroutines, so that more energy can be injected. Perhaps V could learn from Go's approach and add a runtime...

@nnsgmsone added the Feature/Enhancement Request label Jun 25, 2019
@spy16

spy16 commented Jun 25, 2019

Implementing coroutines (similar to Go approach) has always been the plan and is in the roadmap. See https://vlang.io/docs#concurrency

@nnsgmsone
Author

ok

@spytheman
Member

@nnsgmsone what does 'more energy can be injected' mean in this context?

@nnsgmsone
Author

@spytheman make the routine more useful

@joe-conigliaro
Member

As far as I know this was the intention all along; they were implemented that way to begin with in order to have something working.

@gslicer

gslicer commented Sep 9, 2019

I think the use of coroutines is very limited in contrast to what threads can offer (as long as they are implemented e.g. with the "actor" paradigm, without any semaphores/locks)... so threads should still be supported.

See this statement:

Why create threads when there are coroutines?

Coroutine methods can be executed piece by piece over time, but all processes are still done by a single main Thread. If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

Threads are different. The execution of separate Threads is managed by the operating system. If you have more than one logical CPU, many threads are executed on different CPUs. Thanks to that, any expensive operation will not freeze your application.

@dumblob
Contributor

dumblob commented Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc. I think the topic here is not about this though. I think it's about having a builtin primitive for concurrency and that's fully covered in #1868 .

Thus I think this topic can be closed as it got fully superseded by #1868 .

@gslicer

gslicer commented Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc.

As long as it's "unsafe" I'm clearly worrying :)

@nnsgmsone
Author

@gslicer I think it's easy to achieve the effect of an actor with routines and channels. For example, a library I wrote myself works this way.
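
To make the "actor from a routine plus a channel" idea concrete, here is a minimal sketch in Go (chosen because V's `go` routines and channels are modelled on Go's). It is an illustration only, not the library mentioned above, which is not shown in this thread; the names `message`, `startCounter` and `sumViaActor` are made up for the example. The actor's state lives in a single goroutine and is only reachable through its mailbox, so no locks or semaphores are needed.

```go
package main

import "fmt"

// message is a hypothetical request understood by the counter actor.
type message struct {
	add   int      // amount to add to the running total
	reply chan int // if non-nil, the actor sends the current total here
}

// startCounter launches the actor and returns its mailbox.
// The total is owned by exactly one goroutine, so no mutex is needed.
func startCounter() chan<- message {
	mailbox := make(chan message)
	go func() {
		total := 0
		for msg := range mailbox { // exits when the mailbox is closed
			total += msg.add
			if msg.reply != nil {
				msg.reply <- total
			}
		}
	}()
	return mailbox
}

// sumViaActor sends every value to a fresh actor, then asks it for the total.
func sumViaActor(values []int) int {
	mailbox := startCounter()
	defer close(mailbox)
	for _, v := range values {
		mailbox <- message{add: v}
	}
	reply := make(chan int)
	mailbox <- message{reply: reply}
	return <-reply
}

func main() {
	fmt.Println(sumViaActor([]int{1, 2, 3, 4})) // prints 10
}
```

Whether the routine behind the mailbox is a kernel thread or a user-level coroutine does not change this code, which is part of why the scheduling model can be discussed separately from the programming model.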

@medvednikov changed the title from "User level coroutine?" to "Coroutines" Nov 28, 2020
@crthpl
Member

crthpl commented Dec 8, 2020

Will there be a compiler flag to make go start a new thread like it does now?

@atomkirk
Contributor

When this is implemented in V, would it be possible to implement it as pre-emptive (like Erlang) instead of cooperative (like Go)?

@dumblob
Contributor

dumblob commented Dec 30, 2020

@atomkirk so far V has built-in "go routines" which are fully preemptive (and I think the consensus is, that it should stay so). This GitHub issue seems to be about a different thing - namely about standard library offering simple pure coroutines (which are by definition non-preemptive).

@atomkirk
Contributor

atomkirk commented Dec 30, 2020

@dumblob they can be. Erlang processes are user-level AND preemptive. They are very robust.

It's preemptive now because it uses kernel threads, which are bulky and expensive.

@dumblob
Contributor

dumblob commented Dec 30, 2020

Erlang processes are user-level AND preemptive.

That depends on how the Erlang VM is being executed. If it runs on bare hardware and no non-Erlang SW is being called, then you're right. In any other case Erlang processes are only partially preemptive (i.e. one Erlang process can starve indefinitely leading to stopping the whole Erlang VM). But I digress.

My point was different. The V community seems inclined to have built-in support (in the form of V's go routines) for fully preemptive execution while offering a non-preemptive alternative (referred to as coroutines) in the standard library (i.e. not built into the language).

@ntrel
Contributor

ntrel commented Jan 22, 2021

If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

This is not true since Go 1.14:
https://medium.com/a-journey-with-go/go-asynchronous-preemption-b5194227371c
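
A small Go sketch of what this means in practice (an illustration, not code from the linked article; the function name `mainRegainsCPU` is made up). It assumes Go 1.14 or later on a platform where asynchronous preemption is available. With an older runtime the same program would be expected to hang, because the busy loop contains no cooperative yield point.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// mainRegainsCPU returns true once the calling goroutine has managed to run
// again while another goroutine was busy-looping on the only scheduler slot.
func mainRegainsCPU() bool {
	prev := runtime.GOMAXPROCS(1) // a single P, so goroutines cannot run in parallel
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	go func() {
		// Tight loop with no function calls, channel operations or I/O,
		// i.e. nothing that would yield cooperatively.
		for atomic.LoadInt32(&stop) == 0 {
		}
		close(done)
	}()

	time.Sleep(20 * time.Millisecond) // the spinner takes over the CPU here
	atomic.StoreInt32(&stop, 1)       // reached only if the spinner was preempted
	<-done
	return true
}

func main() {
	fmt.Println("main ran again:", mainRegainsCPU())
}
```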

@maddanio
Contributor

as I just opened a related feature request, and to clarify: having co-routines be pre-emptive is kind of a contradiction in itself. co-routines are routines which can be exited and re-entered at any point, basically making it possible to have many of them "in-flight" without having to resort to threads. this becomes important when co-routines start depending on each other, which would otherwise lead to deadlock and/or very inefficient use of threads, and with large-scale networking, where wasting a thread for each connection has been known to be a very bad design choice at least since Apache was implemented :)

@maddanio
Contributor

so is there any plan to implement real coroutines, i.e. yieldable functions, that don't require threading? co-routines are there to enable concurrency, which is not the same as parallelism (threads), though the two interact very well
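
To illustrate concurrency without parallelism, here is a minimal sketch in Go of yieldable tasks driven by a round-robin loop on a single goroutine. It is an assumption-laden illustration, not a proposed V API: `task`, `counter`, `runAll` and `interleave` are hypothetical names, and each "yield" is modelled by returning from a step function whose state lives in a closure, which is roughly the kind of state machine that compiler-supported stackless coroutines are lowered to.

```go
package main

import "fmt"

// task is a hypothetical resumable unit of work: each call runs until the
// next yield point and reports whether the task has finished.
type task func() (done bool)

// counter returns a task that logs "name:i" once per resume, n times in total.
func counter(name string, n int, log *[]string) task {
	i := 0 // state that survives across yields
	return func() bool {
		i++
		*log = append(*log, fmt.Sprintf("%s:%d", name, i))
		return i >= n
	}
}

// runAll resumes the tasks round-robin on the calling goroutine until all of
// them have finished. No extra threads or goroutines are started.
func runAll(tasks []task) {
	for len(tasks) > 0 {
		next := tasks[:0] // in-place filter of the unfinished tasks
		for _, t := range tasks {
			if !t() {
				next = append(next, t)
			}
		}
		tasks = next
	}
}

// interleave runs two tasks concurrently (but not in parallel) and returns
// the order in which their steps executed.
func interleave() []string {
	var log []string
	runAll([]task{counter("a", 2, &log), counter("b", 3, &log)})
	return log
}

func main() {
	fmt.Println(interleave()) // prints [a:1 b:1 a:2 b:2 b:3]
}
```

The two tasks make progress in turns even though only one of them ever executes at a time. A task that never returns from its step would starve the others, which is the trade-off the preemption discussion in this thread is about.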

@dumblob
Contributor

dumblob commented Aug 25, 2021

The V community seems inclined to have built-in support (in the form of V's go routines) for fully preemptive execution while offering a non-preemptive alternative (referred to as coroutines) in the standard library (i.e. not built into the language).

so is there any plan to implement real coroutines, i.e. yieldable functions, that don't require threading?

I'd say the answer has two parts:

  1. yes, there are plans for real coroutines - but they'll probably not be very tightly connected to the language but rather a standard library construct, mainly to allow easy porting of existing libraries which for some weird reason depend on coroutine behavior and have some problems with full preemptiveness

  2. in the light of #1868 there is no actual need for true coroutines (except for point (1)) as they'll perform worse in some scenarios while not outperforming #1868 in any scenario I can imagine (and I don't buy any examples relying on deliberately constructed techniques making it perform a few percent better on carefully chosen platforms - sure, due to caching and false sharing etc. you can construct something 5%-10% slower, but that'll definitely be a "weird programming antipattern" and thus case (1))

@atomkirk
Contributor

Imagine I’m building a chat app. If I use erlang, I can get about 500k-1M connections per node. If I use a language with threads, I can handle far fewer per node. This means it costs more to scale it, and the pubsub system is far more stressed because it has to handle a lot more traffic between nodes to handle the same traffic as erlang.

Yes, I could write code to imitate coroutine performance using threads, but I could also write code to handle my own memory. That's not what V is about.

If we rely on a library, I can accidentally starve other coroutines, won't get the same performance as Go/Erlang, and will end up with the same confusing and disappointing concurrency story OCaml is in right now.

@dumblob
Contributor

dumblob commented Aug 25, 2021

@atomkirk please read #1868 properly. It says V shall use only that many threads as there are processing units (CPU cores, FPGAs, ...) and all go routines will be fully preemptively multiplexed among these very few threads. Moreover, the plan is to make the number of threads follow sleeping and waking up of processing units in runtime (to not force a sleeping notebook with 32 CPU cores running currently only one CPU core to save energy to context-switch between 32 threads of a V app). It's a similar design as Go lang uses under the hood, but IMHO strictly better (due to the guaranteed full preemptiveness & support for power saving).

So, please go ahead and read the thread (incl. all links and links in those links...). I hope all the concerns immediately become void by that.
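
As a rough picture of "only as many threads as there are processing units, with many routines multiplexed onto them", here is a minimal Go sketch. It is only an analogy and not the #1868 design: the workers here are goroutines rather than OS threads, each task runs to completion instead of being preempted, and the pool size is fixed rather than following cores going to sleep and waking up. The function name `sumSquares` is made up for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares spreads n small tasks over a fixed pool of workers, one per CPU
// reported by runtime.NumCPU, and returns the sum of their results.
func sumSquares(n int) int {
	workers := runtime.NumCPU()
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for x := range jobs { // each worker pulls the next pending task
				results <- x * x
			}
		}()
	}

	go func() {
		for i := 1; i <= n; i++ {
			jobs <- i
		}
		close(jobs)    // no more tasks
		wg.Wait()      // wait until every worker has drained its work
		close(results) // lets the collecting loop below terminate
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares(100)) // prints 338350
}
```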

@maddanio
Contributor

how would this preemption work? at which points will you yield a coroutine? usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result. that's also the most efficient way. also how will you do this in a library? to do real context switching you need assembly. or for fully integrated coroutines like you have in c++ now you need compiler support.
you could use the c coroutine support actually. afaik the c++ coroutine plumbing can also be used in c

@maddanio
Contributor

go lang most definitely will do cooperative multitasking by having coroutines yield on channel action and async io (usually just networking, unless they also implemented async file io). so if you say "like go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network io, mutexes, I think those are the important ones).

@maddanio
Contributor

the way to have efficient networking revolves around using epoll/select inside the coroutine scheduler and having it know which coroutine to wake up when a certain socket becomes readable, similarly with channels and mutexes
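
A minimal Go sketch of that bookkeeping, under loud assumptions: readiness is simulated by a plain function instead of a real epoll_wait/select call, "coroutines" are modelled as parked continuations, and all names (`scheduler`, `waitReadable`, `run`, `demo`) are hypothetical. It only shows the core idea that the scheduler maps each descriptor to the tasks waiting on it and resumes exactly those when the descriptor is reported ready.

```go
package main

import "fmt"

// scheduler is a hypothetical single-threaded scheduler that parks tasks
// until the descriptor they are waiting on becomes readable.
type scheduler struct {
	waiting map[int][]func() // fd -> continuations parked on that fd
}

func newScheduler() *scheduler {
	return &scheduler{waiting: map[int][]func(){}}
}

// waitReadable parks resume until fd is reported ready by the poller.
func (s *scheduler) waitReadable(fd int, resume func()) {
	s.waiting[fd] = append(s.waiting[fd], resume)
}

// run repeatedly asks poll for ready descriptors (a stand-in for
// epoll_wait/select) and resumes only the tasks parked on them.
func (s *scheduler) run(poll func() []int) {
	for len(s.waiting) > 0 {
		ready := poll()
		if len(ready) == 0 {
			return // simulated poller has no more events
		}
		for _, fd := range ready {
			parked := s.waiting[fd]
			delete(s.waiting, fd)
			for _, resume := range parked {
				resume()
			}
		}
	}
}

// demo parks two tasks and feeds the scheduler simulated readiness events.
func demo() []string {
	var log []string
	s := newScheduler()
	s.waitReadable(3, func() { log = append(log, "conn 3 read") })
	s.waitReadable(7, func() { log = append(log, "conn 7 read") })

	// Simulated events: fd 7 becomes readable first, then fd 3.
	events := [][]int{{7}, {3}}
	poll := func() []int {
		if len(events) == 0 {
			return nil
		}
		ready := events[0]
		events = events[1:]
		return ready
	}
	s.run(poll)
	return log
}

func main() {
	fmt.Println(demo()) // prints [conn 7 read conn 3 read]
}
```

Tasks are resumed in the order their descriptors become ready, not in the order they were parked, and no thread sits blocked per connection.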

@dumblob
Contributor

dumblob commented Aug 25, 2021

how would this preemption work? at which points will you yield a coroutine? usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result. that's also the most efficient way. also how will you do this in a library? to do real context switching you need assembly. or for fully integrated coroutines like you have in c++ now you need compiler support.
you could use the c coroutine support actually. afaik the c++ coroutine plumbing can also be used in c

Let me reiterate - please read the whole thread #1868 incl. all links recursively (depth 3 should be enough).

go lang most definitely will do cooperative multitasking by having coroutines yield on channel action and async io (usually just networking, unless they also implemented async file io). so if you say "like go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network io, mutexes, I think those are the important ones).

Partially yes - IMHO yielding under the hood will be done less aggressively than Go lang does, because there'll be the full preemptiveness, so presumably the inserted yields will be put only on critical places chosen based on true performance profiling analysis of representative apps (unlike in Go lang where they have no choice and have to put them really everywhere to make the language kind of work).

@maddanio
Contributor

what is "aggressive" about cooperative yielding? it's simply efficient because it will yield at exactly the points where the thread would end up sleeping anyway

@maddanio
Contributor

maddanio commented Aug 25, 2021

also isn't the fact that just about any other language (even c) now has cooperative coroutines an indication that they are a good idea?

@dumblob
Contributor

dumblob commented Aug 25, 2021

the way to have efficient networking revolves around using epoll/select inside the coroutine scheduler and having it know which coroutine to wake up when a certain socket becomes readable, similarly with channels and mutexes

Well, this supposes that programmers are dumb and will use 1 go routine per 1 request (be it a network request or any other sample from a high-rate stream). Which is one of the dumbest things one can do. Nobody from the Go or Erlang world does this because Go routines in Go (and Erlang processes as well) are still extremely expensive (you can have only a few million of them, which is by far not enough for a scalable app).

Therefore I'd say V shall actually not make the scheduler this smart. But let's see, maybe someone will provide some measurements and data and in V 1.1 (which is years ahead IMHO) there'll be such a smart scheduler. But definitely not now for V 1.0 because it's nonsense from my point of view.

what is "aggressive" about cooperative yielding? it's simply efficient because it will yield at exactly the points where the thread would end up sleeping anyway

At some point (if you have too many yields) it becomes less efficient than less frequent preempting (and it has also other downsides - it increases code size a bit, it disallows good CPU-bound performance optimization, etc.).

Please just finally find few hours to read the thread #1868 recursively (and maybe wait one more day to let the brain calmly absorb it all before proposing other concepts which actually are quite aging already). We'll be here, we won't run away 😉. This topic is not urgent and very old and many people smarter than me have put their thoughts in it - most of it documented in the #1868 thread and recursive links.

also isn't the fact that just about any other language (even c) now has cooperative coroutines an indication that they are a good idea?

Again - there is a plan for coroutines (as part of the standard library, maybe even with some intrinsics). But IMHO it's lower priority. Feel free to make a PR with a potential API (some people already worked on that, but I can't find the links now quickly - just search for them yourself and ask e.g. on Discord).

@maddanio
Contributor

maddanio commented Aug 26, 2021

At some point (if you have too many yields) it becomes less efficient than less frequent preempting (and it has also other downsides - it increases code size a bit, it disallows good CPU-bound performance optimization, etc.).

I am not convinced. How can it be more efficient to let a routine sleep (on a network select or waiting on a mutex or...) than to yield it? The thread will idle and eventually be swapped out by the OS kernel. Unless you are talking about spin locking. But even that you can model with proper co-routines. Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations to mask cache line latencies, so I believe co-routines, at least the new ones that have compiler support, have been driven "all the way down". Also bear in mind that with the new compiler support for co-routines in later C compilers, the compiler can even inline across and through yield points.

@dumblob
Contributor

dumblob commented Aug 26, 2021

I am not convinced. How can it be more efficient to let a routine sleep (on a network select or waiting on a mutex or...) than to yield it? The thread will idle and eventually be swapped out by the OS kernel. Unless you are talking about spin locking. But even that you can model with proper co-routines.

Please really devote several hours to reading the whole thread #1868 incl. links. You'll learn among other things about Weave which nicely shows what the performance differences are - but don't forget we're talking about maxing out performance of multiple processing units, not just a single core.

And btw. as I said, V will insert yields internally on important places (preferably according to measurements and not human guesses) like selects, mutexes, etc. But I suppose it'll be on a (much) lower number of places than the recent Go versions started to do (you can read about this also in one of the linked resources from the #1868 thread).

Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations to mask cache line latencies, so I believe co-routines, at least the new ones that have compiler support, have been driven "all the way down".

Thanks for the link. Gor explains a cool idea how to implement very lightweight coroutines which highly efficiently leverage modern CPU cache hierarchy. These nano-coroutines are unfortunately something V can't care about. Simply because they don't support scheduling across multiple processing units. In other words they do support only a single processing unit (i.e. only one thread) and based on Gor's explanation this can't be changed without losing some of their benefits.

I'd even guess that e.g. Weave (which offers tasks which is just a slightly different abstraction for the very same coroutine concept) is about as fast as Gor's nano coroutines even if run only on a single core (despite Weave being designed to max out performance of many processing units with different processing powers). Feel free to test it and post your results here to let us reproduce them on more machines.

Also bear in mind that with the new compiler support for co-routines in later C compilers, the compiler can even inline across and through yield points.

Inlining micro-optimization sounds like a patch to a wrong abstraction. But yes, thanks to this I'd guess it'll catch up with "normal function calls" when it comes to overhead on one core.

@maddanio
Contributor

I had a quick look at Weave, and I don't see how it applies to async operations like networking, where you have to keep on juggling tasks because most of them simply cannot make progress at any given time due to waiting for io. I think you are fundamentally mixing up concurrency with parallelism.

@vlang locked and limited conversation to collaborators Sep 22, 2021