This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Coroutines #561
Comments
Implementing coroutines (similar to Go's approach) has always been the plan and is on the roadmap. See https://vlang.io/docs#concurrency
ok
@nnsgmsone what does 'more energy can be injected' mean in this context?
@spytheman it would make the routines more useful
As far as I know this was the intention all along; they were implemented that way to begin with just to have something working.
I think the use of coroutines is very limited in contrast to what threads can offer (as long as they are implemented e.g. with the "actor" paradigm, without any semaphores/locks)... so threads should still be supported. See this statement:
@gslicer don't worry, there will always be the

Thus I think this topic can be closed, as it got fully superseded by #1868.
As long as it's "unsafe" I'm clearly worrying :)
@gslicer I think it's easy to achieve the effect of an actor with a routine and a channel. For example, a library I wrote myself works this way.
Will there be a compiler flag to make
When this is implemented in V, would it be possible to implement it as pre-emptive (like Erlang) instead of cooperative (like Go)?
@atomkirk so far V has built-in "go routines" which are fully preemptive (and I think the consensus is that it should stay so). This GitHub issue seems to be about a different thing - namely about the standard library offering simple pure coroutines (which are by definition non-preemptive).
@dumblob they can be. Erlang processes are user-level AND preemptive; they are very robust. V's are preemptive now only because they use kernel threads, which are bulky and expensive.
That depends on how the Erlang VM is being executed. If it runs on bare hardware and no non-Erlang SW is being called, then you're right. In any other case Erlang processes are only partially preemptive (i.e. one Erlang process can starve indefinitely, leading to stopping the whole Erlang VM). But I digress. My point was different: the V community seems to incline to have built-in support (in the form of V's go routines) for fully preemptive execution, while offering a non-preemptive alternative (referred to as coroutines) in the standard library (i.e. not built into the language).
This has not been true since Go 1.14, which introduced asynchronous preemption of goroutines.
As I just opened a related feature request, and to clarify: having co-routines be pre-emptive is kind of a contradiction in itself. Co-routines are routines which can enter and exit at any point, basically making it possible to have many of them "in-flight" without having to resort to threads. This becomes important when co-routines start inter-depending, which would otherwise lead to deadlock and/or very inefficient use of threads, and with large-scale networking, where wasting a thread for each connection has been known to be a very bad design choice at least since Apache was implemented :)
So is there any plan to implement real coroutines, i.e. yieldable functions, that don't require threading? Co-routines are there to enable concurrency, which is not the same as parallelism (threads), though the two interact very well.
I'd say the answer has two parts:
Imagine I’m building a chat app. If I use Erlang, I can get about 500k-1M connections per node. If I use a language with threads, I can handle far fewer per node. This means it costs more to scale, and the pubsub system is far more stressed because it has to handle a lot more traffic between nodes to carry the same load as Erlang. Yes, I could write code to imitate coroutine performance using threads, but I could also write code to manage my own memory. That's not what V is about. If we rely on a library, I can accidentally starve other coroutines, won't get the same performance as Go/Erlang, and will end up with the same confusing and disappointing concurrency story OCaml is in right now.
@atomkirk please read #1868 properly. It says V shall use only as many threads as there are processing units (CPU cores, FPGAs, ...) and all go routines will be fully preemptively multiplexed among these very few threads. Moreover, the plan is to make the number of threads follow the sleeping and waking of processing units at runtime (so as not to force a sleeping notebook with 32 CPU cores, currently running only one core to save energy, to context-switch between 32 threads of a V app). It's a similar design to the one Go uses under the hood, but IMHO strictly better (due to the guaranteed full preemptiveness & support for power saving). So, please go ahead and read the thread (incl. all links and links in those links...). I hope all the concerns immediately become void by that.
How would this preemption work? At which points will you yield a coroutine? Usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result; that's also the most efficient way. Also, how will you do this in a library? To do real context switching you need assembly, or, for fully integrated coroutines like you have in C++ now, you need compiler support.
Go most definitely does cooperative multitasking by having coroutines yield on channel actions and async I/O (usually just networking, unless they also implemented async file I/O). So if you say "like Go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network I/O, mutexes - I think those are the important ones).
The way to get efficient networking revolves around using epoll/select inside the coroutine scheduler and having it know which coroutine to wake up when a certain socket becomes readable; similarly with channels and mutexes.
Let me reiterate - please read the whole thread #1868 incl. all links recursively (depth 3 should be enough).
Partially yes - IMHO yielding under the hood will be done less aggressively than Go does it, because there'll be full preemptiveness, so presumably the inserted yields will be put only in critical places chosen based on true performance profiling analysis of representative apps (unlike in Go, where they have no choice and have to put them practically everywhere to make the language work at all).
What is "aggressive" about cooperative yielding? It's simply efficient, because it will yield at exactly the points the thread would end up sleeping anyway.
Also, isn't the fact that just about any other language (even C) now has cooperative coroutines an indication that they are a good idea?
Well, this supposes that programmers are dumb and will use 1 go routine per request (be it a network request or any other sample from a high-rate stream), which is one of the dumbest things one can do. Nobody from the Go world nor the Erlang world does this, because go routines in Go (and Erlang processes as well) are still extremely expensive (you can have only a few million of them, which is by far not enough for a scalable app). Therefore I'd say V shall actually not make the scheduler this smart. But let's see - maybe someone will provide some measurements and data, and in V 1.1 (which is years ahead IMHO) there'll be such a smart scheduler. But definitely not now for V 1.0, because it's nonsense from my point of view.
At some point (if you have too many yields) it becomes less efficient than less frequent preempting (and it has other downsides too - it increases code size a bit, it disallows good CPU-bound performance optimization, etc.). Please just finally find a few hours to read the thread #1868 recursively (and maybe wait one more day to let the brain calmly absorb it all before proposing other concepts which are actually quite aged already). We'll be here, we won't run away 😉. This topic is not urgent and very old, and many people smarter than me have put their thoughts into it - most of it documented in the #1868 thread and recursive links.
Again - there is a plan for coroutines (as part of the standard library, maybe even with some intrinsics). But IMHO it's lower priority. Feel free to make a PR with a potential API (some people already worked on that, but I can't find the links now quickly - just search for them yourself and ask e.g. on Discord).
I am not convinced. How can it be more efficient to let a routine sleep (on a network select, or waiting on a mutex, or...) than to yield it? The thread will idle and eventually be swapped out by the OS kernel. Unless you are talking about spin locking - but even that you can model with proper co-routines. Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations that mask cache line latencies, so I believe co-routines, at least the new ones that have compiler support, have been driven "all the way down". Also bear in mind that with the new compiler support for co-routines in recent C++ compilers, the compiler can even inline across and through yield points.
Please really devote several hours to reading the whole thread #1868 incl. links. You'll learn among other things about Weave which nicely shows what the performance differences are - but don't forget we're talking about maxing out performance of multiple processing units, not just a single core. And btw. as I said, V will insert yields internally on important places (preferably according to measurements and not human guesses) like selects, mutexes, etc. But I suppose it'll be on a (much) lower number of places than the recent Go versions started to do (you can read about this also in one of the linked resources from the #1868 thread).
Thanks for the link. Gor explains a cool idea for implementing very lightweight coroutines which highly efficiently leverage the modern CPU cache hierarchy. These nano-coroutines are unfortunately something V can't care about, simply because they don't support scheduling across multiple processing units. In other words, they support only a single processing unit (i.e. only one thread), and based on Gor's explanation this can't be changed without losing some of their benefits. I'd even guess that e.g. Weave (which offers tasks, a slightly different abstraction for the very same coroutine concept) is about as fast as Gor's nano-coroutines even when run only on a single core (despite Weave being designed to max out the performance of many processing units with different processing powers). Feel free to test it and post your results here to let us reproduce them on more machines.
Inlining micro-optimizations sounds like a patch for a wrong abstraction. But yes, thanks to this I'd guess it'll catch up with "normal function calls" when it comes to overhead on one core.
I had a quick look at Weave; I don't see how it applies to async operations like networking, where you have to keep juggling tasks because most of them simply cannot make progress at any given time due to waiting for I/O. I think you are fundamentally mixing up concurrency with parallelism.
I found that V's coroutines directly call pthread_create. I think it is possible to add user-level coroutines, so that "more energy can be injected". Perhaps V can learn from golang's approach and add a runtime...