Breaking Change Request: Enable isolate groups by-default - Will result in changes to Performance Characteristics #46754
To clarify, the new lightweight isolate implementation now supports both AOT and JIT, right?
Right.
The breaking change has been announced here. @mit-mit Could you help get any necessary approvals (since @franklinyow is out)?
Have there been any measurements of how much performance this might cost? I think, especially with the functions framework and Dart becoming more and more popular, it might actually become more important in the future (of course this is just my speculation).
We have done measurements on worst-case scenarios. For example, when 8 threads continuously build up long-lived data structures (meaning the generational hypothesis - which most VMs optimize for - does not hold) on a 10+ GB heap, this can lead to a 2x slowdown.
Firstly, it's unclear whether any of our existing users would run into pain points in practice (so far we are not aware of any). We do have ideas about how this could be further optimized and might invest in that if we believe it is worthwhile.
Right now the VM team is not optimizing for large server use cases (i.e. 10-100s of cores and large amounts of RAM), because that is not how our users use Dart atm. That being said, one can make the VM work well in this setting, e.g. by using many isolate groups (as mentioned in the mitigation section above).
To the best of my knowledge, cloud functions are often ephemerally executed, requests are independent of each other, and little global state is kept. Based on that, I wouldn't say this change would negatively impact such cases (it may even benefit them).
Thanks for your detailed response! :)
With spawning new isolates being at least 10x faster, we may see applications start using short-lived/just-in-time-spawned isolates significantly more, in addition to or instead of larger long-running isolates. In other words, use of isolates could become more function-oriented, similar to how compute in Flutter is built. sendAndExit(sendPort, message), to be released in the future, will speed up this flow even further.
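The just-in-time-spawned pattern described above can be sketched as follows. This is a minimal, hypothetical example; the send-and-exit flow mentioned here later shipped as `Isolate.exit`, so it assumes a Dart 2.15+ SDK (on earlier SDKs, `port.send` can be used instead):

```dart
import 'dart:isolate';

// A compute-style helper: spawn a short-lived isolate, run a function
// there, and return the result to the caller.
int fib(int n) => n < 2 ? n : fib(n - 1) + fib(n - 2);

void _worker(List<Object> args) {
  final port = args[0] as SendPort;
  final n = args[1] as int;
  // Isolate.exit sends the result and terminates the isolate in one step.
  Isolate.exit(port, fib(n));
}

Future<int> computeFib(int n) async {
  final port = ReceivePort();
  await Isolate.spawn(_worker, [port.sendPort, n]);
  return await port.first as int;
}

Future<void> main() async {
  print(await computeFib(30)); // 832040
}
```

With cheap isolate spawning, a helper like `computeFib` can be created per request rather than keeping a long-running worker isolate around.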
Here are numbers from benchmarks we added specifically to see the impact of this change. It was measured in standalone Dart VM (JIT and AOT) on an Intel CPU at commit 5c0466a:
We measured how long it takes for a mid-to-large application (dart2js in this case) to spawn a new isolate and what the additional base memory overhead of such an isolate is. (The helper isolate will run dart2js on a Dart file.) We can see that
Here we can see that the isolate receiving messages no longer pays an O(n) cost; it is instead a constant single-digit µs. Together, send-and-receive is around 8x faster.
We can observe that sending bytes between isolates became 1.5-2x faster. (Metric is runs/second; larger numbers are therefore better.)
We also measure JSON decoding on helper isolates (in x1 or x4 isolates). The isolates receive bytes, perform UTF-8 decoding followed by JSON decoding, and send the result back. We can observe
This shows us that if helper isolates allocate many objects that die young, the pause times in the main isolate are not affected much (1-2 ms). If the generational hypothesis doesn't hold (all objects survive - a hypothetical worst-case scenario), then the main isolate will have the same pause time as the helper isolate that triggers young collections (since the young generation is collected stop-the-world) - which is between 10-20 ms. For Flutter this would look different, since it uses idle time between frames to trigger GCs, thereby doing more young-space collections (before the space is full) and thus reducing pause times (for this hypothetical worst-case scenario).
This is a very important "upgrade" to the Dart VM's isolates. I want to highlight the context of the performance changes:
Some questions:
It will be interesting to
It's important to test if Also, when the provided See package |
That is indeed a very interesting question, one we also thought about. In fact, right now we have an internal boolean flag that can be used to make The reasons for that are manifold:
So in summary:
As mentioned in the mitigations, one can always use
@gmpassos It was a long explanation, but I hope it makes some sense?
Thanks for the good answer! So, how about 2 types of isolate modes, lightweight and fully isolated? Depending on the mode, the category of types shared between isolates would be broader or more restricted. If the types sent between isolates are controlled correctly, the issues go away (correct me if I'm wrong). I haven't looked at the new code, but Note that I vote for a future where the features in #46623 are implemented. Another question:
BTW, nice job! This is hard and very important work.
@mkustermann Is there anywhere we can read about how isolates in an isolate group are distributed across CPU cores?
That is precisely what happens already now:
Glad to receive the positive feedback 👍
It's not explicitly documented anywhere AFAIK. The Dart VM is quite flexible, so the answer depends. For example, in Flutter (an embedder of the Dart VM), the Flutter engine decides the specific OS thread on which the UI isolate runs and how it processes messages (helper isolates are run by the VM). When an isolate is idle and receives a message, the VM will run its message handler on a VM-internal thread pool. That means that, over the lifetime of an isolate, it may run on different OS threads. So far we have not had a need for thread-pinning support, but it may come up in the future. The operating system is responsible for mapping OS threads (e.g.
@mkustermann If the Dart VM already has 2 modes of communication, it would be good to be able to know the mode/capabilities of communication from the current isolate. With correct documentation and helpers to inspect the current status/mode, developers won't have surprises and will be able to craft what they need. I vote for:
About "thread-pinning": note that most OSes and devices try not to overheat a specific core, so they rotate tasks between cores to avoid that.
Improving performance seems great to me. |
lgtm |
Marking approved |
Thank you all for the great discussions. The breaking change has been approved and we'll be performing this change sometime in the next couple of weeks (after the current stable branch is cut), allowing a long baking time on the dev and beta branches. If there are any more feature requests, performance bugs, or anything else, please file a new GitHub issue at https://github.com/dart-lang/sdk/issues/new Last replies on this thread:
The documentation has recently been updated to remove some ambiguity. You can see the newest docs at SendPort.send. It lists what is always supported and what is supported if-and-only-if
Supporting The complexity and issues involved in supporting this make us believe it is not a good choice (also considering users can use
@gmpassos If you feel strongly about it, I would encourage you to file a new GitHub issue as a feature request; any discussion can continue there.
What I mean by thread pinning is that a given isolate is "pinned" to a specific OS thread (e.g. a pthread). There are use cases where this is needed (e.g. interacting with C code that uses thread-local storage). It doesn't mean that the OS thread is pinned to a specific CPU core.
@mkustermann thanks for the response. About the "spawn in a separate group": now I understand better the complexity of implementing it. I think that "spawnUri" and "Platform.script" About OS thread pinning, this can be very useful to avoid issues with "dart:ffi". It will also help integrations with C/C++ or existing compiled libraries that need it. Look at the issues that Python has with its use of the GIL (Global Interpreter Lock), which is mandatory, an approach that I think is totally wrong. Regards.
Can we create/post some sample code for this? Also cc @RedBrogdon for devrel. |
That'd be nice! But please note that this won't be in stable until 2.15.
The flag is now on by-default in all modes. Closing this issue.
We'll ensure there's good documentation by the time the stable is released. |
Issue #46754
Issue #36097
TEST=ci
Change-Id: Ic0b1ecf88790576ae1f31b6a003b2175b9af1c66
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/213343
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Kevin Moore <[email protected]>
Any chance there's a detailed doc on migration to
Generally speaking
Sorry about the crash. Would you mind opening up a new issue with hopefully some instructions on how to reproduce it?
@aam thanks for the clarifications! Incompatibility of spawnUri() with AOT effectively means you can't use it with Flutter... The above logs come from the Dart VM running unit tests, and the most probable cause is the kind of payload that was OK with the legacy P.S.: I'm opening a separate issue regarding performance troubles with this breaking change shortly.
To provide further clarification - conceptually when using
It would help if you could share command line of how the unit tests were launched and revision of dart sdk where you see this happening. It should not be happening. :-)
Okay, please cc me and @mkustermann on those. |
@mkustermann, you mention sharing String objects between isolates of the same group ("allow sharing of objects (program structure, JITed code, constants as well as any String objects - in the future possibly also user-defined data structures)").
When sending messages (e.g. via That means if you e.g. spawn an isolate, which loads data from the internet and decodes the bytes to a string, you can send that string to the UI isolate in O(1) time and it will be sent by-pointer. We can do that because |
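A sketch of the scenario described above, assuming a Dart 2.15+ SDK where strings cross the isolate-group boundary by pointer rather than by copy (the hypothetical bytes here stand in for data loaded from the network):

```dart
import 'dart:convert';
import 'dart:isolate';

// Helper isolate: decode bytes to a String off the main isolate, then
// send the String back. With isolate groups the String is passed by
// pointer, so the send is O(1) regardless of its length.
void _decoder(List<Object> args) {
  final port = args[0] as SendPort;
  final bytes = args[1] as List<int>;
  port.send(utf8.decode(bytes));
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(_decoder, [port.sendPort, utf8.encode('hello')]);
  print(await port.first); // hello
}
```

In a Flutter app, the same shape keeps expensive decoding work off the UI isolate while the final `send` back stays cheap.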
Ah ok, here I thought there was a way to access shared memory freely, I misunderstood. |
We do have some limited support using There have been some talks about allowing general shared (mutable) memory (as e.g. Java) - though we have no concrete plans atm to introduce that. |
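A minimal sketch of the dart:ffi approach mentioned above: the raw address of a C buffer is plain integer data, so it can be sent over a SendPort and both isolates can touch the same memory. This assumes `package:ffi` is available for `calloc`, and, as noted, no synchronization is provided:

```dart
import 'dart:ffi';
import 'dart:isolate';
import 'package:ffi/ffi.dart'; // assumption: package:ffi is a dependency

// The writer isolate reconstructs a Pointer from the raw address and
// writes into the shared C buffer. Access is unsynchronized - the
// ReceivePort message below is the only coordination.
void _writer(List<Object> args) {
  final port = args[0] as SendPort;
  final ptr = Pointer<Uint8>.fromAddress(args[1] as int);
  ptr.value = 42; // write the first byte of the shared buffer
  port.send(null); // signal "done"
}

Future<void> main() async {
  final buffer = calloc<Uint8>(16); // zero-initialized C memory
  final port = ReceivePort();
  await Isolate.spawn(_writer, [port.sendPort, buffer.address]);
  await port.first; // wait for the writer before reading
  print(buffer.value); // 42
  calloc.free(buffer);
}
```

Because the buffer lives outside the Dart heap, it must be freed manually; the GC will not reclaim it.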
@mtc-jed I'd love if you could try the |
Not an I needed to implement something similar and one way was through Take a look at:
Note that the shared pointer (and related memory segment) won't have any concurrency control (no mutex). So it only works for some scenarios, where one side only writes, the other side only reads, and the reader is capable of knowing whether the read operation was successful. If Another important thing about
Note that in some applications where we use
@gmpassos Very cool; however, this only works with a list of bytes. I am not aware of any way in Dart to cast an arbitrary user-defined object to its bytes.
@mit-mit Running tests in 2.16.1, I get a linearly growing delay between the call to That being said, pass-by-pointer would very much be appreciated, not only by myself but also by a lot of other people (the benefits of this for a local database could be tremendous, I believe). Here's how data downloading would look with pointer passing: Without pointer passing, we can only:
I do not see, AFAIK, a reason to prevent developers from implementing this use case. Of course, I have only barebones knowledge of the Dart VM and of the general philosophy applied when developing the language; feel free to enlighten me. To note: isolate spawning is apparently far less costly with 2.15's isolate groups. So Isolate.exit() seems like a sustainable solution, but there's no denying it could be much better with pointer passing.
You can use: https://pub.dev/packages/data_serializer And make your object implement: https://pub.dev/documentation/data_serializer/latest/data_serializer/Writable-class.html Example in the tests: https://github.com/gmpassos/data_serializer/blob/master/test/data_serializer_writable_test.dart |
@gmpassos This would mean I still have to deserialize my data in the main Isolate, which is exactly the cost I'm trying to avoid. |
Yes, I agree. To have the originally allocated memory delegated from one isolate to another, the only way right now is Isolate.exit.
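A sketch of that pattern: the worker builds the large object graph, and Isolate.exit transfers it to the receiver without copying, since the worker terminates and can no longer touch the memory (assumes Dart 2.15+; the JSON payload is a placeholder for real data):

```dart
import 'dart:convert';
import 'dart:isolate';

// Worker: decode JSON off the main isolate, then hand the resulting
// object graph over with Isolate.exit - the memory is transferred,
// not copied, because the sending isolate exits in the same step.
void _parseJson(List<Object> args) {
  final port = args[0] as SendPort;
  final source = args[1] as String;
  Isolate.exit(port, jsonDecode(source));
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(_parseJson, [port.sendPort, '{"answer": 42}']);
  final result = await port.first as Map<String, dynamic>;
  print(result['answer']); // 42
}
```

The limitation discussed above still holds: the transfer happens once, at exit; there is no way for a long-running worker to repeatedly hand objects over without copying.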
It's a language-level design decision - allowing arbitrary objects to be passed around creates shared memory, with all the associated issues and pitfalls. So a simpler programming model is gained by outlawing shared memory. That being said, I suggest moving discussions about shared memory to other channels - it is off-topic here. If you want to outline specific issues you are facing, there is dart-lang/language#333
Maybe this issue isn't the right place for this discussion. Could I ask you to open new issues for feature requests / bug reports? Regarding Yes, it's the escape from the sound and safe world. It allows unsafe access to shared writable C memory from multiple isolates. Using Especially regarding blocking synchronization mechanisms like locks, I'd like to mention that these must be used with care: Dart's event-loop-based programming model relies on Dart code not synchronously blocking for long periods of time. This is especially important for Flutter apps, where the UI isolate needs to be able to render animations at 60+ fps.
This may be a little off-topic, but we are actually talking about shared memory as an alternative because of the difficulty of sending large amounts of data or complex objects between isolates. We don't actually want to use shared memory; we'd prefer an elegant and native approach. The idea here is just to show how we are bypassing the performance issue, so that the awesome job done with isolate groups can achieve its full potential with some future improvements. Best regards.
Intended change
We intend to enable isolate group (#36097) support in the VM by default.
This will make isolates spawned via Isolate.spawn run inside the same isolate group and therefore operate on the same heap, allowing sharing of various kinds of objects and allowing better communication.
Intended change in behavior:
The intention is to allow sharing of objects (program structure, JITed code, constants as well as any String objects - in the future possibly also user-defined data structures).
The justification/rationale for making the change
Get the improvements mentioned above to users, and enable more sharing of objects across isolates in the future.
The expected impact of this change
There are no functional differences. There will be changes in performance characteristics. Those changes will almost exclusively be positive.
The thread pool onto which all lightweight isolates are multiplexed onto is limited (around 10 atm), in order to ensure all threads executing on different cores have a big enough TLAB (thread local allocation buffer - a free chunk of memory from new space) to ensure fast bump allocation.
This means isolates will collaborate on garbage collections (and some other events like lazy JIT compilations, ...). As a consequence blocking GC operations (such as new space collections) will affect all isolates. The worst case pause time due to new space collections is unchanged, however heavily allocating isolates can impact other isolates.
In the common case where the generational hypothesis holds (most objects die young), those collections continue to be fast. Furthermore, for Flutter specifically, any idle time on the UI thread is used to perform GCs, thereby avoiding overly long pauses due to new-space GCs. (The old space is mainly collected via concurrent marking and sweeping, thereby not stopping mutators.)
The only existing use case that could be negatively impacted is existing apps for which many isolates are executing in parallel on many cores (e.g. big server applications).
3 important Dart customers (including Flutter) have already been opting into this for a longer period of time in AOT mode. So far we have only heard positive feedback from them (especially memory footprint reductions).
We expect the only use case that might be actually affected by this change is e.g. server customers that use isolates on many threads at the same time.
Clear steps for mitigating the change
For customers that use isolates on many threads at the same time (such as those running on large servers), the possible workaround is to use Isolate.spawnUri() on the same application - this will cause the VM to use an independent isolate group, which gives the old behavior. Communication is then restricted to json-like types, though.
(see also go/ig-by-default)
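A minimal sketch of this mitigation, assuming the script can launch itself as a worker via its arguments (note that, as discussed in this thread, spawnUri does not work under AOT, so this applies to the standalone JIT VM):

```dart
import 'dart:io';
import 'dart:isolate';

// Spawning the same script via spawnUri puts the child in its own
// isolate group with an independent heap, restoring the old behavior.
// A '--worker' flag (an assumption of this sketch) distinguishes the
// worker entry from the normal one.
Future<void> main(List<String> args, [Object? message]) async {
  if (args.contains('--worker')) {
    // Worker mode: message is the SendPort passed by the parent.
    (message as SendPort).send('ready');
    return;
  }
  final port = ReceivePort();
  await Isolate.spawnUri(Platform.script, ['--worker'], port.sendPort);
  print(await port.first); // ready
}
```

Since the groups have separate heaps, messages between them are copied and restricted to json-like types, as stated above.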