
Breaking Change Request: Enable isolate groups by-default - Will result in changes to Performance Characteristics #46754

Closed
mkustermann opened this issue Jul 29, 2021 · 45 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. breaking-change-request This tracks requests for feedback on breaking changes enhancement-breaking-change An enhancement which is breaking.

Comments

@mkustermann
Member

mkustermann commented Jul 29, 2021

Intended change
We intend to enable isolate group (sdk/issues/36097) support in the VM by default.

This will make isolates spawned via Isolate.spawn run inside the same isolate group and therefore operate on the same heap, which allows sharing various kinds of objects and enables richer communication.

Intended change in behavior:
The intention is to

  • make the per-isolate base memory overhead smaller (10x less RAM)
  • make isolates faster to spawn (10x faster spawn latency)
  • make isolates communicate faster (8x faster round-trip communication)
  • make receiver isolate of messages mostly non-blocking (removes O(n) receiver cost)
  • allow richer communication between isolates (see sdk/issues/46623)
  • allow sharing of objects (program structure, JITed code, constants, and any String objects - in the future possibly also user-defined data structures)
  • fix long-standing bugs that happen if isolates are used with the (currently non-atomic) hot-reload (e.g. flutter/issues/72195)
  • allow Flutter to smoothly use multiple engines (see flutter.dev/docs/development/add-to-app/multiple-flutters)
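As a concrete illustration of what stays the same API-wise: isolates are still created via Isolate.spawn, and only their runtime characteristics change. A minimal sketch (the worker function and message here are illustrative):

```dart
import 'dart:isolate';

// After this change, the spawned isolate runs in the same isolate
// group as main() and shares program structure, JITed code and
// constants with it - the Isolate.spawn API itself is unchanged.
void _worker(SendPort sendPort) {
  sendPort.send('hello from a lightweight isolate');
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(_worker, port.sendPort);
  print(await port.first);
}
```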

The justification/rationale for making the change
Deliver the improvements mentioned above to users, and enable more data sharing of objects across isolates in the future.

The expected impact of this change
There are no functional differences. There will be changes in performance characteristics. Those changes will almost exclusively be positive.

The thread pool onto which all lightweight isolates are multiplexed is limited in size (around 10 threads at the moment), in order to ensure that all threads executing on different cores have a big enough TLAB (thread-local allocation buffer - a free chunk of memory from new space) for fast bump allocation.

This means isolates will collaborate on garbage collections (and some other events, like lazy JIT compilations). As a consequence, blocking GC operations (such as new-space collections) will affect all isolates. The worst-case pause time due to new-space collections is unchanged; however, heavily allocating isolates can impact other isolates.

In the common case where the generational hypothesis holds (most objects die young), those collections remain fast. Furthermore, for Flutter specifically, any idle time on the UI thread is used to perform GCs, thereby also avoiding overly long pauses due to new-space GCs. (The old space is mainly collected via concurrent marking & sweeping, and therefore does not stop mutators.)

The only existing use case that could be negatively impacted is apps in which many isolates execute in parallel on many cores (e.g. big server applications).

Three important Dart customers (including Flutter) have already been opting into this for a longer period of time in AOT mode. So far we have only heard positive feedback from them (especially about memory footprint reductions).

We expect the only use case that might actually be affected by this change is server customers that use isolates on many threads at the same time.

Clear steps for mitigating the change

For customers that use isolates on many threads at the same time (such as when running on large servers), the possible workaround is to use Isolate.spawnUri() on the same application - this will cause the VM to use an independent isolate group, which gives the old behavior. Communication is then, however, restricted to json-like types.
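A minimal sketch of this mitigation, assuming a script that spawns itself via Isolate.spawnUri (the use of Platform.script and the message shape are illustrative):

```dart
import 'dart:io';
import 'dart:isolate';

// Spawning the *same* script via spawnUri creates an independent
// isolate group (own heap, own GC - the old behavior). Only
// json-like data may cross the group boundary.
Future<void> main(List<String> args, [Object? message]) async {
  if (message is SendPort) {
    // We are the spawnUri'ed child group: do heavy work here on an
    // independent heap, then reply with json-like data.
    message.send({'status': 'done'});
    return;
  }
  final port = ReceivePort();
  await Isolate.spawnUri(Platform.script, [], port.sendPort);
  print(await port.first);
}
```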

(see also go/ig-by-default)

@mkustermann mkustermann added the breaking-change-request This tracks requests for feedback on breaking changes label Jul 29, 2021
@mkustermann
Member Author

mkustermann commented Jul 29, 2021

/cc @a-siva @mraleph @aam (VM)
/cc @Hixie @xster @gaaclarke (Flutter)
/cc @vsmenon @mit-mit

Feel free to CC anyone else.

@xster
Contributor

xster commented Jul 29, 2021

To clarify, the new lightweight isolate implementation now supports both AOT and JIT right?

@aam
Contributor

aam commented Jul 29, 2021

To clarify, the new lightweight isolate implementation now supports both AOT and JIT right?

Right.

@a-siva a-siva added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Jul 29, 2021
@mkustermann
Member Author

The breaking change has been announced here.

@mit-mit Could you help get any necessary approvals (since @franklinyow is out)?

@Jonas-Sander

We expect the only use case that might actually be affected by this change is server customers that use isolates on many threads at the same time.

Have there been any measurements done on how much performance this might cost?
Additionally, is this just a temporary pain point which will be made more performant in the future? Or is this something where the Dart team doesn't see it as an important use case for Dart and thus won't optimize for it?

I think especially with the functions framework and Dart becoming more and more popular, it might actually become more important in the future (of course this is just my speculation).

@mkustermann
Member Author

Have there been any measurements done on how much performance this might cost?

We have done measurements on worst-case scenarios. For example, when 8 threads continuously build up long-lived data structures (meaning the generational hypothesis - which most VMs optimize for - does not hold) on a 10+ GB heap, this can lead to a 2x slowdown.

Additionally is this just a temporary pain point which will be made more performant in the future?

Firstly, it's unclear whether any of our existing users would run into any pain points in practice (so far we are not aware of any).

We do have ideas about how this could be further optimized and might invest in that if we believe it is worthwhile.

Or is this something where the Dart team doesn't see it as an important use case for Dart and thus won't optimize for it?

Right now the VM team is not optimizing for large server use cases (i.e. 10s-100s of cores and large amounts of RAM) - because that is not how our users use Dart at the moment.

That being said, one can make the VM work well in this setting - e.g. by using many isolate groups (as mentioned in the mitigation section above).

I think especially with the functions framework and Dart becoming more and more popular, it might actually become more important in the future (of course this is just my speculation).

To the best of my knowledge, cloud functions are often executed ephemerally, requests are independent of each other, and little global state is kept. Based on that, I wouldn't say this change would negatively impact such cases (it may even benefit them).

@Jonas-Sander

Thanks for your detailed response! :)

@aam
Contributor

aam commented Jul 30, 2021

Additionally, is this just a temporary pain point which will be made more performant in the future? Or is this something where the Dart team doesn't see it as an important use case for Dart and thus won't optimize for it?

With spawning new isolates being at least 10x faster, what we will potentially see is applications starting to use short-lived / just-in-time-spawned isolates significantly more, in addition to or instead of larger long-running isolates. In other words, use of isolates could become more functions-oriented, similar to how compute in Flutter is built. sendAndExit(sendPort, message), to be released in the future, will speed up this flow even further.
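The functions-oriented pattern described above could look roughly like this (a sketch of the idea, not the Flutter compute implementation; the entry point and task here are illustrative):

```dart
import 'dart:isolate';

int fib(int n) => n < 2 ? n : fib(n - 1) + fib(n - 2);

// Entry point of a short-lived, just-in-time-spawned isolate:
// compute one result, send it back, and exit.
void _entry(List<Object> args) {
  final sendPort = args[0] as SendPort;
  final n = args[1] as int;
  sendPort.send(fib(n));
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(_entry, [port.sendPort, 30]);
  print('fib(30) = ${await port.first}');
}
```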

@mkustermann
Member Author

mkustermann commented Jul 30, 2021

Here are numbers from benchmarks we added specifically to see the impact of this change. They were measured in the standalone Dart VM (JIT and AOT) on an Intel CPU at commit 5c0466a:

| Benchmark | JIT Before | JIT After | JIT Change | AOT Before | AOT After | AOT Change |
| --- | --- | --- | --- | --- | --- | --- |
| IsolateSpawn.Dart2JSToFinishRunningRMS | 3137316.00 | 780292.63 | -75.13 % | 394717.94 | 396074.63 | 0.3437 % |
| IsolateSpawn.Dart2JSToStartRunningRMS | 89621.59 | 574.28 | -99.36 % | 17494.70 | 259.79 | -98.52 % |
| IsolateSpawnMemory.Dart2JSDeltaPeakProcessRss | 972947456.00 | 782774272.00 | -19.55 % | 381440000.00 | 258723840.00 | -32.17 % |
| IsolateSpawnMemory.Dart2JSDeltaRssOnStart | 27043156.00 | 3941717.00 | -85.42 % | 18012842.00 | 1769472.00 | -90.18 % |
| IsolateSpawnMemory.Dart2JSDeltaRssOnEnd | 77070336.00 | 551594.00 | -99.28 % | 108882600.00 | 67689128.00 | -37.83 % |

We measured how long it takes a mid-to-large application (dart2js in this case) to spawn a new isolate, and what the additional base memory overhead of such an isolate is. (The helper isolate will run dart2js on a Dart file.)

We can see that

  • Spawning latency went from being O(n) in application size to being effectively constant (JIT: 574 us, AOT: 260 us).
  • The spawned isolate can re-use JITed code and therefore runs much faster right from the start (4x faster).
  • The peak memory consumption is reduced in JIT as well as AOT.
  • The additional memory per isolate is reduced by around 10x.
| Benchmark | JIT Before | JIT After | JIT Change | AOT Before | AOT After | AOT Change |
| --- | --- | --- | --- | --- | --- | --- |
| SendPort.Receive.BinaryTree.2 | 5.2478 | 1.0162 | -80.64 % | 4.0722 | 1.2520 | -69.25 % |
| SendPort.Receive.BinaryTree.4 | 12.357 | 1.0609 | -91.41 % | 6.3499 | 1.2568 | -80.21 % |
| SendPort.Receive.BinaryTree.6 | 39.917 | 1.1565 | -97.10 % | 14.764 | 1.3500 | -90.86 % |
| SendPort.Receive.BinaryTree.8 | 150.46 | 1.2812 | -99.15 % | 47.475 | 1.5075 | -96.82 % |
| SendPort.Receive.BinaryTree.10 | 628.53 | 1.9064 | -99.70 % | 194.44 | 2.1566 | -98.89 % |
| SendPort.Receive.BinaryTree.12 | 2548.28 | 4.2936 | -99.83 % | 794.96 | 4.3565 | -99.45 % |
| SendPort.Receive.BinaryTree.14 | 10246.44 | 4.6576 | -99.95 % | 3256.48 | 4.6533 | -99.86 % |
| SendPort.Receive.Json.400B | 5.9036 | 1.1105 | -81.19 % | 6.3360 | 1.3150 | -79.25 % |
| SendPort.Receive.Json.5KB | 52.174 | 1.2674 | -97.57 % | 53.186 | 1.5660 | -97.06 % |
| SendPort.Receive.Json.50KB | 503.79 | 2.1735 | -99.57 % | 476.19 | 2.4452 | -99.49 % |
| SendPort.Receive.Json.500KB | 5194.21 | 5.1109 | -99.90 % | 4971.65 | 5.1561 | -99.90 % |
| SendPort.Receive.Json.5MB | 5579.90 | 19.9008 | -99.98 % | 3991.09 | 19.9308 | -99.98 % |
| SendPort.Receive.Nop | 0.84245 | 0.83353 | -1.059 % | 0.94660 | 0.95422 | 0.8056 % |
| SendPort.Send.BinaryTree.2 | 2.6803 | 0.94366 | -64.79 % | 2.7459 | 0.99268 | -63.85 % |
| SendPort.Send.BinaryTree.4 | 7.5588 | 2.1675 | -71.33 % | 8.2269 | 2.6841 | -67.37 % |
| SendPort.Send.BinaryTree.6 | 28.467 | 7.6131 | -73.26 % | 31.281 | 10.248 | -67.24 % |
| SendPort.Send.BinaryTree.8 | 105.96 | 30.193 | -71.51 % | 117.82 | 30.515 | -74.10 % |
| SendPort.Send.BinaryTree.10 | 419.48 | 110.05 | -73.77 % | 404.91 | 112.05 | -72.33 % |
| SendPort.Send.BinaryTree.12 | 1738.89 | 544.90 | -68.66 % | 1739.35 | 523.80 | -69.89 % |
| SendPort.Send.BinaryTree.14 | 7359.78 | 2336.84 | -68.25 % | 7435.21 | 2295.89 | -69.12 % |
| SendPort.Send.Json.400B | 5.3041 | 1.3875 | -73.84 % | 5.3811 | 1.2730 | -76.34 % |
| SendPort.Send.Json.5KB | 87.984 | 18.209 | -79.30 % | 83.766 | 18.214 | -78.26 % |
| SendPort.Send.Json.50KB | 837.65 | 177.17 | -78.85 % | 799.39 | 160.88 | -79.88 % |
| SendPort.Send.Json.500KB | 9134.69 | 2186.36 | -76.07 % | 8697.33 | 1972.85 | -77.32 % |
| SendPort.Send.Json.5MB | 113778.09 | 50449.93 | -55.66 % | 110797.18 | 46839.12 | -57.73 % |
| SendPort.Send.Nop | 0.30535 | 0.30177 | -1.171 % | 0.28601 | 0.30424 | 6.376 % |

Here we can see that an isolate receiving messages no longer pays an O(n) cost; the cost is rather a constant, single-digit number of microseconds.
We can also see that sending json has become significantly faster, around 4x.

Together, the send-and-receive round trip is around 8x faster.

| Benchmark | JIT Before | JIT After | JIT Change | AOT Before | AOT After | AOT Change |
| --- | --- | --- | --- | --- | --- | --- |
| Isolate.SendReceiveBytes100KB | 12903.66 | 25176.57 | 95.11 % | 12683.20 | 25339.64 | 99.79 % |
| Isolate.SendReceiveBytes100MB | 5.9831 | 7.3162 | 22.28 % | 7.3251 | 11.073 | 51.17 % |
| Isolate.SendReceiveBytes10KB | 36008.31 | 62362.08 | 73.19 % | 35959.83 | 59863.60 | 66.47 % |
| Isolate.SendReceiveBytes10MB | 59.690 | 55.972 | -6.229 % | 75.595 | 110.89 | 46.68 % |
| Isolate.SendReceiveBytes1KB | 48756.69 | 80447.05 | 65.00 % | 50621.82 | 78842.88 | 55.75 % |
| Isolate.SendReceiveBytes1MB | 212.57 | 477.19 | 124.5 % | 398.70 | 745.68 | 87.03 % |

We can observe that sending bytes between isolates became between 1.5-2x faster.

(Metric is runs/second, larger numbers are therefore better)

| Benchmark | JIT Before | JIT After | JIT Change | AOT Before | AOT After | AOT Change |
| --- | --- | --- | --- | --- | --- | --- |
| IsolateJson.Decode100KBx1 | 1.7628 | 41.694 | 2265 % | 19.562 | 38.126 | 94.90 % |
| IsolateJson.Decode100KBx4 | 1.2302 | 35.571 | 2791 % | 11.319 | 34.181 | 202.0 % |
| IsolateJson.Decode1MBx1 | 1.0308 | 4.0661 | 294.5 % | 2.3259 | 3.4675 | 49.08 % |
| IsolateJson.Decode1MBx4 | 0.67100 | 3.3272 | 395.9 % | 1.3408 | 2.5559 | 90.64 % |
| IsolateJson.Decode250KBx1 | 1.4840 | 16.600 | 1019 % | 8.3048 | 14.133 | 70.18 % |
| IsolateJson.Decode250KBx4 | 1.0702 | 11.765 | 999.4 % | 5.1592 | 10.635 | 106.1 % |
| IsolateJson.Decode50KBx1 | 1.8633 | 23.011 | 1135 % | 28.397 | 70.602 | 148.6 % |
| IsolateJson.Decode50KBx4 | 1.4497 | 61.380 | 4134 % | 17.875 | 61.395 | 243.5 % |
| IsolateJson.SendAndExit_Decode100KBx1 | 1.7593 | 42.210 | 2299 % | 19.747 | 38.317 | 94.04 % |
| IsolateJson.SendAndExit_Decode100KBx4 | 1.2352 | 36.599 | 2863 % | 11.922 | 34.356 | 188.2 % |
| IsolateJson.SendAndExit_Decode1MBx1 | 1.0358 | 4.1429 | 300.0 % | 2.3442 | 3.4788 | 48.40 % |
| IsolateJson.SendAndExit_Decode1MBx4 | 0.66169 | 2.3940 | 261.8 % | 1.3342 | 2.4545 | 83.97 % |
| IsolateJson.SendAndExit_Decode250KBx1 | 1.4895 | 16.591 | 1014 % | 8.2664 | 14.057 | 70.05 % |
| IsolateJson.SendAndExit_Decode250KBx4 | 1.0751 | 12.811 | 1092 % | 5.2002 | 11.584 | 122.8 % |
| IsolateJson.SendAndExit_Decode50KBx1 | 1.9263 | 68.975 | 3481 % | 29.038 | 68.695 | 136.6 % |
| IsolateJson.SendAndExit_Decode50KBx4 | 1.4529 | 61.024 | 4100 % | 18.488 | 62.664 | 239.0 % |

(Metric is runs/second, larger numbers are therefore better)

We also measure json decoding on helper isolates (x1 or x4 isolates). The isolates receive bytes, perform utf-8 decoding followed by json decoding, and send the result back.

We can observe

  • Due to re-using JITed code, the helper isolate is much faster in decoding json than before (where it had to re-JIT everything).

  • The faster isolate communication (which is only part of this benchmark's work) has led to a 1.5x-3x speedup of the benchmark.
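The shape of the IsolateJson benchmark can be sketched as follows (simplified; the actual benchmark harness differs):

```dart
import 'dart:convert';
import 'dart:isolate';
import 'dart:typed_data';

// Helper isolate: receive bytes, utf-8 decode, json decode, reply.
void _decoder(List<Object> args) {
  final sendPort = args[0] as SendPort;
  final bytes = args[1] as Uint8List;
  sendPort.send(json.decode(utf8.decode(bytes)));
}

Future<void> main() async {
  final bytes = Uint8List.fromList(utf8.encode('{"answer": 42}'));
  final port = ReceivePort();
  await Isolate.spawn(_decoder, [port.sendPort, bytes]);
  final decoded = await port.first as Map;
  print(decoded['answer']); // prints 42
}
```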

| Benchmark | JIT Before | JIT After | JIT Change | AOT Before | AOT After | AOT Change |
| --- | --- | --- | --- | --- | --- | --- |
| EventLoopLatencyJson.Percentile95 | 1008.00 | 13815.0 | 1271 % | 1006.00 | 11897.0 | 1083 % |
| EventLoopLatencyJson.Percentile99 | 1023.00 | 19878.0 | 1843 % | 1015.00 | 18424.0 | 1715 % |
| EventLoopLatencyJson350KB.Percentile95 | 1010.00 | 1400.00 | 38.61 % | 1005.00 | 1010.00 | 0.4975 % |
| EventLoopLatencyJson350KB.Percentile99 | 1018.00 | 2112.00 | 107.5 % | 1012.00 | 1992.00 | 96.84 % |

This shows us that if helper isolates allocate a lot of objects that die young, the pause times in the main isolate are not affected much (1-2 ms). If the generational hypothesis doesn't hold (all objects survive - a hypothetical worst-case scenario), then the main isolate will have the same pause time as the helper isolate that triggers young-generation collections (since the young generation is collected stop-the-world) - which is between 10-20 ms.

For Flutter, though, this would look different, since it uses idle time between frames to trigger GCs, thereby doing more young-space collections (before the space is full) and therefore reducing the pause times (for this hypothetical worst-case scenario).

@gmpassos
Contributor

gmpassos commented Jul 30, 2021

This is a very important "upgrade" to DartVM Isolate.

I want to highlight the context of changes in performance:

  • Isolates in the same group will collaborate with the same GC:

    • Since the amount of memory shared between isolates is increased (which reduces the total number of objects), the total CPU time spent in GC is reduced (compared with multiple isolates in separate groups).
    • Total JIT time is reduced (compared with multiple isolates in separate groups).
  • The current Dart VM isolate situation (Dart 2.13.4) makes it impractical to spawn a high number of isolates, due to bottlenecks (inherent in the current design) in memory, GC and JIT.

    • The new approach suggests a theoretical decrease in performance for servers with a high number of isolates, but that is already difficult in the current situation. Sharing memory, GC and JIT can actually improve the capacity for isolates.
  • The gain in SendPort performance has a big impact.

    • The main trade-off of message-based parallelism (isolates) is the time to serialize, send, receive and deserialize the messages. Sometimes this trade-off makes it impractical to use multiple isolates, since the time to send a message is close to the time to compute the related task.
    • The significant improvement in SendPort increases the number of cases where it is worthwhile to use a Dart VM isolate. The current isolate performance actually makes many scenarios impractical for parallelism in the current Dart VM.

Some questions:

  • Will it be possible to force a newly spawned isolate to be in a different group?
    • This can be interesting for crafting a solution that separates GCs.
    • This feature is important not only for spawnUri, but for any kind of entrypoint (normal Isolate.spawn).

@aam
Contributor

aam commented Jul 30, 2021

Will it be possible to force a newly spawned isolate to be in a different group?

Isolate.spawnUri (unlike Isolate.spawn) spawns the new isolate in its own isolate group.

@gmpassos
Contributor

It would be interesting to spawn a normal entrypoint/function (not spawnUri) in a different group, to allow specific optimizations for some solutions.

@gmpassos
Contributor

gmpassos commented Jul 30, 2021

It's important to test whether spawnUri with a different projectPackageConfig still works well with all these changes.

Also, when the provided projectPackageConfig is the same as the current isolate's, this should be treated as the same "group" (if spawnUri gains the ability to spawn into the current group).

See package dart_spawner for use cases:
https://pub.dev/packages/dart_spawner

@mkustermann
Member Author

Will it be possible to force a newly spawned isolate to be in a different group?
This can be interesting for crafting a solution that separates GCs.
This feature is important not only for spawnUri, but for any kind of entrypoint (normal Isolate.spawn).

That is indeed a very interesting question, one we have also thought about.

In fact, right now we have an internal boolean flag that can be used to make Isolate.spawn(<entry>, ...) spawn <entry> in a newly created isolate group (basically the old behavior). We intentionally did not expose this in the API.

The reasons for that are manifold:

  • There are current limitations (which we intend to lift with this change, see sdk/issues/46623) on what isolates can send to each other. Those limitations often make using isolates hard or cumbersome (e.g. one cannot use a closure as an <entry>). Those restrictions are in place mostly because lifting them is quite hard to implement for our current share-nothing isolates.
    => If we allowed Isolate.spawn() to spawn new isolate groups, those isolates would be more restricted in what they can communicate, effectively leaving the open feature requests in sdk/issues/46623 unaddressed for that use case.

  • Isolates created with Isolate.spawn(<entry>) are today share-nothing and run in different isolate groups. Yet we still allow them to communicate user-defined objects with each other.
    That poses a problem for our very popular hot-reload development feature: a hot-reload acts on one isolate group only, so there is no way to atomically perform a reload of multiple isolates (if they are in different isolate groups). Right now developer tools try to work around that by applying the same program change to all alive isolate groups (hoping that the change gets accepted by all of them or by none, and that no isolate creations are in progress). It is a best-effort approach that, if it fails, can lead to crashes.
    Furthermore, after hot-reloads are performed and new isolates are spawned, they would need to be created by loading the initial program and applying all past reload changes - in a guaranteed way before the new isolate interacts with others (which is hard to guarantee). It would also require the VM to keep the initial program as well as all program diffs (which can be quite big, because those diffs are currently represented in a very coarse-grained way [much bigger than needed]) in memory indefinitely, effectively creating a memory leak.
    => By making Isolate.spawn() only spawn lightweight isolates within the same group, we fix all of those issues.
    => By still allowing Isolate.spawn() to spawn into new groups, we would leave those existing issues unfixed.

So in summary:

Isolate.spawn() is the mechanism to use to create isolates from the exact same (as well as possibly hot-reloaded) application code. We want to allow rich message exchanges between them (including user defined classes, closures, ...). It currently has a lot of issues which we are trying to solve with this work on lightweight isolates.

Isolate.spawnUri() is our mechanism for creating isolates from possibly different application code (possibly the same app, but with different hot-reload state). They will live in their own newly created isolate group. We intentionally restrict communication between such isolate groups to JSON-like data (no user-defined classes, closures, ...) - because there is no guarantee that the code in the spawner and spawnee isolates is compatible.

As mentioned in the mitigations, one can always use Isolate.spawnUri() on the same code as the original isolate, thereby achieving the goal of a separate isolate group, separate heap and independent GC - but one will have to accept the limited communication.

@gmpassos It was a long explanation, but I hope it makes some sense?

@gmpassos
Contributor

gmpassos commented Jul 31, 2021

Thanks for the good answer!

So, how about 2 types of isolate modes, lightweight and fully isolated? Depending on the mode, the category of types shared between isolates is broader or more restricted. If the types sent between isolates are controlled correctly, the issues go away (correct me if I'm wrong).

I haven't looked at the new code, but SendPort will need to check whether both ends are in the same group, to determine the correct category of types to be shared, or issues can happen.

Note that I vote for a future where the features at #46623 are implemented.

Another question:

  • How will SendPort work with this new implementation when Isolate.spawnUri is used? (It seems that you already have 2 modes.)

BTW, nice job! This is hard and very important work.

@mnordine
Contributor

mnordine commented Aug 2, 2021

@mkustermann Is there anywhere we can read on how isolates in an isolate group are distributed across CPU cores?

@mkustermann
Member Author

So, how about 2 types of isolate modes, lightweight and fully isolated? Depending on the mode, the category of types shared
between isolates is broader or more restricted. If the types sent between isolates are controlled correctly, the issues go away
(correct me if I'm wrong).

That is precisely what happens already now:

  • Isolates spawned via Isolate.spawn() will run the same code and allow sending rich user-defined data structures.
  • Isolates spawned via Isolate.spawnUri() will run (possibly different) code and only allow json-like data to be transferred.

SendPort.send() knows the destination and will apply the appropriate validation to the transitive object graph (it already does that now).
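For example, a spawn()ed sibling may exchange instances of user-defined classes (they are copied between the isolates), while a spawnUri()ed group would reject them. A sketch (the Point class and entry point are illustrative):

```dart
import 'dart:isolate';

class Point {
  final int x, y;
  const Point(this.x, this.y);
}

void _entry(List<Object> args) {
  final sendPort = args[0] as SendPort;
  final p = args[1] as Point; // a user-defined object crossed isolates
  sendPort.send(Point(p.y, p.x)); // sending one back is also allowed
}

Future<void> main() async {
  final port = ReceivePort();
  // Allowed: same code, spawn()ed sibling. Sending a Point to a
  // spawnUri()ed isolate group would fail send-time validation.
  await Isolate.spawn(_entry, [port.sendPort, const Point(1, 2)]);
  final swapped = await port.first as Point;
  print('(${swapped.x}, ${swapped.y})');
}
```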

BTW, nice job! This is a hard and very important work.

Glad to receive the positive feedback 👍

@mkustermann Is there anywhere we can read on how isolates in an isolate group are distributed across CPU cores?

It's not explicitly documented anywhere AFAIK.

The Dart VM is quite flexible, so the answer depends: for example in Flutter (an embedder of the Dart VM), the Flutter engine decides the specific OS thread on which the UI isolate runs and how it processes messages (helper isolates are run by the VM). When an isolate is idle and receives a message, the VM will run its message handler on a VM-internal thread pool.

That means over the lifetime of an isolate, it may run on different OS threads. So far we have not had a need for thread-pinning support, but it may come up in the future.

The operating system is responsible for mapping OS threads (e.g. pthreads on Linux) onto CPU cores. There, too, an OS thread can - over its lifetime - run on different cores, depending on the OS scheduler. We don't do any core-pinning in the VM.

@gmpassos
Contributor

gmpassos commented Aug 2, 2021

@mkustermann if the Dart VM already has 2 modes of communication, it would be good to be able to know the mode/capabilities of communication from the current isolate.

With the correct documentation and helpers to know the current status/mode, developers won't have surprises and will be able to craft what they need.

I vote for:

  • spawnUri: isolate in a different group (separate JIT and GC).
  • spawn: 2 modes
    • Sibling: same group (same JIT and GC) with broader types for SendPort.
    • Colleague: separate groups (separate JIT and GC) with limited types for SendPort.
  • the mode names could be better 😎

About "thread-pinning":

Note that most OSes and devices try not to overheat a specific core, so they rotate the "tasks" between cores to avoid that.

@mit-mit
Member

mit-mit commented Aug 3, 2021

@Hixie @vsmenon can you approve this breaking change request?

@Hixie
Contributor

Hixie commented Aug 6, 2021

Improving performance seems great to me.

@vsmenon
Member

vsmenon commented Aug 6, 2021

lgtm

@mit-mit mit-mit added the enhancement-breaking-change An enhancement which is breaking. label Aug 9, 2021
@mit-mit
Member

mit-mit commented Aug 9, 2021

Marking approved

@mkustermann
Member Author

Thank you all for the great discussions. The breaking change has been approved and we'll be performing this change sometime in the next couple of weeks (after the current stable branch is cut) - allowing a long baking time on the dev and beta branches.

If there are any more feature requests, performance bugs or anything else, please file a new github issue under https://github.com/dart-lang/sdk/issues/new

Last replies on this thread:

@mkustermann if the Dart VM already has 2 modes of communication, it would be good to be able to know the mode/capabilities of
communication from the current isolate.

With the correct documentation and helpers to know the current status/mode, developers won't have surprises and will be
able to craft what they need.

The documentation has recently been updated to remove some ambiguity. You can see the newest docs at SendPort.send. It lists what is always supported and what is supported if-and-only-if Isolate.spawn was used.

I vote for:

  • spawnUri: isolate in a different group (separate JIT and GC).
  • spawn: 2 modes
    Sibling: same group (same JIT and GC) with broader types for SendPort.
    Colleague: separate groups (separate JIT and GC) with limited types for SendPort.

Supporting Isolate.spawn() into new groups has its issues. As outlined above, it is especially problematic with hot-reload - one could spawn before/after a reload, and the new isolate group would need to have the before/after program state. That might lead to memory leaks of hot-reload diffs.

The complexity and issues involved in supporting this make us believe it is not a good choice (also considering users can use Isolate.spawnUri - although a little less convenient). If there is really strong demand for this (with actual real-world use cases where the Sibling solution is insufficient), we may reconsider.

@gmpassos If you feel strongly about it, I would encourage you to file a new github issue as a feature request; any discussion can then continue there.

About "thread-pinning":
Note that most OSes and devices try not to overheat a specific core, so they rotate the "tasks" between cores to avoid that.

What I mean by thread pinning is that a given isolate is "pinned" to a specific OS thread (e.g. a pthread). There are use cases where this is needed (e.g. interacting with C code that uses thread-local storage). It doesn't mean that the OS thread is pinned to a specific CPU core.

@gmpassos
Contributor

@mkustermann thanks for the response.

About the "spawn in a separated group":

Now I understand the complexity of implementing it better. I think that spawnUri and Platform.script
(https://api.dartlang.org/stable/dart-io/Platform/script.html) can resolve most use cases when lightweight isolates go to production.

About OS thread pinning: this can be very useful to avoid issues with dart:ffi. It will also help integrations with C/C++ or existing compiled libraries that need it.

Look at the issues Python has with its mandatory GIL (Global Interpreter Lock) - an approach that is totally wrong in my opinion.

Regards.

@xster
Contributor

xster commented Aug 15, 2021

Can we create/post some sample code for this? Also cc @RedBrogdon for devrel.

@mit-mit
Member

mit-mit commented Aug 16, 2021

Can we create/post some sample code for this?

That'd be nice! But please note that this won't be in stable until 2.15.

@mkustermann
Member Author

The flag is now on by-default in all modes. Closing this issue.

Can we create/post some sample code for this? Also cc @RedBrogdon for devrel.

We'll ensure there's good documentation by the time the stable is released.

copybara-service bot pushed a commit that referenced this issue Sep 22, 2021
Issue #46754
Issue #36097

TEST=ci

Change-Id: Ic0b1ecf88790576ae1f31b6a003b2175b9af1c66
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/213343
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Kevin Moore <[email protected]>
@maxim-saplin

maxim-saplin commented Nov 10, 2021

Any chance there's a detailed doc on migrating to Isolate.spawnUri() to preserve the legacy behaviour of isolates? While refactoring my code I've managed to launch isolates via this method, yet there's a native crash, most likely on the main isolate side when receiving messages from the spawned isolates via SendPort/ReceivePort. There's little I was able to find on the internet that might help with troubleshooting:

../../runtime/vm/message_snapshot.cc: 557: error: expected: !cls.IsNull()
version=2.14.4 (stable) (Wed Oct 13 11:11:32 2021 +0200) on "macos_x64"
pid=80103, thread=13323, isolate_group=main(0x7f9a58926000), isolate=main(0x7f9a5892b000)
isolate_instructions=10bdf30a0, vm_instructions=10bdf30a0
  pc 0x000000010c05b154 fp 0x000070000a3f3b00 dart::Profiler::DumpStackTrace(void*)+0x64
  pc 0x000000010bdf3274 fp 0x000070000a3f3be0 dart::Assert::Fail(char const*, ...)+0x84
  pc 0x000000010bfbf71a fp 0x000070000a3f3c40 dart::ReadApiMessage(dart::Zone*, dart::Message*)+0x896a
  pc 0x000000010bfb6334 fp 0x000070000a3f3cb0 dart::MessageDeserializer::Deserialize()+0x274
  pc 0x000000010bfb6d6f fp 0x000070000a3f3d00 dart::ReadMessage(dart::Thread*, dart::Message*)+0x5f

  pc 0x000000010bf849c9 fp 0x000070000a3f3de0 dart::IsolateMessageHandler::HandleMessage(std::__2::unique_ptr<dart::Message, std::__2::default_delete<dart::Message> >)+0x1a9
  pc 0x000000010bfb1d4c fp 0x000070000a3f3e50 dart::MessageHandler::HandleMessages(dart::MonitorLocker*, bool, bool)+0x12c
  pc 0x000000010bfb247f fp 0x000070000a3f3eb0 dart::MessageHandler::TaskCallback()+0x1df
  pc 0x000000010c0e5bd8 fp 0x000070000a3f3f30 dart::ThreadPool::WorkerLoop(dart::ThreadPool::Worker*)+0x148
  pc 0x000000010c0e603d fp 0x000070000a3f3f60 dart::ThreadPool::Worker::Main(unsigned long)+0x5d
  pc 0x000000010c055b1f fp 0x000070000a3f3fb0 dart::OSThread::GetMaxStackSize()+0xaf
  pc 0x00007ff818400514 fp 0x000070000a3f3fd0 _pthread_start+0x7d
  pc 0x00007ff8183fc02f fp 0x000070000a3f3ff0 thread_start+0xf
-- End of DumpStackTrace

@aam
Contributor

aam commented Nov 10, 2021

Any chance there's a detailed doc on migrating to Isolate.spawnUri() to preserve the legacy behaviour of isolates?

Generally speaking, spawnUri'ed isolates have limitations regarding what can be sent to them that have to be worked around (https://api.dart.dev/dev/2.16.0-0.0.dev/dart-isolate/SendPort/send.html); they cannot be used as a direct replacement for spawn'ed isolates (lightweight or legacy heavyweight). Also note that spawnUri is not supported in the AOT configuration.

While refactoring my code I've managed to launch isolates via this method, yet there's a native crash, most likely on the main isolate side when receiving messages from spawned isolates via SendPort/ReceivePort, and there's little I was able to find on the internet that might help with troubleshooting:

Sorry about the crash. Would you mind opening up a new issue with hopefully some instructions on how to reproduce it?


@maxim-saplin

maxim-saplin commented Nov 10, 2021

@aam thanks for the clarifications! The incompatibility of spawnUri() with AOT effectively means you can't use it with Flutter...

The above logs come from the Dart VM running unit tests, and the most probable cause is that the kind of payload that was OK with the legacy spawn() is now not supported in spawnUri() - I don't think it is worth a separate issue.

P.S.: I'm opening a separate issue regarding performance troubles with this breaking change shortly.
P.P.S.: Having some toggle for the old heavyweight isolates would be a remedy for my app.

@aam
Contributor

aam commented Nov 10, 2021

@aam thanks for the clarifications! The incompatibility of spawnUri() with AOT effectively means you can't use it with Flutter...

To provide further clarification: conceptually, when using spawnUri in AOT you would have to point at a Dart VM snapshot rather than .dart source code. If you do that, the Dart VM should be able to spawn a new isolate group from the snapshot you provided. This has not been well documented or provisioned in the AOT build flow (in Flutter, for example). Basically, those AOT VM snapshots have to be built/prepared ahead of time, and the build/distribution setup has to ensure that child snapshots are compatible with parent snapshots.
Before diving deeper into this, it would be helpful to understand the use case that requires spawnUri (in Flutter or anywhere else) rather than spawn.

The above logs come from the Dart VM running unit tests, and the most probable cause is that the kind of payload that was OK with the legacy spawn() is now not supported in spawnUri() - I don't think it is worth a separate issue.

It would help if you could share the command line with which the unit tests were launched and the revision of the Dart SDK where you see this happening. It should not be happening. :-)

P.S.: I'm opening a separate issue regarding performance troubles with this breaking change shortly. P.P.S.: Having some toggle for the old heavyweight isolates would be a remedy for my app.

Okay, please cc me and @mkustermann on those.

@mtc-jed

mtc-jed commented Mar 3, 2022

@mkustermann, you mention sharing String objects between isolates of the same group ("allow sharing of objects (program structure, JITed code, constants as well as any String objects - in the future possibly also user-defined data structures)").
How is this done? Is this only internal to the engine?

@mkustermann
Member Author

... you mention sharing String objects between isolates of the same group. ... How is this done? Is this only internal to the engine?

When sending messages (e.g. via SendPort.send(<message>)) to other isolates that were spawned using Isolate.spawn() (or higher-level wrappers, e.g. Flutter's compute() function), the <message> graph is transitively copied, but certain objects in it are not copied but shared; that includes String objects.

That means if you e.g. spawn an isolate which loads data from the internet and decodes the bytes to a string, you can send that string to the UI isolate in O(1) time, and it will be sent by-pointer.

We can do that because String objects (like some other objects) are transitively immutable.
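For illustration, a minimal sketch of this pattern (the worker function, the 10 MB size, and the payload are arbitrary; on a VM with isolate groups the send can share the String by pointer rather than deep-copying it):

```dart
import 'dart:isolate';

// Worker: builds a large String, standing in for downloaded + decoded data.
void worker(SendPort sendPort) {
  final body = 'a' * (10 * 1024 * 1024); // 10 MiB string
  // Strings are transitively immutable, so this send can be shared
  // by pointer instead of deep-copied.
  sendPort.send(body);
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(worker, port.sendPort);
  final result = await port.first as String;
  print(result.length); // 10485760
}
```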

@mtc-jed

mtc-jed commented Mar 3, 2022

Ah ok, here I thought there was a way to access shared memory freely; I misunderstood.
Are there any plans to add a way for pointers to any object to be transmissible? This would be highly useful when downloading big JSON payloads (thousands of objects).

@mkustermann
Member Author

Are there any plans to add a way for pointers to any object to be transmissible? This would be highly useful when downloading big JSON payloads (thousands of objects).

We do have some limited support via Isolate.exit(): it avoids a transitive copy by exiting the current isolate and giving the message to the receiver isolate by-pointer. Though it still performs a possibly O(n) verification pass on the sender side.

There have been some talks about allowing general shared (mutable) memory (as in e.g. Java) - though we have no concrete plans at the moment to introduce that.
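A minimal sketch of the Isolate.exit() pattern (the worker function and the list payload are illustrative; requires Dart 2.15+):

```dart
import 'dart:isolate';

// Worker: produces a large result, then terminates and hands it to the
// spawner by pointer via Isolate.exit (no transitive copy; only a
// verification pass runs on this, the sending, side).
void worker(SendPort resultPort) {
  final decoded = List<int>.generate(1 << 20, (i) => i * 2);
  Isolate.exit(resultPort, decoded);
}

Future<void> main() async {
  final port = ReceivePort();
  await Isolate.spawn(worker, port.sendPort);
  final result = await port.first as List<int>;
  print(result.length); // 1048576
}
```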

@mit-mit
Member

mit-mit commented Mar 3, 2022

@mtc-jed I'd love it if you could try the Isolate.exit() approach Martin mentions, and then give us your feedback on whether that worked or not. As Martin details, the verification pass is on the sender side, so it should not cause any slowdown of your UI on the main isolate.

@gmpassos
Contributor

gmpassos commented Mar 3, 2022

Not an Isolate solution, but still a shared pointer solution for the Dart VM:

I needed to implement something similar, and one way was through dart:ffi.

Take a look at:

Note that the shared pointer (and the related memory segment) won't have any concurrency control (no mutex). So it only works for some scenarios, where one side only writes and the other side reads and is capable of knowing whether the read operation was successful.

If dart:ffi had some mutex control for accessing a pointer's bytes, it would be possible to do a lot of interesting things.
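For concreteness, a minimal sketch of that dart:ffi approach (assumes the package:ffi calloc allocator; the 16-byte buffer, the worker function, and the 'done' handshake are illustrative, not a production design - there is no mutex, so it is only safe because exactly one side writes):

```dart
import 'dart:ffi';
import 'dart:isolate';

import 'package:ffi/ffi.dart'; // calloc

// Pass only the native buffer's address (a plain int) to another isolate;
// both isolates then view the same bytes through Pointer.fromAddress.
void worker(List<Object> args) {
  final sendPort = args[0] as SendPort;
  final address = args[1] as int;
  final bytes = Pointer<Uint8>.fromAddress(address).asTypedList(16);
  bytes[0] = 42; // write into the shared segment
  sendPort.send('done'); // crude handshake: tell the reader we are finished
}

Future<void> main() async {
  final buffer = calloc<Uint8>(16); // zero-initialized native memory
  final port = ReceivePort();
  await Isolate.spawn(worker, [port.sendPort, buffer.address]);
  await port.first; // wait for the writer before reading
  print(buffer.asTypedList(16)[0]); // 42
  calloc.free(buffer);
}
```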

Another important thing about Isolate communication:

  • I didn't know that String (an immutable object) won't be copied between Isolates. Maybe this subject deserves some documentation.
  • Are const instances copied between Isolates? How are they actually allocated in the VM across Isolates?
  • Immutable Lists, Sets and Maps could have some way to be created as immutable and shared/"uncopied" between Isolates. For now we only have immutable views of these collections, not an actual immutable object like String.

Note that in some applications, where we use Isolates to use all the cores of the computer, the time it takes to return the data is significant, but we are not using the Isolate.exit approach since we need to keep the Isolate running - it wouldn't be efficient to recreate it and send all the initial data to be processed again. Isolate.exit is a nice strategy, but it's at odds with bootstrapping a task in a new Isolate.

@mtc-jed

mtc-jed commented Mar 4, 2022

@gmpassos Very cool; however, this only works with a list of bytes. I am not aware of any way in Dart to cast an arbitrary user-defined object to its bytes.

@mtc-jed

mtc-jed commented Mar 4, 2022

@mit-mit Running tests on 2.16.1, I get a linearly growing delay between the call to Isolate.exit() and the running of a print() in the calling Isolate. This seems in line with the behaviour described by @mkustermann.
I will probably use the Isolate.exit() method for my particular use case (a one-time download of a bunch of data, so the app can function without an internet connection), since it is clearly well suited.

That being said, pass-by-pointer would very much be appreciated, not only by myself but also by a lot of other people (the benefits of this for a local database could be tremendous, I believe).
The question that's bugging me is: what is gained by preventing developers from passing pointers to other Isolates?

Here's how data downloading would work with pointer passing:
I start my Flutter app. An Isolate is spawned to handle data downloading, and deserialization to a user-defined object.
I order my Isolate to download data, and the Isolate responds after a time with a pointer to the object.
Repeat however many times you want.
Cost to the main Isolate: 1 Isolate.spawn() at app startup.

Without pointer passing, we can only :

  • Pass by value. The main Isolate incurs the cost of deserialization.
  • Use Isolate.exit(). The main Isolate incurs the cost of spawning a new Isolate for every download.

I do not see, AFAIK, a reason to prevent developers from implementing this use case. I of course have only barebones knowledge of the Dart VM and of the general philosophy applied when developing the language; feel free to enlighten me.
I am of the opinion that developers should be given as much freedom as possible, provided the right tools to manage that freedom are available. In addition, implementing such a solution seems trivial, given that the work already done for Isolate.exit() is closely related (I could be 100% wrong on this).

To note: Isolate spawning is apparently way less costly with 2.15's Isolate groups. So Isolate.exit() seems like a sustainable solution, but there's no denying it could be much better with pointer passing.

@gmpassos
Contributor

gmpassos commented Mar 4, 2022

@mtc-jed

mtc-jed commented Mar 4, 2022

@gmpassos This would mean I still have to deserialize my data in the main Isolate, which is exactly the cost I'm trying to avoid.

@gmpassos
Contributor

gmpassos commented Mar 4, 2022

Yes, I agree. To have the originally allocated memory handed over from one Isolate to another, the only way right now is Isolate.exit.

@mraleph
Member

mraleph commented Mar 4, 2022

The question that's bugging me is: what is gained by preventing developers from passing pointers to other Isolates?

It's a language-level design decision - allowing arbitrary objects to be passed around creates shared memory, with all the associated issues and pitfalls. So a simpler programming model is gained by outlawing shared memory.

That being said, I suggest moving discussions about shared memory to other channels - it is off-topic here.

If you want to outline specific issues you are facing, there is dart-lang/language#333

@mkustermann
Member Author

Maybe this issue isn't the right place for this discussion. Could I ask you to open new issues for feature requests / bug reports?

Regarding dart:ffi

Yes, it's the escape hatch from the sound and safe world. It allows unsafe access to shared writable C memory from multiple isolates.

Using dart:ffi it is also possible to call into C for auxiliary things (e.g. acquire/release locks, release/acquire memory ordering, fences, atomics, etc.). Maybe we will eventually provide some of those primitives in dart:ffi itself instead of requiring a call out to C code.

Especially regarding blocking synchronization mechanisms like locks, I'd like to mention that they must be used with care: Dart's event-loop-based programming model relies on Dart code not blocking synchronously for long periods of time. This is especially important for Flutter apps, where the UI isolate needs to be able to render animations at 60+ fps.

@gmpassos
Contributor

gmpassos commented Mar 4, 2022

That being said, I suggest moving discussions about shared memory to other channels - it is off-topic here.

It can be a little bit off-topic, but we are actually talking about shared memory as an alternative given the difficulty of sending large amounts of data or complex objects between Isolates. We don't really want to use shared memory; we'd prefer an elegant and native approach. The idea here is just to show how we are bypassing the performance issue, so that the awesome job done with Isolate groups can achieve its full potential with some future improvements.

Best regards.
