Op bunching 2: Bunch contiguous ops for data store in a batch - data store part #22840
Conversation
Code Coverage Summary
↓ packages.runtime.container-runtime.src:
Line Coverage Change: -0.11% Branch Coverage Change: 0.04%
Metric Name | Baseline coverage | PR coverage | Coverage Diff |
---|---|---|---|
Branch Coverage | 92.98% | 93.02% | ↑ 0.04% |
Line Coverage | 94.70% | 94.59% | ↓ -0.11% |
↑ packages.runtime.datastore.src:
Line Coverage Change: 0.17% Branch Coverage Change: 0.34%
Metric Name | Baseline coverage | PR coverage | Coverage Diff |
---|---|---|---|
Branch Coverage | 89.81% | 90.15% | ↑ 0.34% |
Line Coverage | 93.89% | 94.06% | ↑ 0.17% |
Baseline commit: 362f1ad
Baseline build: 302845
Happy Coding!!
Code coverage comparison check passed!!
⯅ @fluid-example/bundle-size-tests: +6.08 KB
Baseline commit: 55bed2d
packages/runtime/datastore/api-report/datastore.legacy.alpha.api.md
Approving for docs, with a few more suggestions below.
🔗 No broken links found! ✅ Your attention to detail is admirable. linkcheck output
…2841) ## Reviewer guidance

This is part 3 of 3 of the op bunching feature. This part focuses on the changes in the DDS. [Part 1](#22839) and [part 2](#22840).

## Problem

During op processing, the container runtime sends ops one at a time to data stores and DDSes. If a DDS has received M contiguous ops as part of a batch, the DDS is called M times to process them individually. This has performance implications for some DDSes, and they would benefit from receiving and processing these M ops together.

Take shared tree for example: for each op received which has a sequenced commit, all the pending commits are processed by the rebaser. So, as the number of ops received grows, so does the processing of pending commits. The following example describes this clearly: currently, if a shared tree client has N pending commits which have yet to be sequenced, each time that client receives a sequenced commit authored by another client (an op), it will update each of its pending commits, which takes at least O(N) work. Instead, if it receives M commits at once, it could do a single update pass on each pending commit instead of M per pending commit. It can compose the M commits together into a single change to update over, so it can potentially go from something like O(N * M) work to O(N + M) work with batching.

## Solution - op bunching

The solution implemented here is a feature called "op bunching". With this feature, contiguous ops in a grouped op batch that belong to a data store / DDS will be bunched and sent to it in an array - the grouped op is sent as an `ISequencedMessageEnvelope` and the individual message `contents` in it are sent as an array along with the `clientSequenceNumber` and `localOpMetadata`. The container runtime will send a bunch of contiguous ops to each data store. The data store will send a bunch of contiguous ops to each DDS. The DDS can choose how to process these ops. Shared tree, for instance, would compose the commits in all these ops and update pending commits with the result. Bunching only contiguous ops for a data store / DDS in a batch preserves the behavior of processing ops in the sequence they were received.

A couple of behavior changes to note:

1. Op events - An implication of this change is that the timing of "op" events emitted by the container runtime and the data store runtime will change. Currently, these layers emit an "op" event immediately after an op is processed. With this change, an upper layer will only know when a bunch has been processed by a lower layer. So, it will emit "op" events for the individual ops in the bunch after the entire bunch is processed. From my understanding, this should be fine because we do not provide any guarantee that the "op" event will be emitted immediately after an op is processed. These events will be emitted in order of op processing and (sometimes) after the op is processed. Take delta manager / container runtime as an example: the delta manager sends an op for processing to the container runtime and emits the "op" event. However, the container runtime may choose not to process these ops immediately but save them until an entire batch is received. This change was made but was reverted due to some concerns not related to the topic discussed here - #21785. The change here is similar to the above behavior, where an upper layer doesn't know and shouldn't care what lower layers do with ops.
2. `metadata` property on message - With this PR, the `metadata` property is removed from a message before it's sent to data stores and DDSes. This is because we now send one common message (the grouped op) and an array of contents. Individual messages within a grouped op have batch begin and end metadata, but they are only added by the runtime to keep them like old batch messages. The data store and DDS don't care about them, so removing them should be fine.

[AB#20123](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/20123)
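For readers new to the feature, the following TypeScript sketch shows roughly what a bunched payload looks like. Only `contents`, `clientSequenceNumber`, `localOpMetadata` and the `ISequencedMessageEnvelope` name come from the description above; the interface and field names here are illustrative, not the runtime's actual type definitions.

```typescript
// Illustrative shape only - not the actual runtime contracts.
interface SequencedEnvelope {
	// Metadata shared by every op in the bunch (the grouped op's envelope).
	sequenceNumber: number;
	clientId: string | null;
}

interface BunchedMessageContent {
	contents: unknown; // an individual message's contents
	clientSequenceNumber: number;
	localOpMetadata: unknown; // only meaningful for the client's own (local) ops
}

interface OpBunch {
	envelope: SequencedEnvelope; // sent once for the whole bunch
	messagesContent: readonly BunchedMessageContent[]; // contiguous ops for one data store / DDS
}
```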
…ect (#23836) This PR adds support for DDSes to process a "bunch" of messages. Op bunching was added via the following PRs: #22839, #22840 and #22841.

Added a `processMessagesCore` function to `SharedObjectCore` that DDSes can override to process a message bunch. It is marked optional today because making it required would be a breaking change. It is similar to the `processCore` function and will replace it eventually. For now, `processCore` has been marked as deprecated.

This change updates the relative ordering of the `pre-op` and `op` events emitted by the shared object. Previously, the `pre-op` event would be emitted before a message was processed and the `op` event was emitted immediately afterwards. With this change, the ordering of these events w.r.t. the message being processed is still the same; however, there may be other ops processed and other events fired in between. Note that the only guarantee these events provide is that the `pre-op` event is emitted before a message is processed and the `op` event is emitted after it, so this change doesn't break that guarantee.
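As a rough sketch of what a DDS override could look like (the parameter shape below is simplified and assumed, not the actual `SharedObjectCore` contract):

```typescript
// Minimal sketch of a DDS opting into bunched processing. The collection is
// assumed to carry one shared envelope plus an array of per-message entries.
class CounterLikeDds /* extends SharedObjectCore */ {
	private value = 0;

	// New path: receive the whole bunch and process it in a single pass.
	protected processMessagesCore(collection: {
		envelope: { sequenceNumber: number };
		local: boolean;
		messagesContent: readonly {
			contents: unknown;
			clientSequenceNumber: number;
			localOpMetadata: unknown;
		}[];
	}): void {
		for (const { contents } of collection.messagesContent) {
			this.value += (contents as { increment: number }).increment;
		}
	}

	// Old path (deprecated): called once per individual message.
	protected processCore(message: { contents: unknown }): void {
		this.value += (message.contents as { increment: number }).increment;
	}
}
```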
Deprecating `process` on `FluidDataStoreRuntime`. It was deprecated on `IFluidDataStoreChannel` (and other places) as part of #22840, but deprecating it on `FluidDataStoreRuntime` was missed.
Reviewer guidance
This is part 2 of 3 of the op bunching feature. This part focuses on the changes in the data store. Part 1 - #22839
Problem
During op processing, the container runtime sends ops one at a time to data stores and DDSes. If a DDS has received M contiguous ops as part of a batch, the DDS is called M times to process them individually. This has performance implications for some DDSes, and they would benefit from receiving and processing these M ops together.
Take shared tree for example:
For each op received which has a sequenced commit, all the pending commits are processed by the rebaser. So, as the number of ops received grows, so does the processing of pending commits. The following example describes this clearly:
Currently, if a shared tree client has N pending commits which have yet to be sequenced, each time that client receives a sequenced commit authored by another client (an op), it will update each of its pending commits, which takes at least O(N) work.
Instead, if it receives M commits at once, it could do a single update pass on each pending commit instead of M per pending commit.
It can compose the M commits together into a single change to update over, so it can potentially go from something like O(N * M) work to O(N + M) work with batching.
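To make the complexity argument concrete, here is a toy sketch (not shared tree's actual rebaser; `Change` and `rebase` are placeholder stand-ins) contrasting M separate update passes with one pass over a composed change:

```typescript
// Toy model: a "change" is just a number and rebasing/composing is addition.
type Change = { delta: number };

const rebase = (pending: Change, over: Change): Change => ({
	delta: pending.delta + over.delta,
});

// One op at a time: M rebase passes over the N pending commits -> ~O(N * M).
function rebaseOneAtATime(pending: Change[], sequenced: Change[]): Change[] {
	let result = pending;
	for (const incoming of sequenced) {
		result = result.map((c) => rebase(c, incoming)); // O(N) per incoming op
	}
	return result;
}

// Bunched: compose the M commits once, then do a single O(N) pass -> ~O(N + M).
function rebaseBunched(pending: Change[], sequenced: Change[]): Change[] {
	const composed = sequenced.reduce(rebase, { delta: 0 }); // O(M)
	return pending.map((c) => rebase(c, composed)); // O(N)
}
```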
Solution - op bunching
The solution implemented here is a feature called "op bunching".
With this feature, contiguous ops in a grouped op batch that belong to a data store / DDS will be bunched and sent to it in an array - the grouped op is sent as an `ISequencedMessageEnvelope` and the individual message `contents` in it are sent as an array along with the `clientSequenceNumber` and `localOpMetadata`.
The container runtime will send a bunch of contiguous ops to each data store. The data store will send a bunch of contiguous ops to each DDS. The DDS can choose how to process these ops. Shared tree, for instance, would compose the commits in all these ops and update pending commits with the result.
Bunching only contiguous ops for a data store / DDS in a batch preserves the behavior of processing ops in the sequence they were received.
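To illustrate the contiguity rule, here is a hypothetical helper (not code from this PR) that splits a batch into bunches whenever the target changes, so processing order is preserved:

```typescript
// A batch [A, A, B, A] becomes bunches [[A, A], [B], [A]]: ops for A are never
// merged across the B op, so the original sequence order is kept.
interface AddressedOp {
	address: string; // target data store id / DDS channel id
	contents: unknown;
}

function bunchContiguousOps(messages: readonly AddressedOp[]): AddressedOp[][] {
	const bunches: AddressedOp[][] = [];
	let current: AddressedOp[] = [];
	for (const message of messages) {
		// Start a new bunch whenever the target address changes.
		if (current.length > 0 && current[current.length - 1].address !== message.address) {
			bunches.push(current);
			current = [];
		}
		current.push(message);
	}
	if (current.length > 0) {
		bunches.push(current);
	}
	return bunches;
}
```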
A couple of behavior changes to note:
1. Op events - An implication of this change is that the timing of "op" events emitted by the container runtime and the data store runtime will change. Currently, these layers emit an "op" event immediately after an op is processed. With this change, an upper layer will only know when a bunch has been processed by a lower layer. So, it will emit "op" events for the individual ops in the bunch after the entire bunch is processed.
From my understanding, this should be fine because we do not provide any guarantee that the "op" event will be emitted immediately after an op is processed. These events will be emitted in order of op processing and (sometimes) after the op is processed.
Take delta manager / container runtime as an example. Delta manager sends an op for processing to container runtime and emits the "op" event. However, container runtime may choose to not process these ops immediately but save them until an entire batch is received. This change was made but was reverted due to some concerns not related to the topic discussed here - Process ops by batches in remoteMessageProcessor and pendingStateManager #21785.
The change here is similar to the above behavior, where an upper layer doesn't know and shouldn't care what lower layers do with ops.
2. `metadata` property on message - With this PR, the `metadata` property is removed from a message before it's sent to data stores and DDSes. This is because we now send one common message (the grouped op) and an array of contents. Individual messages within a grouped op have batch begin and end metadata, but they are only added by the runtime to keep them like old batch messages. The data store and DDS don't care about them, so removing them should be fine. AB#20123