Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ContainerRuntime: Process incoming batches op-by-op instead of waiting for the whole batch #22508

Merged
merged 5 commits into from
Sep 16, 2024

Conversation

markfields
Copy link
Member

@markfields markfields commented Sep 13, 2024

Description

Fixes AB#15221

Why

We are concerned that holding batch messages in ContainerRuntime even while DeltaManager advances its tracked sequence numbers through the batch could have unintended consequences. So this PR restores the old behavior of processing each message in a batch one-by-one, rather than holding until the whole batch arrives.

Note that there's no change in behavior here for Grouped Batches.

How

PR #21785 switched the RemoteMessageProcessor from returning ungrouped batch ops as they arrived, to holding them and finally returning the whole batch once the last arrived. The downstream code was also updated to take whole batches, whereas before it would take individual messages and use the batch metadata to detect batch start/end.

Too many other changes were made after that PR to straight revert it. Logic was added throughout CR and PSM that looks at info about that batch which is found on the first message in the batch. So we can reverse the change and process one-at-a-time, but we need a way to carry around that "batch start info" with the first message in the batch.

So we are now modeling the result that RMP yields as one of three cases:

  • A full batch of messages (could be from a single-message batch or a Grouped Batch)
  • The first message of a multi-message batch
  • The next message in a multi-message batch

The first two cases include the "batch start info" needed for the recent Offline work. The third case just indicates whether it's the last message or not.

#22501 added some of the necessary structure, introducing the type for "batch start info" and updating the downstream code to use that instead of reading it off the old "Inbound Batch" type. This PR now adds those other two cases to the RMP return type and handles processing them throughout CR and PSM.

Reviewer Guidance

Planning to port back to 2.2

@github-actions github-actions bot added area: runtime Runtime related issues base: main PRs targeted against main branch labels Sep 13, 2024
@msfluid-bot
Copy link
Collaborator

msfluid-bot commented Sep 14, 2024

@fluid-example/bundle-size-tests: +1.32 KB
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 460.31 KB 460.61 KB +311 Bytes
azureClient.js 558.24 KB 558.56 KB +325 Bytes
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 261.07 KB 261.35 KB +290 Bytes
fluidFramework.js 401.54 KB 401.55 KB +14 Bytes
loader.js 134.19 KB 134.2 KB +14 Bytes
map.js 42.43 KB 42.44 KB +7 Bytes
matrix.js 146.58 KB 146.58 KB +7 Bytes
odspClient.js 525.59 KB 525.91 KB +325 Bytes
odspDriver.js 97.8 KB 97.82 KB +21 Bytes
odspPrefetchSnapshot.js 42.76 KB 42.78 KB +14 Bytes
sharedString.js 163.3 KB 163.31 KB +7 Bytes
sharedTree.js 392 KB 392.01 KB +7 Bytes
Total Size 3.3 MB 3.3 MB +1.32 KB

Baseline commit: da0c9fa

Generated by 🚫 dangerJS against 6360ecf

@markfields markfields marked this pull request as ready for review September 14, 2024 02:10
@markfields markfields requested a review from a team as a code owner September 14, 2024 02:19
Copy link
Contributor

@alexvy86 alexvy86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changeset looks good.

Copy link
Contributor

@dannimad dannimad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me although I get a strong need to make the op processing clearer. We can consider a refactor in a subsequent PR

@markfields markfields merged commit 709f085 into microsoft:main Sep 16, 2024
34 checks passed
@markfields markfields deleted the o3/op-by-op-go branch September 16, 2024 17:50
markfields added a commit to markfields/FluidFramework that referenced this pull request Sep 26, 2024
ContainerRuntime: Process incoming batches op-by-op instead of waiting for the whole batch (microsoft#22508)

We are concerned that holding batch messages in ContainerRuntime even
while DeltaManager advances its tracked sequence numbers through the
batch could have unintended consequences. So this PR restores the old
behavior of processing each message in a batch one-by-one, rather than
holding until the whole batch arrives.

Note that there's no change in behavior here for Grouped Batches.

PR microsoft#21785 switched the RemoteMessageProcessor from returning ungrouped
batch ops as they arrived, to holding them and finally returning the
whole batch once the last arrived. The downstream code was also updated
to take whole batches, whereas before it would take individual messages
and use the batch metadata to detect batch start/end.

Too many other changes were made after that PR to straight revert it.
Logic was added throughout CR and PSM that looks at info about that
batch which is found on the first message in the batch. So we can
reverse the change and process one-at-a-time, but we need a way to carry
around that "batch start info" with the first message in the batch.

So we are now modeling the result that RMP yields as one of three cases:

- A full batch of messages (could be from a single-message batch or a
Grouped Batch)
- The first message of a multi-message batch
- The next message in a multi-message batch

The first two cases include the "batch start info" needed for the recent
Offline work. The third case just indicates whether it's the last
message or not.

"batch start info" and updating the downstream code to use that instead
of reading it off the old "Inbound Batch" type. This PR now adds those
other two cases to the RMP return type and handles processing them
throughout CR and PSM.
frankmueller-msft pushed a commit that referenced this pull request Oct 1, 2024
…tches op-by-op instead of waiting for the whole batch) (#22654)

## Description

Porting the following commits from `main` to `release/client/2.2`:

* ce0a14c (Offline: Model empty batches
differently, exposing the empty Grouped Batch message to more of the
system)
* e024b86 (ContainerRuntime: Refactor
batch processing code to support either op-by-op or batch-all-at-once
semantics)
* 709f085 (ContainerRuntime: Process
incoming batches op-by-op instead of waiting for the whole batch)

## Reviewer Guidance

There was a ton of churn between 2.2 and 2.3 in this code (subset of
[these changes to container-runtime
dir](https://github.com/microsoft/FluidFramework/commits/release/client/2.3/packages/runtime/container-runtime?since=2024-08-17)),
so the cherry-pick turned into more like a manual re-implementing of the
change. In some ways it's a smaller diff compared to what needed to
happen in `main`.

So please review carefully as you would an original PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: runtime Runtime related issues base: main PRs targeted against main branch changeset-present
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants