-
Notifications
You must be signed in to change notification settings - Fork 536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Container Attach Retriable for Non-Runtime Failures #19246
Make Container Attach Retriable for Non-Runtime Failures #19246
Conversation
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
…to make-attach-retyable
…phy/FluidFramework-1 into make-attach-retyable
Co-authored-by: Matt Rakow <[email protected]>
…phy/FluidFramework-1 into make-attach-retyable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and I don't see any outstanding comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little nervous about expanding the capabilities of attach (adding retriability) given that it's a bit convoluted already, but this seems like a fine approach.
One scary comment to consider, rest are more minor.
Added some new tests in #19246 that are failing CI due to the increased the compat matrix. This moves them to the reduced compat section of the file to prevent failures. Co-authored-by: Tony Murphy <[email protected]>
…9246) ### Background This change is in preparation for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it. AB#5502 ### Overview In order to make attachment retriable I've made explicit types that represent each stage of the attachment state machine and called it `AttachmentData`. The container tracks the current attachment data, gets the `AttachState`, and gets the base snapshot information in the case of offline for it. The biggest changes are to the `attach` method itself. Now that is retriable, we had to remove the `_attachStarted` member which gated the call to a single instance. I've replaced that with the `runSingle` helper method which ensure only one attach promise is running at a time and ensure that all the parameters for all parallel watchers of that promise used the same arguments. I've also encapsulated the management of the state machine in a new method `runRetriableAttachProcess `. The primary reason for this is testability, as the new method is easily unit testable, and I've added a number of unit test to validate its behavior. testability was achieved by limiting the number of dependencies the method needed, so it uses callbacks to perform the actual operations necessary to attach the container. The split also helped to segment the error handling for the container. Specifically, only the callbacks that engage with the runtime will close the container on failure, as if we can't create a summary, or propagate an attachment state, we can't safely retry. The container creates the callbacks for `runRetriableAttachProcess ` as follows: - `setAttachmentData` manages the attachment data member, which is exposed via AttachState, and performs eventing and runtime callbacks. - `createAttachmentSummary` creates the summary from the runtime, add the protocol tree, and the closes the container if any of that fails - `createOrGetStorageService` resolves the request, calls create container, and sets up the containers document services. it only returns the storage service, as that is all that might be needed to upload blobs. Lastly the container's attach method configures the delta connection, and logs. As was previously the case, we still error if attach is called for an invalid state, and we still limit serialize calls to only being supported in a detached state. AB#5911 --------- Co-authored-by: Tony Murphy <[email protected]> Co-authored-by: jzaffiro <[email protected]> Co-authored-by: Matt Rakow <[email protected]>
…oft#19402) Added some new tests in microsoft#19246 that are failing CI due to the increased the compat matrix. This moves them to the reduced compat section of the file to prevent failures. Co-authored-by: Tony Murphy <[email protected]>
…19893) ## Background This change is the final change for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it. [AB#5502](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/5502) ## Overview This change enables calling container.serialize while a container is in the attaching state. It is the culmination of many smaller changes that have gone in to enable this change. - #18829 - #19400 - #19246 - #19517 - #19518 - #19590 - #19634 - #19802 --------- Co-authored-by: Tony Murphy <[email protected]> Co-authored-by: Daniel Madrid <[email protected]> Co-authored-by: jzaffiro <[email protected]> Co-authored-by: Matt Rakow <[email protected]>
Background
This change is in preparation for allowing consumers of fluid to call serialize on a container while a container is in an attaching state, which will include a failed attach where the failure didn't cause the container to close. This will aid in the case of offline, network, or server outage by allowing the application to capture the un-attached state of the container, preserve it locally, and later create a new file from it.
AB#5502
Overview
In order to make attachment retriable I've made explicit types that represent each stage of the attachment state machine and called it
AttachmentData
. The container tracks the current attachment data, gets theAttachState
, and gets the base snapshot information in the case of offline for it.The biggest changes are to the
attach
method itself.Now that is retriable, we had to remove the
_attachStarted
member which gated the call to a single instance. I've replaced that with therunSingle
helper method which ensure only one attach promise is running at a time and ensure that all the parameters for all parallel watchers of that promise used the same arguments.I've also encapsulated the management of the state machine in a new method
runRetriableAttachProcess
. The primary reason for this is testability, as the new method is easily unit testable, and I've added a number of unit test to validate its behavior. testability was achieved by limiting the number of dependencies the method needed, so it uses callbacks to perform the actual operations necessary to attach the container.The split also helped to segment the error handling for the container. Specifically, only the callbacks that engage with the runtime will close the container on failure, as if we can't create a summary, or propagate an attachment state, we can't safely retry.
The container creates the callbacks for
runRetriableAttachProcess
as follows:setAttachmentData
manages the attachment data member, which is exposed via AttachState, and performs eventing and runtime callbacks.createAttachmentSummary
creates the summary from the runtime, add the protocol tree, and the closes the container if any of that failscreateOrGetStorageService
resolves the request, calls create container, and sets up the containers document services. it only returns the storage service, as that is all that might be needed to upload blobs.Lastly the container's attach method configures the delta connection, and logs. As was previously the case, we still error if attach is called for an invalid state, and we still limit serialize calls to only being supported in a detached state.
AB#5911