Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server Changes in Fluid Server Cluster Discovery Feature #9685

Merged
merged 36 commits into from
Apr 8, 2022

Conversation

tianzhu007
Copy link
Contributor

@tianzhu007 tianzhu007 commented Mar 30, 2022

To implement server discovery, we need to add the following changes to the server side:

  1. New interface (ISession): Used to store the Fluid server cluster’s URLs and session alive flag. This session information will be returned by the discovery call in create session and join session flow.
  2. Database changes:
    a. Add the session information to the document metadata table on container creation.
    b. When the session ends, we will mark the isSessionAlive flag as false in session information in the deli lambda.
  3. New API to get session information: A GET API called during the join flow to return the correct session information to the client. This api will return the session information if there is an existing session and will create a session and update the session information and then return the session details if there is no alive session.
  4. We need to disable and update the cache, the first time the client creates the new session. Otherwise, the client might read the outdated summary from a cache in the old cluster. For example:
       - Client A opens a document in the US, generates some ops and closes the session. The final summary with the sequence # 20 is cached.
       - Client B then opens the same document and creates a session in the EU, generates some ops, and closes the session. The final summary with the sequence # 30 is cached in EU.
       - Client A reopens the document from the US and a new session is created. If we do not disable the cache, client A reads the summary with the summary sequence number of 20 which would be incorrect. Therefore, we need to let client A read the summary directly from the storage, instead of the cache, and update the cache with the fetched summary (sequence number 30).

@tianzhu007 tianzhu007 requested review from a team as code owners March 30, 2022 22:45
@tianzhu007 tianzhu007 changed the title Server Changes for Enable Routing for FluidFramework Server Changes for Enabling Routing for FluidFramework Mar 30, 2022
@github-actions github-actions bot added area: server Server related issues (routerlicious) public api change Changes to a public API labels Mar 30, 2022
@tianzhu007 tianzhu007 changed the title Server Changes for Enabling Routing for FluidFramework Server Changes for Enabling Routing Feature for FluidFramework Mar 30, 2022
@github-actions github-actions bot added the base: main PRs targeted against main branch label Mar 30, 2022
@tianzhu007 tianzhu007 marked this pull request as draft March 30, 2022 22:53
@tianzhu007 tianzhu007 changed the title Server Changes for Enabling Routing Feature for FluidFramework Server Changes for Enabling Discovery Feature for FluidFramework Mar 30, 2022
@tianzhu007 tianzhu007 linked an issue Mar 30, 2022 that may be closed by this pull request
@tianzhu007 tianzhu007 changed the title Server Changes for Enabling Discovery Feature for FluidFramework Server Changes in Fluid Server Cluster Discovery Feature Mar 31, 2022
@tianzhu007 tianzhu007 requested a review from a team as a code owner April 5, 2022 02:46
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Apr 5, 2022
@github-actions github-actions bot removed the dependencies Pull requests that update a dependency file label Apr 5, 2022
Copy link
Contributor

@hedasilv hedasilv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tianzhu007, thanks for working on this PR! I just left a few follow-up comments, the most important one related to our discussion regarding caching the summary in historian.

Also, curious: what was your testing strategy after you pushed the changes to address comments? Just want to make sure we test this thoroughly as this is an important change.

Finally, it would be good to get @pradeepvairamani and/or @znewton sign off besides mine before merging just in case I missed something 😄

@tianzhu007 tianzhu007 closed this Apr 7, 2022
@tianzhu007 tianzhu007 closed this Apr 7, 2022
@tianzhu007 tianzhu007 reopened this Apr 7, 2022
@tianzhu007 tianzhu007 reopened this Apr 7, 2022
@tianzhu007
Copy link
Contributor Author

tianzhu007 commented Apr 7, 2022

Hey @tianzhu007, thanks for working on this PR! I just left a few follow-up comments, the most important one related to our discussion regarding caching the summary in historian.

Also, curious: what was your testing strategy after you pushed the changes to address comments? Just want to make sure we test this thoroughly as this is an important change.

Finally, it would be good to get @pradeepvairamani and/or @znewton sign off besides mine before merging just in case I missed something 😄

Thanks for reviewing it. :)
I used the driver with/without the discovery feature test the server-side change locally.
Afterward, I will deploy both r11s, and historian images to one of the dev clusters, and test it by both drivers with/without the discovery feature.

@tianzhu007 tianzhu007 closed this Apr 8, 2022
@tianzhu007 tianzhu007 reopened this Apr 8, 2022
@tianzhu007 tianzhu007 merged commit 86e4ed7 into microsoft:main Apr 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: server Server related issues (routerlicious) base: main PRs targeted against main branch public api change Changes to a public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fluid Server Discovery
5 participants