-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(api-graphql): Data messages should maintain the keep alive status #14164
Conversation
@@ -147,6 +147,10 @@ describe('AWSAppSyncRealTimeProvider', () => { | |||
Object.defineProperty(constants, 'RECONNECT_DELAY', { | |||
value: 100, | |||
}); | |||
// Reduce the keep alive heartbeat to 10ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing we always reduce the heartbeat to 10ms, but only reduce the alert and failure times in the tests that are exercising those event conditions.
await fakeWebSocketInterface?.waitUntilConnectionStateIn([ | ||
CS.Connected, | ||
]); | ||
await fakeWebSocketInterface?.waitUntilConnectionStateIn([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation changed to always access the constants directly where before they where saved in private/instance variables, so these events need to be scoped inside of the constant mocking now.
packages/api-graphql/src/Providers/AWSWebSocketProvider/index.ts
Outdated
Show resolved
Hide resolved
} | ||
|
||
// Recognize we are disconnected if we haven't seen messages in the keep alive timeout period | ||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check because of unfamiliarity but are we expecting that we might record KA/KA missed above and then disconnect here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reconnect change introduced multiple chatty connection state observables, like keep alive. Rather than adding logic to track current state and check the state to avoid sending a duplicate state, the states observable is pipe'ed through a filter that removes duplicates before they are surfaced to the user.
In this instance, we could check is the connection is times out and return. I can't think of a case where the disconnect would occur on the same check as the first keep alive alert. Its possible that there are suspense/recovery patterns that don't break the web socket that I'm not aware of.
Since KA missed and disconnect are both checks against time, I wouldn't expect KA -> Disconnect logically, but if the timeout constants where changed, I accept that this outcome is possible given how the code is structured right now.
packages/api-graphql/src/Providers/AWSWebSocketProvider/index.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Description of changes
Problem: In low bandwidth or high js thread contention cases, the AppSync websocket can get backed up. If it gets backed up to the point where 1 minute of data can't be processed in 5 minutes, then no keep alive messages arrive which leads the client to think that the connection is dead, prompting disconnect.
Solution: Any receipt of data should count as a keep alive message.
Next Problem: We receive substantial more data than keep alive messages and using the existing keep alive maintenance logic with each fresh piece of data is expensive. In testing for 500,000 messages received the processing time goes from ~2100ms up to ~3000ms by adding this logic.
Next Solution: Revise the keep alive maintenance logic to capture a timestamp, then have a heartbeat interval that reviews the keep alive alert and disconnect conditions every 5 seconds. As a result, the processing time of maintaining the keep alive heartbeat for each receipt of data for 500,000 messages is ~2150ms
Issue #, if available
Description of how you validated changes
A unit test was added to validate the change. Tested failing without the logic update, passing with the data maintenance logic.
I will be performing manual app testing on Monday 1/27.
Checklist
yarn test
passesChecklist for repo maintainers
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.