Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add connection freshness check #390

Merged
merged 26 commits into from
May 18, 2018

Conversation

funkyboy
Copy link
Contributor

@funkyboy funkyboy commented May 9, 2018

Fix #358

  • Add connection freshness check
  • Test reconnection after ttl + idleTime has passed
  • Test reconnection before ttl + idleTime has passed
  • Test channels attached during the first connection are correctly reattached after reconnecting when ttl + idleTime has passed
  • Include heartbeats in lastActivityTime
  • Fix failing channels_are_reattached_after_reconnecting_when_statettl_plus_idleinterval_has_passed. Reason: channels stay in an ATTACHED state even when the connection freshness check disposes of the old connection and creates a new one.

@funkyboy funkyboy changed the title WIP: Add connection freshness check Add connection freshness check May 10, 2018
@funkyboy funkyboy requested a review from paddybyers May 10, 2018 14:23
@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from 11d348f to db571df Compare May 10, 2018 14:29
@funkyboy funkyboy force-pushed the add-connectionStateTtl branch from bd36890 to 56c30f2 Compare May 10, 2018 14:57
@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from db571df to c75d55e Compare May 10, 2018 17:44
@funkyboy
Copy link
Contributor Author

Rebased on add-connectionStateTtl

Copy link
Member

@paddybyers paddybyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@Override
public void onConnectionStateChanged(ConnectionStateChange state) {
try {
Field connectionStateField = ably.connection.connectionManager.getClass().getDeclaredField("connectionStateTtl");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could consider having a helper for this? Maybe it doesn't really simplify

* We want it to stay into a disconnected state long enough
* so that the connection becomes stale.
*/
ably.connection.connectionManager.wait(waitInDisconnectedState);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what the aim of this wait() is for. Won't this code be blocking the connectionmanager thread? Was that the intention?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that was the intention i.e. the time required for connection to become stale will pass, even if blocked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paddybyers What @mattheworiordan is correct. Every attempt to go into a disconnected state using .requestState(ConnectionState.disconnected) will try to reconnect immediately. We want the connection manager to stay put in the disconnected state for a bit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every attempt to go into a disconnected state using .requestState(ConnectionState.disconnected) will try to reconnect immediately.

This is no longer true because I added the private method to suppress retries. I don't like blocking the connectionmanager thread explicitly here, because it stops any other processing from happening by the connectionmanager, and so we can no longer validly draw any conclusions about what its behaviour will be in a real situation. I think this wait() should be removed, and instead this test updated to use the new method.

Copy link
Member

@mattheworiordan mattheworiordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll review this in more detail later, but I see some issues here I am afraid

@@ -205,6 +205,7 @@ public String getHost() {
*********************/

public void connect() {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why white space here?

@@ -451,6 +452,7 @@ public void onMessage(ITransport transport, ProtocolMessage message) throws Ably
if (Log.level <= Log.VERBOSE)
Log.v(TAG, "onMessage(): " + message.action + ": " + new String(ProtocolSerializer.writeJSON(message)));
try {
lastActivity = System.currentTimeMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about WebSocket heartbeats? I believe this code will only catch protocol messages, yet WebSocket heartbeats are considered valid for freshness activity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't allow for WebSocket heartbeats we're now going to incorrectly move people into the suspended state now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact there already is support for tracking activity at the transport level, which includes heartbeats:
https://github.com/ably/ably-java/blob/develop/lib/src/main/java/io/ably/lib/transport/WebSocketTransport.java#L188

The staleness check needs to be based on this instead of introducing a new mechanism here.

private boolean checkConnectionStale() {
if(lastActivity == 0) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indention issues.

private void handleStateChange(StateIndication stateChange) {
/* if we have had a disconnected state indication
* from the transport then we have to decide whether
* to transition to closed, disconnected to suspended depending
* on when we last had a successful connection */
if(stateChange.state == ConnectionState.disconnected) {
if(checkConnectionStale()) {
requestState(ConnectionState.suspended);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not add a Log.v(TAG) statement here for debugging purposes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful in future to know why a connection moved suddenly to suspended

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about this change the less I like it. The spec changes for stale connection state are at https://github.com/ably/docs/pull/331/files and make no mention of this change, so is it really sensible now to change this? As far as I can tell from the spec at https://docs.ably.io/client-lib-development-guide/features/#RTN14e, we state a connection moves to the SUSPENDED state only following a connection failure attempt, yet now we are moving to the suspended state as a result of DISCONNECTED. Additionally, what side effects may there be now that are unexpected? For example, previously when a connection first became disconnected it would immediately retry to connect, so if a device went to sleep, as soon as it wakes up, it will reconnect following it moving to the DISCONNECTED state. Now, we can't be sure that that happens because moving to the SUSPENDED state will result in the connection only being tried every 30 seconds. So have we now with this change potentially made our clients 30s less responsive by default when a device comes out of sleep? I would really like to follow why we are explicitly making this change here in this PR when as far as I can tell, this change has nothing to do with the spec change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://www.ably.io/documentation/realtime/connection#connection-state-explained btw. In the examples we provide in the sequence of events, we never show a state change from disconnected -> suspended either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@funkyboy FYI @paddybyers explained to me that this on state change handler is for the transport, not the connection, so my comments above are incorrect. Sorry, I misunderstood

if(intervalSinceLastActivity > (maxIdleInterval + connectionStateTtl)) {
/* RTN15g1, RTN15g2 Force a new connection if the previous one is stale */
if(connection.id != null) {
connection.id = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a log message here about why the connection state is explicitly being cleared

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/* Since newTtl + newIdleInterval has passed we expect the connection to go into a suspended state */
connectionWaiter.waitFor(ConnectionState.suspended);

ably.connect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation issues

@Override
public void onChannelStateChanged(ChannelStateChange stateChange) {
System.out.println("Channel reattached after a new connection has been established");
ably.close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we do know the test ends at this point? Surely we need a clause here to confirm that this test has finished i.e. could this test not complete before this, or am I not reading this correctly?

Copy link
Member

@mattheworiordan mattheworiordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comments, I don't think this is ready

@@ -669,12 +671,34 @@ private void handleStateRequest() {
requestedState = null;
}

private boolean checkConnectionStale() {
if(lastActivity == 0) {
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not true, as if the last activity is not recorded, I assume the connection is stale / not initialised? In fact, what does the boolean result even represent. This method is called checkConnectionStale, yet returns a boolean value. So does that mean connection is stale if true, if so, should this not be isConnectionStale. And if it is that, it should not have side effects. This method IMHO is doing two things and it should not. Is it an accessor or a modifier?

private void handleStateChange(StateIndication stateChange) {
/* if we have had a disconnected state indication
* from the transport then we have to decide whether
* to transition to closed, disconnected to suspended depending
* on when we last had a successful connection */
if(stateChange.state == ConnectionState.disconnected) {
if(checkConnectionStale()) {
requestState(ConnectionState.suspended);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about this change the less I like it. The spec changes for stale connection state are at https://github.com/ably/docs/pull/331/files and make no mention of this change, so is it really sensible now to change this? As far as I can tell from the spec at https://docs.ably.io/client-lib-development-guide/features/#RTN14e, we state a connection moves to the SUSPENDED state only following a connection failure attempt, yet now we are moving to the suspended state as a result of DISCONNECTED. Additionally, what side effects may there be now that are unexpected? For example, previously when a connection first became disconnected it would immediately retry to connect, so if a device went to sleep, as soon as it wakes up, it will reconnect following it moving to the DISCONNECTED state. Now, we can't be sure that that happens because moving to the SUSPENDED state will result in the connection only being tried every 30 seconds. So have we now with this change potentially made our clients 30s less responsive by default when a device comes out of sleep? I would really like to follow why we are explicitly making this change here in this PR when as far as I can tell, this change has nothing to do with the spec change.

private void handleStateChange(StateIndication stateChange) {
/* if we have had a disconnected state indication
* from the transport then we have to decide whether
* to transition to closed, disconnected to suspended depending
* on when we last had a successful connection */
if(stateChange.state == ConnectionState.disconnected) {
if(checkConnectionStale()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears to be typos in the comments above.

Copy link
Contributor Author

@funkyboy funkyboy May 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattheworiordan @paddybyers This is quite a rare case. Might happen when the app is open, the connection is active but nothing is being sent for quite a while. I don't see it harmful but I agree with Matt that it's beyond the scope of this spec change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I misunderstood that stateChange was a connection state change, if it's a transport state change I am in fact OK with this. Sorry for the noise

private void handleStateChange(StateIndication stateChange) {
/* if we have had a disconnected state indication
* from the transport then we have to decide whether
* to transition to closed, disconnected to suspended depending
* on when we last had a successful connection */
if(stateChange.state == ConnectionState.disconnected) {
if(checkConnectionStale()) {
requestState(ConnectionState.suspended);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://www.ably.io/documentation/realtime/connection#connection-state-explained btw. In the examples we provide in the sequence of events, we never show a state change from disconnected -> suspended either.

ably.connection.connectionManager.requestState(ConnectionState.disconnected);
connectionWaiter.waitFor(ConnectionState.disconnected);

ably.connection.once(ConnectionEvent.connected, new ConnectionStateListener() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation issues

}
});

connectionWaiter.waitFor(ConnectionState.connected);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be wait for closed so that you can be 100% confident the onConnectionStateChanged ran above? Otherwise, this is racy and the test could complete before those assertions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattheworiordan given that we have waited for disconnected before I think this is always gonna wait for connected and thus trigger the handler. That said, waiting for closed can't harm :) 6892ecc

});
channel.attach();

/* Before disconnecting wait that newTtl + newIdleInterval has passed */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation issues again

assertEquals("Channel is attached", stateChange.current, ChannelState.attached);
}
});
channel.attach();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not explicitly awaiting the channel attached, and you cannot be sure the onChannelStateChanged handler ran with this code path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ably.connection.connectionManager.requestState(ConnectionState.disconnected);

/* Since newTtl + newIdleInterval has passed we expect the connection to go into a suspended state */
connectionWaiter.waitFor(ConnectionState.suspended);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I agree with this, as above... I think the connection state should be lost, but there is no reason it should become suspended according to the spec.

}
});
}
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not waiting for those to close, this entire section could easily never be run as part of this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paddybyers
Copy link
Member

The more I think about this change the less I like it.

I think you're misunderstanding what this is doing. This is the check, that already existed, which determines which state to enter (DISCONNECTED vs SUSPENDED) when there is an indication of disconnection (which includes a failure to connect) from the transport. It's not a transition of the connection state from DISCONNECTED to SUSPENDED.

The only change here that is being made in this PR is that the staleness check is being used at the time of a new connection attempt, in addition to at the time of a disconnection or failed connection attempt.

@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from e973176 to 3a868ee Compare May 15, 2018 17:30
paddybyers and others added 7 commits May 16, 2018 12:09
…attached channels do not enter suspended when handling a discontinuity; add a private method on ConnectionManager to support tests by dropping a connection and not automatically retrying it
… ensure that the resulting (clean) connection is handled as a resume failure, triggering reattach for discontinuity of attached channels
@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from d964888 to 0521e14 Compare May 17, 2018 09:25
@funkyboy
Copy link
Contributor Author

@paddybyers @mattheworiordan ready for another look

@funkyboy
Copy link
Contributor Author

Note that this will be merged into https://github.com/ably/ably-java/tree/add-connectionStateTtl, which is part of another ongoing PR #389

Copy link
Member

@mattheworiordan mattheworiordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this looks good. However, do you know why the build failed?

connectionWaiter.waitFor(ConnectionState.disconnected);
assertEquals("Disconnected state was not reached", ConnectionState.disconnected, ably.connection.state);
System.out.println("=============================== DISCONNECTED ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these for lines for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That were meant for debugging but have been deleted in this commit 6ab31f7#diff-4cfebe2518c34c9f3975cde19f3ced08L598

@@ -603,7 +618,10 @@ public void onChannelStateChanged(ChannelStateChange stateChange) {
ably.close();
}
});
(new Helpers.ChannelWaiter(channel)).waitFor(ChannelState.attached);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you please sort out your IDE/editor? It's getting tiring spending time commenting on whitespace I am afraid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this is an old commit and is now fixed.

@funkyboy
Copy link
Contributor Author

@mattheworiordan the build failed because there are still issues with crypto tests. Ongoing PR here. And some issues with history on realtime.

Copy link
Member

@paddybyers paddybyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one change remaining as commented but otherwise LGTM.

@funkyboy
Copy link
Contributor Author

@paddybyers done 5da11f6

@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from 5da11f6 to 810cef4 Compare May 18, 2018 12:26
@funkyboy funkyboy force-pushed the add-connection-freshness-check branch from 810cef4 to b681cd8 Compare May 18, 2018 12:28
@funkyboy funkyboy merged commit c0a3575 into add-connectionStateTtl May 18, 2018
@funkyboy funkyboy deleted the add-connection-freshness-check branch May 18, 2018 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants