-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add connection freshness check #390
Add connection freshness check #390
Conversation
11d348f
to
db571df
Compare
bd36890
to
56c30f2
Compare
Should wait for the connection to be suspended
and extract it into a method
db571df
to
c75d55e
Compare
Rebased on add-connectionStateTtl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
@Override | ||
public void onConnectionStateChanged(ConnectionStateChange state) { | ||
try { | ||
Field connectionStateField = ably.connection.connectionManager.getClass().getDeclaredField("connectionStateTtl"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider having a helper for this? Maybe it doesn't really simplify
* We want it to stay into a disconnected state long enough | ||
* so that the connection becomes stale. | ||
*/ | ||
ably.connection.connectionManager.wait(waitInDisconnectedState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what the aim of this wait()
is for. Won't this code be blocking the connectionmanager thread? Was that the intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that was the intention i.e. the time required for connection to become stale will pass, even if blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paddybyers What @mattheworiordan is correct. Every attempt to go into a disconnected state using .requestState(ConnectionState.disconnected)
will try to reconnect immediately. We want the connection manager to stay put in the disconnected state for a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every attempt to go into a disconnected state using .requestState(ConnectionState.disconnected) will try to reconnect immediately.
This is no longer true because I added the private method to suppress retries. I don't like blocking the connectionmanager thread explicitly here, because it stops any other processing from happening by the connectionmanager, and so we can no longer validly draw any conclusions about what its behaviour will be in a real situation. I think this wait()
should be removed, and instead this test updated to use the new method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll review this in more detail later, but I see some issues here I am afraid
@@ -205,6 +205,7 @@ public String getHost() { | |||
*********************/ | |||
|
|||
public void connect() { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why white space here?
@@ -451,6 +452,7 @@ public void onMessage(ITransport transport, ProtocolMessage message) throws Ably | |||
if (Log.level <= Log.VERBOSE) | |||
Log.v(TAG, "onMessage(): " + message.action + ": " + new String(ProtocolSerializer.writeJSON(message))); | |||
try { | |||
lastActivity = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about WebSocket heartbeats? I believe this code will only catch protocol messages, yet WebSocket heartbeats are considered valid for freshness activity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't allow for WebSocket heartbeats we're now going to incorrectly move people into the suspended state now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact there already is support for tracking activity at the transport level, which includes heartbeats:
https://github.com/ably/ably-java/blob/develop/lib/src/main/java/io/ably/lib/transport/WebSocketTransport.java#L188
The staleness check needs to be based on this instead of introducing a new mechanism here.
private boolean checkConnectionStale() { | ||
if(lastActivity == 0) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indention issues.
private void handleStateChange(StateIndication stateChange) { | ||
/* if we have had a disconnected state indication | ||
* from the transport then we have to decide whether | ||
* to transition to closed, disconnected to suspended depending | ||
* on when we last had a successful connection */ | ||
if(stateChange.state == ConnectionState.disconnected) { | ||
if(checkConnectionStale()) { | ||
requestState(ConnectionState.suspended); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not add a Log.v(TAG)
statement here for debugging purposes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful in future to know why a connection moved suddenly to suspended
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about this change the less I like it. The spec changes for stale connection state are at https://github.com/ably/docs/pull/331/files and make no mention of this change, so is it really sensible now to change this? As far as I can tell from the spec at https://docs.ably.io/client-lib-development-guide/features/#RTN14e, we state a connection moves to the SUSPENDED
state only following a connection failure attempt, yet now we are moving to the suspended state as a result of DISCONNECTED
. Additionally, what side effects may there be now that are unexpected? For example, previously when a connection first became disconnected it would immediately retry to connect, so if a device went to sleep, as soon as it wakes up, it will reconnect following it moving to the DISCONNECTED
state. Now, we can't be sure that that happens because moving to the SUSPENDED
state will result in the connection only being tried every 30 seconds. So have we now with this change potentially made our clients 30s less responsive by default when a device comes out of sleep? I would really like to follow why we are explicitly making this change here in this PR when as far as I can tell, this change has nothing to do with the spec change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://www.ably.io/documentation/realtime/connection#connection-state-explained btw. In the examples we provide in the sequence of events, we never show a state change from disconnected -> suspended either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@funkyboy FYI @paddybyers explained to me that this on state change handler is for the transport, not the connection, so my comments above are incorrect. Sorry, I misunderstood
if(intervalSinceLastActivity > (maxIdleInterval + connectionStateTtl)) { | ||
/* RTN15g1, RTN15g2 Force a new connection if the previous one is stale */ | ||
if(connection.id != null) { | ||
connection.id = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a log message here about why the connection state is explicitly being cleared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan done 0a1c622
/* Since newTtl + newIdleInterval has passed we expect the connection to go into a suspended state */ | ||
connectionWaiter.waitFor(ConnectionState.suspended); | ||
|
||
ably.connect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation issues
@Override | ||
public void onChannelStateChanged(ChannelStateChange stateChange) { | ||
System.out.println("Channel reattached after a new connection has been established"); | ||
ably.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How we do know the test ends at this point? Surely we need a clause here to confirm that this test has finished i.e. could this test not complete before this, or am I not reading this correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comments, I don't think this is ready
@@ -669,12 +671,34 @@ private void handleStateRequest() { | |||
requestedState = null; | |||
} | |||
|
|||
private boolean checkConnectionStale() { | |||
if(lastActivity == 0) { | |||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not true, as if the last activity is not recorded, I assume the connection is stale / not initialised? In fact, what does the boolean result even represent. This method is called checkConnectionStale
, yet returns a boolean value. So does that mean connection is stale if true, if so, should this not be isConnectionStale
. And if it is that, it should not have side effects. This method IMHO is doing two things and it should not. Is it an accessor or a modifier?
private void handleStateChange(StateIndication stateChange) { | ||
/* if we have had a disconnected state indication | ||
* from the transport then we have to decide whether | ||
* to transition to closed, disconnected to suspended depending | ||
* on when we last had a successful connection */ | ||
if(stateChange.state == ConnectionState.disconnected) { | ||
if(checkConnectionStale()) { | ||
requestState(ConnectionState.suspended); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about this change the less I like it. The spec changes for stale connection state are at https://github.com/ably/docs/pull/331/files and make no mention of this change, so is it really sensible now to change this? As far as I can tell from the spec at https://docs.ably.io/client-lib-development-guide/features/#RTN14e, we state a connection moves to the SUSPENDED
state only following a connection failure attempt, yet now we are moving to the suspended state as a result of DISCONNECTED
. Additionally, what side effects may there be now that are unexpected? For example, previously when a connection first became disconnected it would immediately retry to connect, so if a device went to sleep, as soon as it wakes up, it will reconnect following it moving to the DISCONNECTED
state. Now, we can't be sure that that happens because moving to the SUSPENDED
state will result in the connection only being tried every 30 seconds. So have we now with this change potentially made our clients 30s less responsive by default when a device comes out of sleep? I would really like to follow why we are explicitly making this change here in this PR when as far as I can tell, this change has nothing to do with the spec change.
private void handleStateChange(StateIndication stateChange) { | ||
/* if we have had a disconnected state indication | ||
* from the transport then we have to decide whether | ||
* to transition to closed, disconnected to suspended depending | ||
* on when we last had a successful connection */ | ||
if(stateChange.state == ConnectionState.disconnected) { | ||
if(checkConnectionStale()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appears to be typos in the comments above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan @paddybyers This is quite a rare case. Might happen when the app is open, the connection is active but nothing is being sent for quite a while. I don't see it harmful but I agree with Matt that it's beyond the scope of this spec change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I misunderstood that stateChange
was a connection state change, if it's a transport state change I am in fact OK with this. Sorry for the noise
private void handleStateChange(StateIndication stateChange) { | ||
/* if we have had a disconnected state indication | ||
* from the transport then we have to decide whether | ||
* to transition to closed, disconnected to suspended depending | ||
* on when we last had a successful connection */ | ||
if(stateChange.state == ConnectionState.disconnected) { | ||
if(checkConnectionStale()) { | ||
requestState(ConnectionState.suspended); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://www.ably.io/documentation/realtime/connection#connection-state-explained btw. In the examples we provide in the sequence of events, we never show a state change from disconnected -> suspended either.
ably.connection.connectionManager.requestState(ConnectionState.disconnected); | ||
connectionWaiter.waitFor(ConnectionState.disconnected); | ||
|
||
ably.connection.once(ConnectionEvent.connected, new ConnectionStateListener() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation issues
} | ||
}); | ||
|
||
connectionWaiter.waitFor(ConnectionState.connected); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be wait for closed
so that you can be 100% confident the onConnectionStateChanged
ran above? Otherwise, this is racy and the test could complete before those assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan given that we have waited for disconnected
before I think this is always gonna wait for connected and thus trigger the handler. That said, waiting for closed
can't harm :) 6892ecc
}); | ||
channel.attach(); | ||
|
||
/* Before disconnecting wait that newTtl + newIdleInterval has passed */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation issues again
assertEquals("Channel is attached", stateChange.current, ChannelState.attached); | ||
} | ||
}); | ||
channel.attach(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not explicitly awaiting the channel attached, and you cannot be sure the onChannelStateChanged
handler ran with this code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan done 0a14d26
ably.connection.connectionManager.requestState(ConnectionState.disconnected); | ||
|
||
/* Since newTtl + newIdleInterval has passed we expect the connection to go into a suspended state */ | ||
connectionWaiter.waitFor(ConnectionState.suspended); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I agree with this, as above... I think the connection state should be lost, but there is no reason it should become suspended according to the spec.
} | ||
}); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not waiting for those to close, this entire section could easily never be run as part of this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattheworiordan done 77f3cf8
I think you're misunderstanding what this is doing. This is the check, that already existed, which determines which state to enter ( The only change here that is being made in this PR is that the staleness check is being used at the time of a new connection attempt, in addition to at the time of a disconnection or failed connection attempt. |
e973176
to
3a868ee
Compare
…attached channels do not enter suspended when handling a discontinuity; add a private method on ConnectionManager to support tests by dropping a connection and not automatically retrying it
… ensure that the resulting (clean) connection is handled as a resume failure, triggering reattach for discontinuity of attached channels
d964888
to
0521e14
Compare
@paddybyers @mattheworiordan ready for another look |
Note that this will be merged into https://github.com/ably/ably-java/tree/add-connectionStateTtl, which is part of another ongoing PR #389 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT this looks good. However, do you know why the build failed?
connectionWaiter.waitFor(ConnectionState.disconnected); | ||
assertEquals("Disconnected state was not reached", ConnectionState.disconnected, ably.connection.state); | ||
System.out.println("=============================== DISCONNECTED "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these for lines for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That were meant for debugging but have been deleted in this commit 6ab31f7#diff-4cfebe2518c34c9f3975cde19f3ced08L598
@@ -603,7 +618,10 @@ public void onChannelStateChanged(ChannelStateChange stateChange) { | |||
ably.close(); | |||
} | |||
}); | |||
(new Helpers.ChannelWaiter(channel)).waitFor(ChannelState.attached); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you please sort out your IDE/editor? It's getting tiring spending time commenting on whitespace I am afraid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this is an old commit and is now fixed.
@mattheworiordan the build failed because there are still issues with crypto tests. Ongoing PR here. And some issues with history on realtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one change remaining as commented but otherwise LGTM.
@paddybyers done 5da11f6 |
5da11f6
to
810cef4
Compare
810cef4
to
b681cd8
Compare
Fix #358
lastActivityTime
ATTACHED
state even when the connection freshness check disposes of the old connection and creates a new one.