Timed out connection not closed? #24
Comments
Closed events are usually logged at debug level. Did you try setting the log level to 'DEBUG'? Alternatively, you can try overriding the closeLogic method of WebSocketServerConnection to:
|
Yes, I am running at debug level; it never receives Http.ConnectionClosed (or ErrorClosed or PeerClosed, which I'm also listening for). |
I looked into this further: the OS (Amazon Linux in this case) doesn't recognize the connection as timed out on the server side, so it remains in the ESTABLISHED state. I tuned the kernel TCP keepalive parameters much more aggressively (60s, 10s, 4 probes) but didn't have much luck. |
We may need to implement some application-level (here, websocket layer) timeout, like spray-can for Http here: http://spray.io/documentation/1.2.1/spray-can/http-server/
|
Are those timeouts not in effect when using the UHttp extension? Or do they stop applying only once a connection has been upgraded to WebSocket? I'm asking because I was unable to use both extensions at the same time, so I am using the UHttp extension for serving HTTP content as well:
this didn't work:
I also noticed this enhancement in Spray/Akka HTTP, wonder if it will be helpful here at all: |
Those timeouts work under HTTP. When a connection is upgraded to WebSocket, the original HTTP connection (TCP) switches to new IO pipelines, which drop all HTTP event/command processing. So, for
The HTTP binding on 8080 will be affected by those timeout settings; the WebSocket binding will not. |
Correction: I looked at the code of spray.io and spray.can again, and here is a correction to what I wrote above. The pipeline stages of Http are:
versus those of WebSocket:
As the WebSocket pipelines also include the ConnectionTimeouts and PreventHalfClosedConnections stages, the only difference from Http is requestTimeout, which is processed only by Http's stages. That is, WebSocket also honors idleTimeout.
We did a simple test: when the client machine running the browser loses its network link to the server (we unplugged the network cable), the server disconnects automatically after 60 seconds, matching the idleTimeout setting.
The case @nefilim mentioned is different: there the server is pushing messages to the browser client. When the network breaks, the server keeps writing data to the TCP socket, so the connection is not idle and idleTimeout may not fire; Akka IO just fires CommandFailed(Write) events. I'm not sure yet why Akka IO does not detect the broken network and fire one of the XXXClosed events. We'll dig into it.
In the meantime, as a temporary solution, you can capture the CommandFailed events and actively disconnect. To disconnect the socket manually, send a Tcp.Close message to serverConnection from your worker. |
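To make the reasoning above concrete, here is a toy model in Scala of an idle timeout that is reset by any activity. It is only a sketch: `IdleTracker`, `onActivity`, and `isIdle` are hypothetical names, this is not spray's ConnectionTimeouts code, and it assumes (as suggested above) that an attempted write counts as activity even when it later fails.

```scala
// Toy model only: NOT spray's ConnectionTimeouts stage.
// Assumption: every attempted write counts as activity, even if it later fails.
final case class IdleTracker(timeoutMillis: Long, lastActivityMillis: Long) {
  /** Any read or attempted write moves the idle deadline forward. */
  def onActivity(nowMillis: Long): IdleTracker =
    copy(lastActivityMillis = nowMillis)

  /** True once no activity has been seen for at least `timeoutMillis`. */
  def isIdle(nowMillis: Long): Boolean =
    nowMillis - lastActivityMillis >= timeoutMillis
}

object IdleTrackerDemo {
  val timeout = 60000L

  // Case 1: the cable is pulled at t = 0 and nobody writes afterwards.
  val silent = IdleTracker(timeout, lastActivityMillis = 0L)

  // Case 2: the server keeps pushing once per second for two minutes.
  val pushing = (1L to 120L).foldLeft(IdleTracker(timeout, 0L)) {
    (tracker, second) => tracker.onActivity(second * 1000L)
  }
}
```

In this model the silent connection is reported idle at the 60-second mark, while the connection that is written to every second is never reported idle, which matches the difference between the cable-unplug test and the server-push case.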
Thanks for looking. I had the same thought re timeouts: the periodic server push is keeping it alive, hence another heuristic like X failed writes in Y time. I'm wondering if this is how they plan on implementing spray/spray#615.
Unfortunately nothing is received at my actors; I think Spray is swallowing that event:
def baseEventPipeline(tcpConnection: ActorRef): Pipeline[Event] = {
  case x: Tcp.ConnectionClosed ⇒
    log.debug("Connection was {}, awaiting TcpConnection termination...", x)
    context.become {
      case Terminated(`tcpConnection`) ⇒
        log.debug("TcpConnection terminated, stopping")
        context.stop(self)
    }
  case _: Droppable ⇒ // don't warn
  case ev ⇒ log.warning("event pipeline: dropped {}", ev) // <==== SWALLOWED?
}
unless I'm doing something wrong :) I have a few small issues/possible improvements around closing connections, I'll open a pull request. |
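For what it's worth, the "X failed writes in Y time" heuristic mentioned above could be sketched roughly like this in plain Scala. `FailedWriteWindow`, `recordFailure`, and `shouldClose` are hypothetical names invented for illustration and are not part of spray or spray-websocket; the thresholds are placeholders.

```scala
/** Hypothetical helper: counts failed writes inside a sliding time window. */
final case class FailedWriteWindow(
    maxFailures: Int,                   // X: failures tolerated inside the window
    windowMillis: Long,                 // Y: window length in milliseconds
    failures: Vector[Long] = Vector.empty) {

  /** Records a failure at `nowMillis` and forgets failures older than the window. */
  def recordFailure(nowMillis: Long): FailedWriteWindow = {
    val cutoff = nowMillis - windowMillis
    copy(failures = (failures :+ nowMillis).dropWhile(_ <= cutoff))
  }

  /** True once at least `maxFailures` failures fall inside the window. */
  def shouldClose: Boolean = failures.size >= maxFailures
}
```

A worker could keep one such value per connection, call `recordFailure(System.currentTimeMillis())` whenever it sees a failed write, and send Tcp.Close to serverConnection once `shouldClose` is true, as suggested earlier in the thread. Keeping the window immutable makes it easy to hold as actor state without extra synchronization.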
Did you try the newest committed code, at least after ee08024? |
Ah, very nice, thanks! I'm handling FrameCommandFailed appropriately now on my end. One curiosity I've observed: a connection created with Chrome takes a lot longer to time out server side (~2.5 minutes vs ~10 seconds) than a connection created by Safari, 100% repeatable in my scenario. Also, the write on the Chrome-created connection fails with:
while the write on the Safari-created connection fails with:
Both are text frames carrying JSON in exactly the same format. I'm not sure what is going on with the Chrome-created connection; binary encoded perhaps? |
This one is stinging me very reproducibly for Am I right in reading that the only way to increase the timeout settings is to do this?
is there really no other way, like setting these config values
If I am getting timeouts, how can I be sure? I'm not seeing anything except the |
what am I talking about... I'm on the client side, so I should be setting a client side timeout! |
ok, bumping the timeouts doesn't make a difference. I start getting Any ideas what could be going on? |
I'm hijacking your ticket, I'll create a new one with an example of what I'm seeing. |
Running into a strange issue, I'll try to summarize as briefly as possible.
I have two servers, M(onitor) & S(erver). M is receiving data from S, once per second, over a spray-websocket connection.
Browser client connects to M over another web socket connection. As S pushes data to M, it's pushed out to the browser client. If the browser client disappears (I simulate it by quitting my VPN client that browser connects to M over), the data from S fails to be written at M over the web socket connection it had with the browser client.
Browser <---- (websocket/vpn) ---- Monitor <---- (websocket) ---- Server
Spray logs warnings endlessly, and the connection does not appear to get cleaned up:
2014-04-15 18:17:41,701 WARN [ReportingActorSystem-akka.actor.default-dispatcher-10] s.c.s.HttpServerConnection [Slf4jLogger.scala : 71] CommandFailed for Tcp.Write text frame: {"node":"10.0.20.202","up":true,"metrics":{"buildI ...
2014-04-15 18:17:41,704 WARN [ReportingActorSystem-akka.actor.default-dispatcher-10] s.c.s.HttpServerConnection [Slf4jLogger.scala : 71] event pipeline: dropped CommandFailed(Write(ByteString(-127, 126, 0, -69, 123, 34, 110, 111, 100, 101, 34, 58, 34, 49, 48, 46, 48, 46, 50, 48, 46, 50, 48, 50, 34, 44, 34, 117, 112, 34, 58, 116, 114, 117, 101, 44, 34, 109, 101, 116, 114, 105, 99, 115, 34, 58, 123, 34, 98, 117, 105, 108, 100, 73, 110, 102, 111, 34, 58, 123, 34, 99, 111, 109, 112, 111, 110, 101, 110, 116, 78, 97, 109, 101, 34, 58, 34, 119, 111, 114, 107, 101, 114, 34, 44, 34, 99, 111, 109, 112, 111, 110, 101, 110, 116, 86, 101, 114, 115, 105, 111, 110, 34, 58, 34, 48, 46, 49, 46, 50, 45, 83, 78, 65, 80, 83, 72, 79, 84, 34, 44, 34, 98, 117, 105, 108, 100, 84, 105, 109, 101, 34, 58, 34, 84, 104, 117, 32, 65, 112, 114, 32, 49, 48, 32, 49, 57, 58, 50, 49, 58, 52, 56, 32, 80, 68, 84, 32, 50, 48, 49, 52, 34, 44, 34, 117, 112, 116, 105, 109, 101, 77, 105, 108, 108, 105, 115, 34, 58, 50, 54, 55, 57, 55, 57, 48, 53, 57, 125, 125, 125),NoAck(null)))
I don't get any (unhandled) messages at my WebSocketServerConnection (worker) or my web socket server actor (that created the worker), no indication that the connection is unavailable.
Have you witnessed this issue?