Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem connecting to custom namespace #1033

Closed
OneSman7 opened this issue Jun 6, 2018 · 10 comments
Closed

Problem connecting to custom namespace #1033

OneSman7 opened this issue Jun 6, 2018 · 10 comments

Comments

@OneSman7
Copy link
Contributor

OneSman7 commented Jun 6, 2018

I think we have a racing condition here. Sometimes request to connect to custom namespace (40/nsp) is not sent.
We debugged it to the method flushProbeWait() in SocketEngine.
It is called on web socket connect success in doFastUpgrade() and on web socket connect failure in websocketDidDisconnect(error: Error?) (in case we are polling which is always true on start). Inside this method we have this code:

if postWait.count != 0 {
            flushWaitingForPostToWebSocket()
        }

Method flushWaitingForPostToWebSocket() causes packets to be sent by web socket. And it fails if web socket failed to connect. In our case we lose request to connect to custom namespace and do not get connected at all.
I think it will be better to move this "if" from flushProbeWait() directly to doFastUpgrade(), so it will be called only on web socket connect success.

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 6, 2018

Packets are lost if they were in postWait and web socket failed to connect, thus websocketDidDisconnect(error: Error?) is called.

@nuclearace
Copy link
Member

@OneSman7 That sounds plausible. Would you be able to try out your change, and if it looks good, raise a PR?

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 7, 2018

@nuclearace Yes. I will prepare PR.

We are also using forceNew flag, so on each connect a new engine is created.

Before rebooting our server sends engine close message to all clients. But sometimes on server reconnect old SocketEnginePolling object survives and keeps polling receiving server errors, but not reporting error events since it has no client. Thus it hangs in infinite loop until server revives. Since that moment it pings server as usual along with working engine.

I am trying to isolate the issue and want to create PR that fixes both issues.

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 7, 2018

LOG SocketEngine: Got message: 1
LOG SocketManager: Starting reconnect
LOG SocketIOClient{/chat_v1}: Handling event: statusChange with data: [connecting]
LOG SocketEnginePolling: Doing polling GET
LOG SocketIOClient{/chat_v1}: Handling event: reconnect with data: ["1"]
LOG SocketManager: Trying to reconnect
LOG SocketIOClient{/chat_v1}: Handling event: reconnectAttempt with data: [-1]
LOG SocketManager: Adding engine
LOG SocketEngine: Starting engine. Server: 
LOG SocketEngine: Handshaking

There is no "Engine is being released" message after "Adding engine" message :(

@nuclearace
Copy link
Member

Hmm I remember this leak can happen when the URLSessionDelegate (the engine in this case) isn’t cleaned up in the URLSession by calling invalidate. Which I think happens in stopPolling.

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 7, 2018

Exactly! Thank you!

private func addEngine() {
        DefaultSocketLogger.Logger.log("Adding engine", type: SocketManager.logType)

        engine?.engineQueue.sync {
            self.engine?.client = nil
        }

        engine = SocketEngine(client: self, url: socketURL, config: config)
    }

When new engine is created, old one is left without client, but not disconnected. It then receives server error, so checkAndHandleEngineError is called.

private func checkAndHandleEngineError(_ msg: String) {
        do {
            let dict = try msg.toDictionary()
            guard let error = dict["message"] as? String else { return }

            /*
             0: Unknown transport
             1: Unknown sid
             2: Bad handshake request
             3: Bad request
             */
            didError(reason: error)
        } catch {
            client?.engineDidError(reason: "Got unknown error from server \(msg)")
        }
    }

Since no json is received client?.engineDidError() is called. But this engine already has no client. So it continues with polling. Forever :)

If this engine receives server error before it looses client, than client handles it (our code) and disconnects engine.

@nuclearace
Copy link
Member

So that guard block needs to have more logic in it to handle the case where the client is gone, I think.

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 7, 2018

I tried to disconnect old engine explicitly when creating new one and it seems to work. This seems ok since we create new one only on connect/reconnect. What do you think?

private func addEngine() {
        DefaultSocketLogger.Logger.log("Adding engine", type: SocketManager.logType)

        engine?.engineQueue.sync {
            self.engine?.client = nil
            self.engine?.disconnect(reason: "Adding new engine")
        }

        engine = SocketEngine(client: self, url: socketURL, config: config)
    }

@nuclearace
Copy link
Member

I think that is a good idea. 👍

@OneSman7
Copy link
Contributor Author

OneSman7 commented Jun 7, 2018

Great 👍
Created a PR #1034.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants