Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silent connection failure #88

Closed
ForNeVeR opened this issue Oct 8, 2019 · 2 comments · Fixed by #119
Closed

Silent connection failure #88

ForNeVeR opened this issue Oct 8, 2019 · 2 comments · Fixed by #119
Assignees

Comments

@ForNeVeR
Copy link
Member

ForNeVeR commented Oct 8, 2019

Today was the first time when Emulsion didn't connect to the server properly (probably due to the server errors).

Problems with that:

  1. Emulsion should really really detect such issues
  2. It looks like it was also ignoring the server errors when trying to send the messages

@ForNeVeR please investigate, the incident logs are copied.

@ForNeVeR ForNeVeR self-assigned this Oct 8, 2019
@ForNeVeR
Copy link
Member Author

ForNeVeR commented Oct 17, 2019

Here're the relevant logs:

{"Timestamp":"2019-10-17T06:07:34.2307705+00:00","Level":"Verbose","MessageTemplate":"XMPP stanza ({Direction}): {Stanza}","Properties":{"Direction":"incoming","Stanza":"<presence to=\"[email protected]/82039533-eeff-43b8-a953-e63906a1941c\" from=\"[email protected]/AkhIL\" id=\"78ed6584-846b-46b3-b0c4-ccf8284c03d3\" xml:lang=\"en\" xmlns=\"jabber:client\">\n  <show>xa</show>\n  <status> (Автостатус 'Недоступен' по причине бездействия более 15 мин)</status>\n  <idle xmlns=\"urn:xmpp:idle:1\" since=\"2019-10-17T05:54:50Z\" />\n  <c hash=\"sha-1\" ver=\"JYfsDpnrR9CgE/evuxsDCYpuMSU=\" node=\"http://gajim.org\" xmlns=\"http://jabber.org/protocol/caps\" />\n  <x xmlns=\"vcard-temp:x:update\">\n    <photo>04cd46bb6b5adca47eaa71346172c2fb40c037ac</photo>\n  </x>\n  <x xmlns=\"http://jabber.org/protocol/muc#user\">\n    <item affiliation=\"member\" role=\"participant\" />\n  </x>\n</presence>","EventCategory":"Xmpp"}}
{"Timestamp":"2019-10-17T07:27:40.5316065+00:00","Level":"Debug","MessageTemplate":"Current queue state: {State}","Properties":{"State":"{Messages =\n  seq [OutgoingMessage {author = \"@gsomix\";\n                        text = \"https://www.youtube.com/watch?v=RoGHVI-w9bE\";}];\n ClientReadyToSendMessages = true;}","EventCategory":"Xmpp"}}
{"Timestamp":"2019-10-17T07:27:40.5323071+00:00","Level":"Verbose","MessageTemplate":"XMPP stanza ({Direction}): {Stanza}","Properties":{"Direction":"outgoing","Stanza":"<message id=\"32d3bf88-e35c-4f13-a9fc-46e63aca3c4e\" type=\"groupchat\" to=\"[email protected]\" xmlns=\"jabber:client\">\n  <body>&lt;@gsomix&gt; https://www.youtube.com/watch?v=RoGHVI-w9bE</body>\n</message>","EventCategory":"Xmpp"}}
{"Timestamp":"2019-10-17T07:27:40.5329945+00:00","Level":"Information","MessageTemplate":"Message {MessageId} has been sent; awaiting delivery","Properties":{"MessageId":"32d3bf88-e35c-4f13-a9fc-46e63aca3c4e","EventCategory":"Xmpp"}}
{"Timestamp":"2019-10-17T07:27:41.4990368+00:00","Level":"Verbose","MessageTemplate":"XMPP stanza ({Direction}): {Stanza}","Properties":{"Direction":"incoming","Stanza":"<message id=\"32d3bf88-e35c-4f13-a9fc-46e63aca3c4e\" type=\"error\" to=\"[email protected]/82039533-eeff-43b8-a953-e63906a1941c\" from=\"[email protected]\" xmlns=\"jabber:client\">\n  <body>&lt;@gsomix&gt; https://www.youtube.com/watch?v=RoGHVI-w9bE</body>\n  <error type=\"modify\" code=\"406\">\n    <not-acceptable xmlns=\"urn:ietf:params:xml:ns:xmpp-stanzas\" />\n    <text xmlns=\"urn:ietf:params:xml:ns:xmpp-stanzas\">Только присутствующим разрешается посылать сообщения в конференцию</text>\n  </error>\n</message>","EventCategory":"Xmpp"}}
{"Timestamp":"2019-10-17T07:27:41.5031525+00:00","Level":"Debug","MessageTemplate":"Current queue state: {State}","Properties":{"State":"{Messages = seq [];\n ClientReadyToSendMessages = true;}","EventCategory":"Xmpp"}}

(times below are in UTC unless time zone is specified)

  • at 06:07 we've received a status change from the XMPP user AkhIL
  • at 07:27 we've decided to send a message to the chat from Telegram
  • we received an error that told us that we aren't in the chat, and that's it, the bot never tried to join the chat again

Notable points:

  1. It looks like an error wasn't properly recognized by the bot: it removed the message from the queue even if it has received an error
  2. There weren't any observable XMPP messages between 06:07 and 07:27. The server didn't notified us of an outage.

Notably, I cannot even find anything useful in the server chat logs for that period: last message from тг was sent at 08:24:57+03:00, and at 17:14:43+03:00 is just "enters the room". There were no messages about it leaving the room though, it just suddenly enters as if it weren't there.

Necessary actions here:

  1. Properly detect such errors (see the message templates below) and report them back to the messaging infrastructure.
  2. Add some "ping" message (probably simply a presence to the room from time to time?) that will allow us to receive possible errors earlier (to properly rejoin the room in cases when there're no traffic from Telegram, but we want to observe the XMPP messages)

@ForNeVeR
Copy link
Member Author

Another interesting snippet found in the logs:

[18:55:44 VRB] XMPP stanza (outgoing): <message id="18ccd28b-d288-4ece-b9bf-6da1b4d157af" type="groupchat" to="[email protected]" xmlns="jabber:client">
  <body>&lt;@maxblock&gt; [@maxblock has entered the chat]</body>
</message>
[18:55:44 INF] Message 18ccd28b-d288-4ece-b9bf-6da1b4d157af has been sent; awaiting delivery
[18:57:14 VRB] XMPP stanza (incoming): <message id="18ccd28b-d288-4ece-b9bf-6da1b4d157af" type="error" to="[REDACTED]@codingteam.org.ru/[REDACTED]" from="[email protected]" xmlns="jabber:client">
  <error type="cancel">
    <remote-server-not-found xmlns="urn:ietf:params:xml:ns:xmpp-stanzas" />
    <text xmlns="urn:ietf:params:xml:ns:xmpp-stanzas">Server-to-server connection failed: connection-timeout</text>
  </error>
</message>

We should detect such kind of failure and reconnect to the room. Also, the message should count as failed.

ForNeVeR added a commit that referenced this issue Sep 19, 2020
ForNeVeR added a commit that referenced this issue Sep 19, 2020
ForNeVeR added a commit that referenced this issue Oct 1, 2020
ForNeVeR added a commit that referenced this issue Oct 1, 2020
ForNeVeR added a commit that referenced this issue Oct 1, 2020
@ForNeVeR ForNeVeR mentioned this issue Oct 1, 2020
1 task
ForNeVeR added a commit that referenced this issue Oct 1, 2020
ForNeVeR added a commit that referenced this issue Oct 3, 2020
ForNeVeR added a commit that referenced this issue Oct 3, 2020
ForNeVeR added a commit that referenced this issue Oct 3, 2020
ForNeVeR added a commit that referenced this issue Oct 31, 2020
ForNeVeR added a commit that referenced this issue Oct 31, 2020
ForNeVeR added a commit that referenced this issue Oct 31, 2020
ForNeVeR added a commit that referenced this issue Oct 31, 2020
ForNeVeR added a commit that referenced this issue Oct 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant