Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Failures to download signing keys don't follow a sensible retry schedule? #5413

Closed
ara4n opened this issue Jun 10, 2019 · 3 comments
Closed

Comments

@ara4n
Copy link
Member

ara4n commented Jun 10, 2019

I'm looking for scenarios where my server aggressively hammers servers which are unavailable, rather than backing off nicely. One such scenario seems to be (based on grepping arasphere's logs for nkli.net):

2019-06-09 05:50:27,608 - synapse.http.matrixfederationclient - 384 - INFO - PUT-171547 - {GET-O-1843} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 05:50:27,623 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-171547 - Connecting to riot.nkli.net:8448
2019-06-09 05:50:27,954 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-171547 - {GET-O-1843} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 05:50:28,439 - synapse.http.matrixfederationclient - 384 - INFO - PUT-171547 - {GET-O-1843} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 05:50:28,441 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-171547 - Connecting to riot.nkli.net:8448
2019-06-09 05:50:28,768 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-171547 - {GET-O-1843} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 05:50:29,592 - synapse.http.matrixfederationclient - 384 - INFO - PUT-171547 - {GET-O-1843} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 05:50:29,593 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-171547 - Connecting to riot.nkli.net:8448
2019-06-09 05:50:29,919 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-171547 - {GET-O-1843} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 05:50:32,005 - synapse.http.matrixfederationclient - 384 - INFO - PUT-171547 - {GET-O-1843} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 05:50:32,006 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-171547 - Connecting to riot.nkli.net:8448
2019-06-09 05:50:32,331 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-171547 - {GET-O-1843} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 05:50:32,331 - synapse.util.retryutils - 193 - INFO - PUT-171547 - Connection to nkli.net was unsuccessful (<class 'synapse.api.errors.RequestSendFailed'>(Failed to send request: ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.)); backoff now 15000000
2019-06-09 05:50:32,333 - synapse.crypto.keyring - 716 - WARNING - PUT-171547 - Failed to download keys for nkli.net: KeyLookupError Failed to connect to remote server
2019-06-09 05:50:32,333 - synapse.federation.transport.server - 278 - WARNING - PUT-171547 - authenticate_request failed: 502: Error downloading keys for nkli.net
2019-06-09 05:50:32,334 - synapse.http.server - 85 - INFO - PUT-171547 - <SynapseRequest at 0x7f1e2b7df4e0 method='PUT' uri='/_matrix/federation/v1/send/1558773746107' clientproto='HTTP/1.1' site=8448> SynapseError: 502 - Error downloading keys for nkli.net
2019-06-09 06:15:28,323 - synapse.http.matrixfederationclient - 384 - INFO - PUT-173051 - {GET-O-1845} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 06:15:28,323 - synapse.http.federation.matrix_federation_agent - 323 - INFO - PUT-173051 - Fetching https://nkli.net/.well-known/matrix/server
2019-06-09 06:15:28,605 - synapse.http.federation.matrix_federation_agent - 339 - INFO - PUT-173051 - Error fetching https://nkli.net/.well-known/matrix/server: Non-200 response 404
2019-06-09 06:15:28,625 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-173051 - Connecting to riot.nkli.net:8448
2019-06-09 06:15:28,957 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-173051 - {GET-O-1845} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 06:15:29,625 - synapse.http.matrixfederationclient - 384 - INFO - PUT-173051 - {GET-O-1845} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 06:15:29,626 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-173051 - Connecting to riot.nkli.net:8448
2019-06-09 06:15:29,952 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-173051 - {GET-O-1845} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 06:15:31,047 - synapse.http.matrixfederationclient - 384 - INFO - PUT-173051 - {GET-O-1845} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 06:15:31,047 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-173051 - Connecting to riot.nkli.net:8448
2019-06-09 06:15:31,374 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-173051 - {GET-O-1845} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 06:15:33,220 - synapse.http.matrixfederationclient - 384 - INFO - PUT-173051 - {GET-O-1845} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 06:15:33,223 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-173051 - Connecting to riot.nkli.net:8448
2019-06-09 06:15:33,550 - synapse.http.matrixfederationclient - 465 - WARNING - PUT-173051 - {GET-O-1845} [nkli.net] Request failed: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ: ConnectionRefusedError('Connection refused',)
2019-06-09 06:15:33,550 - synapse.util.retryutils - 193 - INFO - PUT-173051 - Connection to nkli.net was unsuccessful (<class 'synapse.api.errors.RequestSendFailed'>(Failed to send request: ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.)); backoff now 0
2019-06-09 06:15:33,552 - synapse.crypto.keyring - 716 - WARNING - PUT-173051 - Failed to download keys for nkli.net: KeyLookupError Failed to connect to remote server
2019-06-09 06:15:33,552 - synapse.federation.transport.server - 278 - WARNING - PUT-173051 - authenticate_request failed: 502: Error downloading keys for nkli.net
2019-06-09 06:15:33,553 - synapse.http.server - 85 - INFO - PUT-173051 - <SynapseRequest at 0x7f1e267aa4a8 method='PUT' uri='/_matrix/federation/v1/send/1558773746169' clientproto='HTTP/1.1' site=8448> SynapseError: 502 - Error downloading keys for nkli.net
2019-06-09 07:09:26,607 - synapse.http.matrixfederationclient - 384 - INFO - PUT-176747 - {GET-O-1849} [nkli.net] Sending request: GET matrix://nkli.net/_matrix/key/v2/server/ed25519%3Aa_SbHJ; timeout 60.000000s
2019-06-09 07:09:26,623 - synapse.http.federation.matrix_federation_agent - 373 - INFO - PUT-176747 - Connecting to riot.nkli.net:8448

Now, it says backoff now 15000000 (4 hours), but in practice it tries again 25 minutes later. This is quite reminiscent of #1404.

@richvdh
Copy link
Member

richvdh commented Jun 12, 2019

Hum. If this log was captured on a 1.0 rc, I'm concerned. Can you confirm the version, please?

@ara4n
Copy link
Member Author

ara4n commented Jun 12, 2019

will doublecheck. it's possible i did something stupid like git pull on master and expect the 1.0-rc to turn up. i'm definitely on 1.0.0 now, but the server's been down so don't have good logs yet.

@richvdh
Copy link
Member

richvdh commented Jun 18, 2019

Ok, I'm going to assume, in the absence of evidence to the contrary, that these logs were gathered on a 0.99.x version; there are therefore a bunch of red herrings there.

However it is still the case that, as the subject of the issue says: "Failures to download signing keys don't follow a sensible retry schedule". They don't, and we deliberately ignore the per-server federation backoff for them. We can therefore end up putting a lot of effort into trying to fetch unavailable keys.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants