Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch failures to contact remote homeserver for /register #456

Merged
merged 7 commits into from
Nov 8, 2021

Conversation

DMRobertson
Copy link
Contributor

@DMRobertson DMRobertson commented Nov 3, 2021

Make it and the other "the remove homeserver did something silly" cases explicitly return 500. Log a warning message so that we can see the problem in logs, but not in sentry. This addresses the bulk of the errors spotted by matrix-org/matrix-spec#40.

The spec has nothing to say about status codes in the failure case. (matrix-org/matrix-doc#3470)

@DMRobertson DMRobertson requested a review from a team as a code owner November 3, 2021 14:37
@DMRobertson DMRobertson marked this pull request as draft November 3, 2021 14:48
@DMRobertson
Copy link
Contributor Author

Another sentry error comes from this request timing out. I'll extend this PR to handle that case too.

@DMRobertson DMRobertson changed the title Catch DNS lookup failure in registerservlet Catch failures to contact remote homeserver for /register Nov 3, 2021
@DMRobertson DMRobertson force-pushed the dmr/sentry/register-report-bad-gateway branch from 40ebff3 to c1c6067 Compare November 3, 2021 19:57
@DMRobertson DMRobertson marked this pull request as ready for review November 3, 2021 20:01
Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable overall!

@@ -1,8 +1,9 @@
from typing import BinaryIO, Optional, Type, TypeVar
from typing import BinaryIO, Optional, Sequence, Type, TypeVar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably include license headers in the stubs files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm---possibly? They feel more like metadata derived from twisted rather than original creation of ourselves. Not sure what the right thing to do from a licensing perspective is there.

Comment on lines 81 to 84
return {
"errcode": "M_UNKNOWN",
"error": "The Matrix homeserver did not include 'sub' in its response",
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clients don't usually expect JSON bodies for 500 errors so something feels off to me?

I guess this is similar to what we had before but now we'll not error to sentry?

Do we want to log a warning or anything in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clients don't usually expect JSON bodies for 500 errors so something feels off to me?

Very possibly. I'm just trying to replicate what we had before (and also try to be a good citizen and provide error context).

I guess this is similar to what we had before but now we'll not error to sentry?

That's right. I didn't think it merited a sentry warning since it's not a problem in our application---it's duff data from the far end.

I think a warning would be prudent so that we'll have something in the logs to diagnose this. I added something for this on line 72. I'll add something for this and the cases below.

tests/test_register.py Outdated Show resolved Hide resolved
sydent/http/servlets/registerservlet.py Outdated Show resolved Hide resolved
David Robertson added 2 commits November 5, 2021 14:05
No more exceptions. Let's log something so it's not silent.
@DMRobertson DMRobertson requested a review from clokep November 5, 2021 14:22
@DMRobertson DMRobertson force-pushed the dmr/sentry/register-report-bad-gateway branch from ffbbc6c to c23fc62 Compare November 5, 2021 15:59
Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced about the 500 error, but I think at need clarification from the spec side there.

@DMRobertson
Copy link
Contributor Author

I'm not convinced about the 500 error, but I think at need clarification from the spec side there.

FWIW I'm not very happy with it either, but at least this is consistent with the existing behaviour.

@DMRobertson DMRobertson merged commit 001c046 into main Nov 8, 2021
@DMRobertson DMRobertson deleted the dmr/sentry/register-report-bad-gateway branch November 8, 2021 17:36
DMRobertson pushed a commit that referenced this pull request Nov 9, 2021
Like #456, but for another error case.

If the remote homeserver fails to give us sensible response, that's not
a problem in our application. We can continue to serve everyone else. We
should tell the requester what happened and log a warning, but not an
error to Sentry.

Most of the examples I saw in Sentry were 404 responses serving up a
generic "page not found" body.
DMRobertson pushed a commit that referenced this pull request Nov 10, 2021
* Catch failures to decode JSON in /register

Like #456, but for another error case.

If the remote homeserver fails to give us sensible response, that's not
a problem in our application. We can continue to serve everyone else. We
should tell the requester what happened and log a warning, but not an
error to Sentry.

Most of the examples I saw in Sentry were 404 responses serving up a
generic "page not found" body.
DMRobertson pushed a commit that referenced this pull request Nov 11, 2021
Addresses
https://sentry.matrix.org/sentry/sydent/issues/235004/?query=is%3Aunresolved%20event.timestamp%3A%3E%3D2021-11-03T10%3A00%3A00

That issue is rare---it's only occurred twice---but since it's so close
to #456 I thought I may as well throw it in too.

(I suppose this might encounter the problem in #463 as well, though we
haven't seen it in Sentry. I think unbinds are just much less used that
registers. I don't know if there's a better way to handle all this
duplication though?)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants