Potential client/server state mismatch bugs #7843
Comments
#7838 too. We need to spend some time on this. |
Okay... this seems sorta related, but it undermines the idea that this is a client/server state mismatch thing: |
what do you mean by "client/server state mismatch", and what makes you think that these issues are related to it? |
"Missing room name, fixed by clear cache" seems like another example of this, I think. |
I dug into a bunch of logs to try and figure out where #8136 might be happening and have had very little success.
The logs don't give a definitive answer. Within this data set, though, the issue does appear to be more likely if you encounter database problems (corruption, full, etc.) or if you get gappy syncs. This may just be confirmation bias: the logs show these problems consistently, but they cannot be proven to be the cause as of yet. I'd generally encourage people to submit more rageshakes for more data points.

To expand on other data points for this issue's related issues: room complexity, in terms of state and auth chain events, does not appear to affect the probability of clashes happening. Given that clients are apparently running into database problems and possible gappy syncs, I'm inclined to believe that these issues happen more often than it seems but are only noticed in high-profile rooms. This is based on some of the reports coming from relatively tiny rooms (20-30 people, nothing particularly interesting in the room state) as well as massive rooms (HQ, #synapse, etc.).

We are probably still suffering state resets that cause client state to get purged, and perhaps that explains some of the "no issues found" reports above, but I do believe that ~50% of the problem is our fault as a client. |
ftr I spent a couple hours going through other rageshakes to hunt down ones that might not be associated with the set of issues here. Found nothing of real interest, but did find a bunch of trends. |
https://github.com/matrix-org/riot-web-rageshakes/issues/1328 has sync timeouts and other sync related errors on matrix.org - this might be more evidence that gappy syncs are indeed the problem. |
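For anyone checking whether a given sync was gappy: in the Matrix client-server API, a joined room's `timeline.limited` flag in the `/sync` response is `true` when the server truncated the timeline, i.e. there is a gap between what the client already had and what it just received. Below is a minimal sketch of spotting those rooms. The `SyncResponse` type is trimmed down to just the fields used here, and `findGappyRooms` is a hypothetical helper name, not something that exists in riot-web or matrix-js-sdk.

```typescript
// Trimmed-down shape of a /sync response: only the fields this sketch reads.
interface SyncResponse {
  rooms?: {
    join?: Record<string, { timeline?: { limited?: boolean; prev_batch?: string } }>;
  };
}

// Hypothetical helper: returns the IDs of joined rooms whose timeline was
// truncated ("gappy") in this sync response.
function findGappyRooms(sync: SyncResponse): string[] {
  const joined = sync.rooms?.join ?? {};
  return Object.entries(joined)
    .filter(([, room]) => room.timeline?.limited === true)
    .map(([roomId]) => roomId);
}
```

Logging the returned room IDs alongside each sync would make it easier to line up gappy syncs with the rooms that later show broken state in a rageshake. |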
#9756 seems like another example. |
I am not convinced this sprawling meta issue has value at the moment... It's unlikely we would tackle these all together or that they would have a single solution. I have tagged the related open issues with a new label. For this meta issue, I think I'll go ahead and close it for now. |
My spider sense is tingling about these bugs:
#7526 (comment) - The github issue is resolved but I don't think we've addressed this comment in particular
#7745 - Some matrix.org users can't join #matrix:matrix.org - CORS request rejected
#7775 - Desktop app is hiding one of my rooms from me!
#7800 - Sending a message into a room fails with CORS rejected while doing a /members query
#7790 - Sometimes legit invitees cannot write in a room
Maybe:
#7352 - Joining a room you've previously left (in the same session) shows an infinite spinner
They all smell like a client/server state mismatch not being recovered from gracefully.