JSON signing not appropriate for interoperability #1248
Comments
I think going the GMSL way is also going to be hard in other languages. I can imagine one way to do it in Rust without writing my own JSON parser, but it's really not at all straightforward, and I don't expect Python's JSON library to provide anything that lets you strip out individual fields while keeping the exact format of others (and it might get harder if you want to strip whitespace inside object/array literals). I agree that this solution might be easier than the second, though.
It's hard in GMSL too; we had that pain in the past, and implemented it by hand because Go's JSON library provides nothing comparable.
Maybe you should update the "Easy:" in the issue description then 😉
I think what you're proposing here is: rather than taking the received JSON, decoding it, and then re-encoding it as Canonical JSON, assume it is already (more-or-less) in the format in which it was signed, so just check the signature against the received bytes. It's not clear to me that we actually need to specify a "canonical" format at all in that case. We're just signing bytes, right? (Glossing over the fact that the signature is embedded in the JSON, which complicates things.) Either way, it's a completely different approach. (somewhat related: #255)
Also: doesn't this still present the problem that different implementations might interpret the same JSON differently (e.g. in the case of duplicate keys, some implementations might honour the first key, and some the last)?
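For a concrete instance of that divergence: Python's built-in `json` module silently honours the *last* occurrence of a duplicate key, so a first-key-wins implementation would decode a different value from the very same signed bytes.

```python
import json

# Python keeps the LAST duplicate key when decoding. An implementation
# that honours the first occurrence would see "alice" instead, yet both
# could successfully verify a signature over the identical raw bytes.
doc = '{"user": "alice", "user": "mallory"}'
print(json.loads(doc))  # {'user': 'mallory'}
```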
I think we will still need the sorting of keys and the stripping of whitespace as a minimum, and we still have to be able to modify the JSON to strip or insert the signature-related keys. I'm mostly thinking this because, in Go for example, we can decode JSON without parsing the values (using `json.RawMessage`). I would fully expect plenty of other JSON implementations to have their own unique ordering behaviours too, which will need to be worked around by a post-encoding sort step either way.
If you were to rely on round-tripping via a JSON implementation to sort keys and strip whitespace then possibly (although round-tripping is admittedly what got us into this mess to begin with). GMSL, for example, takes a different approach by not round-tripping, and instead manipulating the JSON by hand to sort keys and strip whitespace, because that was the only way to make up for the lack of key sorting.
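That by-hand approach can be sketched in Python too (a hypothetical helper, not GMSL's actual code; it assumes a top-level object, leaves duplicate keys and whitespace *inside* values alone). The trick is using `raw_decode` only to find where each key and value ends, so their original bytes are kept verbatim rather than re-encoded:

```python
import json

def canonicalise_top_level(raw: str) -> str:
    """Sort top-level keys and drop top-level whitespace while keeping
    each key's and value's original bytes verbatim. A sketch of the
    non-round-tripping approach, for flat handling of objects only."""
    dec = json.JSONDecoder()
    i = raw.index("{") + 1
    pairs = []
    while True:
        while raw[i] in " \t\r\n,":        # skip separators and whitespace
            i += 1
        if raw[i] == "}":
            break
        kstart = i
        key, i = dec.raw_decode(raw, i)    # parse the key...
        kraw = raw[kstart:i]               # ...but keep its raw bytes
        i = raw.index(":", i) + 1
        while raw[i] in " \t\r\n":
            i += 1
        vstart = i
        _, i = dec.raw_decode(raw, i)      # only used to find the value's end
        pairs.append((key, kraw, raw[vstart:i]))
    pairs.sort(key=lambda p: p[0])         # sort by the decoded key
    return "{" + ",".join(f"{k}:{v}" for _, k, v in pairs) + "}"

print(canonicalise_top_level('{ "b": {"x": 1}, "a": 2 }'))
# {"a":2,"b":{"x": 1}}  -- note the nested value's bytes are untouched
```

Note how the whitespace inside the nested object survives, which is exactly the kind of detail a spec would have to pin down.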
Do we? I can imagine an algorithm in which we strip out those keys without properly decoding and reencoding the JSON.
I'm confused, on two fronts.
That's perhaps true but it may not be very future-proof. If an implementation wanted to shortcut in that way then it shouldn't matter as long as the end result is still right.
GMSL can and does canonicalise JSON without round-tripping it through Go's native types.
That's really just a spec precision issue. If the canonicalisation has to be implemented by hand to make it possible for all of us to agree on something, then we just need to be incredibly precise about what we're agreeing on: what constitutes the correct lexicographical ordering, the precedence for duplicate keys, that values are to be left alone, that very specific whitespace characters are to be stripped, and so on.
mmkay, though I'd argue that the difference between this approach and a "real" round-trip is vanishingly small. Effectively you're reading the JSON with the values kept as raw, undecoded bytes.
Well, I guess so, but then it's unclear what problem you're hoping to solve in this issue. If we still have to spec everything incredibly tightly, what do we gain by replacing JSON signing with something else?
We currently attempt to hash and/or sign Canonical JSON in a number of places. Implementations (both clients and servers) are expected to re-canonicalise JSON that they receive in order to verify those hashes or signatures, but there are a number of places where JSON implementations might vary in their behaviour, e.g. key ordering, duplicate-key precedence, string escaping and number formatting.
As such, unless every implementation is also able/willing to implement their own JSON encoders/decoders to match very specific differences in behaviour without also introducing new bugs, we run into a problem where different implementations written in different languages are going to arrive at different ideas of what a "canonicalised" JSON blob looks like.
These differences in encoding will result in different hashes or signatures, causing genuinely valid payloads to be rejected as invalid. As a result, guaranteeing 100% interoperability between any two implementations written in different languages is practically impossible.
We have already seen this happen quite often in Dendrite/GMSL: GMSL specifically does not round-trip values when canonicalising JSON (instead just stripping whitespace and sorting keys), which differs from the full round-trip of the entire JSON that Synapse does, so we end up with unexpected signature errors.
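The divergence is easy to demonstrate with a minimal Python sketch (illustrative only, not Synapse's actual code): a full round-trip normalises details such as string escapes, while a strip-and-sort pass over the original bytes would preserve them, so the two strategies can disagree byte-for-byte on the same input.

```python
import json

def round_trip_canonicalise(raw: str) -> str:
    # Decode and fully re-encode: sorted keys, no whitespace, raw UTF-8.
    return json.dumps(json.loads(raw), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False)

# Byte-wise different inputs that decode to the same value:
a = '{"name": "caf\\u00e9", "id": 1}'   # \u escape for "é"
b = '{"id": 1, "name": "café"}'         # literal UTF-8 "é"

print(round_trip_canonicalise(a) == round_trip_canonicalise(b))  # True
# A canonicaliser that only strips whitespace and sorts keys would keep
# the \u00e9 escape in the first input, so its output for `a` would
# differ byte-for-byte, and a signature over one form would not verify
# against the other.
```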
Go's JSON library does not provide equivalents to `sort_keys` or `ensure_ascii` either, and it isn't clear that it would demonstrate the same behaviours as Python even if it did. The same types of problems will exist in other implementations or language standard libraries.

Ideally we need one of two things:
(created from #1232)