Recommend proto3 #465
Comments
Isn't that potentially a problem? If I explicitly set a field, and the value just happens to coincide with the default value, I'd want it to be serialized, no?
EDIT: I might be misunderstanding, does this only apply to non-optional fields? That would make sense then, since if a field is non-optional (= required), there's no way it could have been left empty.
Yes - more correctly it applies to On deserialization to their object form both On serialization Consequently |
Yup, here are some example tests: https://github.com/MarcoPolo/proto2and3-playground/blob/main/src/main.rs#L122. proto3 optional bytes behave exactly like proto2 optional bytes.
A couple of things that may help clarify (I'm talking about proto3 here):
With the above, hopefully it's clear there is no "required" notion in proto3. A field that is not marked |
Not failing decode on missing required fields is not a good thing: now you need extra logic in the client for this, and it is a giant footgun.
You probably already have some logic in the client side checking if the value is even appropriate. Having required fields makes backwards/forwards compatibility hard because a required field is forever. This isn't just my opinion, there's a lot written about this:
from: https://developers.google.com/protocol-buffers/docs/proto#specifying-rules And more:
Thanks @MarcoPolo for the research and the elaborate description. One thing to note is that proto3 only supports I am fine with libp2p moving to proto3. I am sorry for being the source of the confusion on presence in proto2 and proto3. |
No need to be sorry. This is not easy to reason about. I don't think we should pay a lot of attention to what Debian does. I don't see any good justification for their focus on "stability" (aka outdated software). For example, they ship with Go 1.15, which was released in Aug 2020 and has been unmaintained for more than a year now.
I'm wondering if we should move all of our existing protobufs to proto3 as well. It would be nice to be consistent across our entire stack, and we could get rid of proto2 compiler dependencies. We'd have to check that this can be done in a backwards-compatible way in all our protocols though.
It certainly is not. I discovered the other day that the official protobuf.js module doesn't handle default values properly when deserializing "singular" fields so even Google don't get it right sometimes and it's their spec.
I've been doing this with the js stack as we took the decision to only support proto3 in protons and it's mostly been ok. One oddity is when a proto2 field has been marked as 🤪 |
Is it correct to think about it in the following way:
Yes, but with one gotcha that the default value for Message fields is for them to be unset, the exact value of which is language-dependent.
According to the spec all fields should be set to their default value upon deserialization (even If the field is marked If the field is |
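To make the gotcha about message fields concrete, here is a minimal sketch. The types `Inner` and `Outer` are made up for illustration and written by hand; they only mirror the general shape a proto3 code generator such as prost tends to produce, where scalar fields are plain values and nested message fields are wrapped in an `Option`.

```rust
// Hypothetical hand-written types; not generated code.

#[derive(Debug, Default, PartialEq)]
struct Inner {
    value: u32,
}

#[derive(Debug, Default, PartialEq)]
struct Outer {
    count: u32,           // singular scalar: absent on the wire reads as 0
    name: String,         // singular string: absent on the wire reads as ""
    inner: Option<Inner>, // message field: absent on the wire reads as "unset"
}

/// Stand-in for decoding an empty byte string: no field is on the wire,
/// so every field takes its default value.
fn decode_empty() -> Outer {
    Outer::default()
}
```

With these types, `decode_empty().inner` is `None`, which is distinguishable from `Some(Inner::default())`, while `count` and `name` cannot tell "unset" apart from "set to the default".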
Allows future versions to deprecate and remove fields in a two step process. For more details see #465 (comment)
There seems to have been a misunderstanding in the past around proto2 vs proto3. My attempt here is to clear up the confusion, recommend proto3 in general, and explain why proto3 should be preferred.
Our main confusion is about field presence. That is, if a field is omitted from the serialized wire format, can the user of the decoded message tell whether the field was unset or was set to its default value? This document has a lot of good information and is worth the read: https://github.com/protocolbuffers/protobuf/blob/main/docs/field_presence.md
Origins of the confusion
Proto2 would always serialize an explicitly set field, even if it was set to the default. This meant that you could know on the decoding side whether the field was set or not. This is called Explicit Presence. For example, in the Rust protobuf compiler, it would wrap these in Option<T>: https://github.com/tokio-rs/prost#field-modifiers.
The confusing thing is that the language guide for proto2 states:
The subtlety here is that this doesn't say anything about "hasField" accessors, which may be provided by the implementation to check if the field was set or not. This is essentially what prost is doing with Option<T> types.
Another confusing thing is that this language guide doesn't mention "presence" a single time, which is what we're talking about here.
In proto3, if a field was set to its default value it would not be serialized. This meant that the decoding side wouldn't know if the field was omitted because it was unset or because it was the default value. This is called No Presence.
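The difference between Explicit Presence and No Presence can be sketched with a hand-rolled encoder for a message that has a single uint32 field with field number 1. This is not what prost or protoc generate; the function names are hypothetical and the code only mimics the wire rules described above.

```rust
/// Tag byte for field number 1 with wire type 0 (varint): (1 << 3) | 0.
const FIELD_1_VARINT_TAG: u8 = 0x08;

/// Append `value` to `out` as a base-128 varint.
fn put_varint(mut value: u32, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Explicit presence: a set field is always written, even when it equals the default.
fn encode_explicit(field: Option<u32>) -> Vec<u8> {
    let mut out = Vec::new();
    if let Some(v) = field {
        out.push(FIELD_1_VARINT_TAG);
        put_varint(v, &mut out);
    }
    out
}

/// No presence: the default value (0) is never written.
fn encode_no_presence(field: u32) -> Vec<u8> {
    let mut out = Vec::new();
    if field != 0 {
        out.push(FIELD_1_VARINT_TAG);
        put_varint(field, &mut out);
    }
    out
}

/// Read field 1 if it is on the wire. Only understands this one-field message.
fn read_field_1(bytes: &[u8]) -> Option<u32> {
    let (&tag, rest) = bytes.split_first()?;
    if tag != FIELD_1_VARINT_TAG {
        return None;
    }
    let mut value = 0u32;
    // A u32 varint is at most 5 bytes long.
    for (i, &b) in rest.iter().enumerate().take(5) {
        value |= ((b & 0x7f) as u32) << (7 * i);
        if b & 0x80 == 0 {
            return Some(value);
        }
    }
    None
}

/// Explicit presence keeps "unset" (None) apart from "set to 0" (Some(0)).
fn decode_explicit(bytes: &[u8]) -> Option<u32> {
    read_field_1(bytes)
}

/// No presence collapses "unset" and "set to 0" into the same value.
fn decode_no_presence(bytes: &[u8]) -> u32 {
    read_field_1(bytes).unwrap_or(0)
}
```

Under these assumptions, `encode_explicit(Some(0))` produces the two bytes `08 00` while `encode_explicit(None)` and `encode_no_presence(0)` both produce an empty byte string, which is why only the explicit-presence decoder can report whether the sender set the field.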
Field Presence Proto2 vs Proto3
To clarify field presence in proto2 vs proto3:
From https://github.com/protocolbuffers/protobuf/blob/main/docs/field_presence.md#presence-in-proto2-apis
Proto2
Proto3
optional
Advantages in Proto3 compared to Proto2
required modifier
Next steps
README.md