I'm writing to a format which consists of length-delimited proto messages (TensorFlow's TFRecord format) and I was having trouble with getting my output read correctly. Since I was writing in a stream, I used compute_size to get the length and write it to the stream before I write the actual data. Unfortunately, I noticed that the length returned was smaller than the actual number of bytes written.
After some debugging, I tracked it down to the following line (in this particular case, it's a packed repeated field of 32-bit floats):
Notice that my_size is the size of the entire message, in bytes. A length-delimited field consists of a tag (hence the 1), the length in bytes of the data it contains, and then the data itself (value.len() floats of 4 bytes each). The error is that this function computes the size of the varint that would encode the number of elements in the list, when it should compute the size of the varint that encodes the number of bytes the list's contents occupy:
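To illustrate the difference, here is a minimal hand-written sketch. It is not the generated code: `varint_size`, `packed_float_size_buggy` and `packed_float_size_fixed` are hypothetical helper names, and the 1-byte tag assumes a field number small enough to fit in a single byte.

```rust
/// Number of bytes needed to encode `value` as a protobuf varint
/// (each byte carries 7 bits of payload).
fn varint_size(mut value: u64) -> u64 {
    let mut size = 1;
    while value >= 0x80 {
        value >>= 7;
        size += 1;
    }
    size
}

/// Behaviour described in the report: the length prefix is sized
/// from the number of elements. (Hypothetical helper, 1-byte tag assumed.)
fn packed_float_size_buggy(values: &[f32]) -> u64 {
    let count = values.len() as u64;
    1 + varint_size(count) + count * 4
}

/// Expected behaviour: the length prefix is sized from the number of
/// payload bytes. (Hypothetical helper, 1-byte tag assumed.)
fn packed_float_size_fixed(values: &[f32]) -> u64 {
    let payload_bytes = values.len() as u64 * 4;
    1 + varint_size(payload_bytes) + payload_bytes
}
```

Under these assumptions the two results agree for short lists and diverge once the byte count needs a longer varint than the element count does. With 32 floats, for example, the element count (32) fits in one varint byte while the payload length (128) needs two, so the element-based size comes out one byte short.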
I've fixed this by hand in my generated files, since I only have a couple of lists in the protos I'm using, so I don't know where the bug is in the actual codegen, but hopefully this will allow someone to fix it.