Consider using store for serialization? #5

Closed
mgsloan opened this issue Jun 3, 2016 · 3 comments

Comments

@mgsloan

mgsloan commented Jun 3, 2016

Awesome project! One of my first published haskell projects was an attempt at doing protobufs well: https://github.com/mgsloan/sproto

Anyway, if making this fast is a priority, then you might be interested in porting it to store, as it's almost certainly faster than attoparsec.

It's still new, though; mgsloan/store#36 probably ought to be implemented before it's a responsible choice for this lib.

@judah
Collaborator

judah commented Jun 3, 2016

Hey, thanks for the suggestion! Once we've added some benchmarks we can try comparing the performance of different serialization libs. Though I suspect that currently, the main bottleneck is the actual proto-lens code that does the encoding/decoding and the intermediate data structures that it uses.

There are a couple of other features we'd probably need in order to try this out; I couldn't immediately tell from the docs whether store supports them:

  • Detect the end of input (e.g., Data.Attoparsec.ByteString.endOfInput), since the proto encoding doesn't specify its length.
  • Get a substring of a specific length, e.g., Data.Attoparsec.ByteString.take; or, even more convenient, isolate a parse to a specific subsequence of the data, like Data.Binary.Get.isolate. This helps to parse sub-messages.
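The two requirements above can be sketched with Data.Binary.Get, which ships with GHC. This is only an illustration of the behaviour being asked for, not proto-lens code: the one-byte length prefix is a simplifying assumption (real protobuf lengths are varints), and `untilEnd`, `subMessage` and `example` are hypothetical names.

```haskell
import Data.Binary.Get (Get, getWord8, isEmpty, isolate, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Read bytes until the input (or the enclosing isolated region) runs out.
-- No length is known up front, so end-of-input is the only stop signal.
untilEnd :: Get [Word8]
untilEnd = do
  done <- isEmpty
  if done
    then pure []
    else (:) <$> getWord8 <*> untilEnd

-- Assumed framing: one length byte, then that many payload bytes.
-- `isolate` restricts the inner parser to exactly that many bytes.
subMessage :: Get [Word8]
subMessage = do
  len <- getWord8
  isolate (fromIntegral len) untilEnd

-- A sub-message followed by trailing bytes that belong to the outer message.
example :: ([Word8], [Word8])
example = runGet ((,) <$> subMessage <*> untilEnd) (BL.pack [2, 10, 20, 30])
```

Here the inner `untilEnd` stops at the end of the isolated two-byte region, and the outer `untilEnd` picks up the remaining byte. In attoparsec the rough counterparts would be `endOfInput` and `take`, as mentioned above.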

@mgsloan
Author

mgsloan commented Jun 3, 2016

> Detect the end of input (e.g., Data.Attoparsec.ByteString.endOfInput), since the proto encoding doesn't specify its length.

The decode family of functions requires that the Peek action consume exactly the whole input, no more, no less.

> Get a substring of a specific length, e.g., Data.Attoparsec.ByteString.take; or, even more convenient, isolate a parse to a specific subsequence of the data, like Data.Binary.Get.isolate. This helps to parse sub-messages.

Known-size extraction of byte sequences is covered by mgsloan/store#40.

We do have isolate, but it's mostly just a way to skip forward by that many bytes and ensure the inner Peek didn't go beyond them. It's still the responsibility of the inner Peek not to advance beyond the isolated region (but that gets checked).
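To make those two behaviours concrete, here is a toy offset-based parser. It is a hypothetical model, not store's `Peek` type or API: `P`, `byte`, `isolateP` and `decodeP` are invented names. It only mirrors what is described above: decoding must consume exactly the whole input, and the isolate operation lets the inner parser run against the full buffer, checks afterwards that it stayed inside the region, and then skips to the region's end.

```haskell
import Control.Monad (unless, when)
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Toy parser: given the buffer and an offset, return the new offset and a result.
newtype P a = P { runP :: B.ByteString -> Int -> Either String (Int, a) }

instance Functor P where
  fmap f (P g) = P $ \bs i -> fmap (fmap f) (g bs i)

instance Applicative P where
  pure x = P $ \_ i -> Right (i, x)
  P f <*> P g = P $ \bs i -> do
    (j, h) <- f bs i
    (k, x) <- g bs j
    pure (k, h x)

instance Monad P where
  P g >>= f = P $ \bs i -> do
    (j, x) <- g bs i
    runP (f x) bs j

-- Read one byte, failing at the end of the buffer.
byte :: P Word8
byte = P $ \bs i ->
  if i < B.length bs
    then Right (i + 1, B.index bs i)
    else Left "unexpected end of input"

-- Run the inner parser, verify it stayed within n bytes, then skip to the
-- end of the region regardless of how much the inner parser consumed.
isolateP :: Int -> P a -> P a
isolateP n (P g) = P $ \bs i -> do
  let end = i + n
  when (end > B.length bs) $ Left "isolate: region exceeds input"
  (j, x) <- g bs i
  when (j > end) $ Left "isolate: inner parser overran the region"
  pure (end, x)

-- Succeed only if the parser consumed exactly the whole input.
decodeP :: P a -> B.ByteString -> Either String a
decodeP p bs = do
  (j, x) <- runP p bs 0
  unless (j == B.length bs) $ Left "decode: input not fully consumed"
  pure x
```

With this model, `decodeP (isolateP 2 byte)` accepts a two-byte input even though the inner parser reads only one byte, while `decodeP (isolateP 1 (byte >> byte))` is rejected by the overrun check, and `decodeP byte` on a two-byte input is rejected for leaving input unconsumed.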

@judah
Collaborator

judah commented Jan 28, 2019

I'm closing this ticket since we ended up using our own custom parser monad.

Its implementation is fairly similar to that of store. Compare:
http://hackage.haskell.org/package/store-core-0.4.4/docs/Data-Store-Core.html#t:Peek
https://github.com/google/proto-lens/blob/master/proto-lens/src/Data/ProtoLens/Encoding/Parser/Internal.hs#L14

In addition to the issues mentioned above, the main blocker was the lack of support for endian-specific numeric types (mgsloan/store#31).
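As a small illustration of why explicit endianness matters here: protobuf's fixed-width fields (e.g. fixed32) are little-endian on the wire, so a decoder has to read them in that byte order regardless of the host. The sketch below uses Data.Binary.Get rather than store or proto-lens, and the names `fixedBytes`, `asLittleEndian` and `asBigEndian` are hypothetical.

```haskell
import Data.Binary.Get (getWord32be, getWord32le, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32)

-- Four bytes as they might appear on the wire for a fixed32 field.
fixedBytes :: BL.ByteString
fixedBytes = BL.pack [0x01, 0x00, 0x00, 0x00]

-- Little-endian read: least significant byte first, giving 1.
asLittleEndian :: Word32
asLittleEndian = runGet getWord32le fixedBytes

-- Big-endian read of the same bytes gives 0x01000000 (16777216).
asBigEndian :: Word32
asBigEndian = runGet getWord32be fixedBytes
```

The same four bytes decode to different numbers depending on the byte order, which is why a serialization library needs endian-specific primitives before it can be used for a fixed wire format.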

@judah judah closed this as completed Jan 28, 2019