Implement UCAN-IPLD-coherent serialization/deserialization #20

Closed
expede opened this issue Sep 15, 2022 · 10 comments · Fixed by #28
Assignees
Labels
enhancement New feature or request

Comments

@expede
Member

expede commented Sep 15, 2022

IMO (feel free to disagree), following Postel's Law, UCAN libraries should be rigid in what they produce, and liberal in what they accept. To this end, it would be amazing if this library produced canonicalized UCANs compatible with ucan-ipld.

This gets us a few things:

  • Slick tooling (over time)
  • Storage savings from content addressing / deduplication / binary data
  • Conversion of UCANs into arbitrary formats (e.g. machine-friendly, PB, CBOR, Arrow, etc.)

We will still need to read in arbitrary JWTs, but the more canonicalized UCANs that exist, the better a world we live in 😉
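
As a rough illustration of why canonical output helps with deduplication, here is a minimal Python sketch. It is not the ucan-ipld encoding: `canonical_bytes` and `content_id` are hypothetical helpers, the DIDs are placeholders, and a SHA-256 hex digest stands in for a real content identifier. It only assumes that "canonical" means sorted keys and no insignificant whitespace.

```python
import hashlib
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace (assumed rules)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def content_id(data: bytes) -> str:
    """Stand-in for a real content identifier: a plain SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


# Two logically identical payloads, built with different key order.
a = {"iss": "did:key:alice", "aud": "did:key:bob", "exp": 1663200000}
b = {"exp": 1663200000, "aud": "did:key:bob", "iss": "did:key:alice"}

# A toy content-addressed store: identical canonical bytes share one entry.
store = {}
for payload in (a, b):
    data = canonical_bytes(payload)
    store[content_id(data)] = data

print(len(store))  # prints 1
```

Because both payloads serialize to the same bytes, they land under the same key and are stored once.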

@cdata cdata self-assigned this Sep 21, 2022
@cdata
Member

cdata commented Sep 21, 2022

This is good timing, as I need these qualities to exist this very week. On it!

@cdata cdata added the enhancement New feature or request label Sep 21, 2022
@cdata cdata changed the title from "Canonicalize by default" to "Implement UCAN-IPLD-coherent serialization/deserialization" Sep 21, 2022
@expede
Member Author

expede commented Sep 21, 2022

@cdata we're also going to split out the canonicalization spec into its own spec, so there's no actual dependency on ucan-ipld per se. If you happen to want to implement ucan-ipld I certainly won't stop you 😉

Out of curiosity, what's the use case for later this week?

@cdata
Member

cdata commented Sep 21, 2022

I'm adding multi-device support (including revocations) and need canonical IPLD representations for UCANs that are stored in our DAG.

It's probably easiest for us to do it all in one go, but I'll take the allowance to constrain the scope under advisement 🙏

While I've got you:

There seems to be some nuance (or perhaps ambiguity) between how nb and fct are defined, but I can't tell if that is deliberate or perhaps unintended. Here is the relevant spec text: https://github.com/ucan-wg/ucan-ipld/blob/3d05c5710cebff35a7c77759d4e031c943583400/README.md?plain=1#L104-L118

Are a capability's nb and ucan's fct both maps? If so, does "map" imply, in the context of this spec, that all values have the same type? Or is nb actually meant to be a structure of arbitrary shape?

@expede
Member Author

expede commented Sep 21, 2022

> need canonical IPLD representations for UCANs that are stored in our DAG.

Ah interesting! You don't want to store them as raw bytes for granular deduplication reasons?

> Are a capability's nb and ucan's fct both maps?

Ah, indeed that could be clearer in the core spec 💯

  • nb is an arbitrarily shaped map
  • fct is an array of JSON

@expede
Member Author

expede commented Sep 21, 2022

The reasoning is that since nb is scoped to a capability, it can define its own semantics. fct is one level higher, so may need to serve many functions, and thus could have duplicate field names.
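
To make the two shapes concrete, here is a small Python sketch. The field layout is an assumption for illustration: capabilities are placed under an `att` array with `with`/`can`/`nb` keys, and every value is a placeholder. `check_shapes` is a hypothetical helper, not part of any UCAN library, and it only checks container types.

```python
from typing import Any


def check_shapes(payload: dict[str, Any]) -> None:
    """Raise ValueError if fct is not an array or any nb is not a string-keyed map."""
    fct = payload.get("fct", [])
    if not isinstance(fct, list):
        raise ValueError("fct must be an array")
    for cap in payload.get("att", []):
        nb = cap.get("nb", {})
        if not isinstance(nb, dict) or not all(isinstance(k, str) for k in nb):
            raise ValueError("nb must be a map with string keys")


payload = {
    "att": [
        {
            "with": "example://photos",
            "can": "crud/read",
            # nb: one map per capability; values may differ in type.
            "nb": {"max_size": 1024, "tags": ["public"], "owner": {"name": "alice"}},
        }
    ],
    # fct: an array, so separate entries can reuse the same field name.
    "fct": [
        {"note": "first fact"},
        {"note": "second fact, same field name"},
    ],
}

check_shapes(payload)  # passes silently for the payload above
```

Keeping fct as an array means two unrelated facts can both use a field like `note` without colliding, which a single top-level map could not express.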

@cdata
Member

cdata commented Sep 21, 2022

> Ah interesting! You don't want to store them as raw bytes for granular deduplication reasons?

Maybe I do? Tell me more about how raw bytes gets me more granularity!

> nb is an arbitrarily shaped map

Just to be especially clear: does arbitrarily shaped in this case mean 1) any type for key and/or value (but all keys have the same type and all values have the same type), or 2) it is a structure of any shape (e.g., keys are fields and each field may have a different type as its value)?

@cdata
Member

cdata commented Sep 21, 2022

> fct is an array of JSON

Does this have the potential to foul-up the canonical representation? Don't all the things in fct have to have a canonical representation as well?

@expede
Member Author

expede commented Sep 21, 2022

> Maybe I do? Tell me more about how raw bytes gets me more granularity!

To start: there's nothing wrong with encoding to IPLD. I'm just curious how much you need that level of structure.

Instead of encoding it to IPLD, you could treat a UCAN as a blob of bytes, same as an image or a text document. You still get content addressing etc., but not the ability to re-encode the same UCAN to (e.g.) CBOR and back. Canonicalization is extremely rigid about key ordering, whitespace, and capitalization precisely to make that round trip possible.
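
The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under assumptions, not the spec's encoding: `canonical_bytes` and `digest` are hypothetical helpers, the payloads are placeholders, and "canonical" here only means sorted keys with no extra whitespace.

```python
import hashlib
import json


def canonical_bytes(value) -> bytes:
    """Assumed canonical form: sorted keys, no insignificant whitespace."""
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(data: bytes) -> str:
    """Stand-in for a content address: SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


# The same logical payload as two raw encodings (key order and whitespace differ).
raw_1 = b'{"iss":"did:key:alice","aud":"did:key:bob"}'
raw_2 = b'{ "aud": "did:key:bob", "iss": "did:key:alice" }'

# Treated as opaque blobs, each encoding gets its own address.
blob_ids = {digest(raw_1), digest(raw_2)}

# Decoded and re-encoded canonically, both collapse to one address.
canonical_ids = {digest(canonical_bytes(json.loads(r))) for r in (raw_1, raw_2)}

print(len(blob_ids), len(canonical_ids))  # prints "2 1"

# Round trip: decoding canonical bytes and re-encoding reproduces them exactly.
once = canonical_bytes(json.loads(raw_2))
twice = canonical_bytes(json.loads(once))
```

With opaque blobs the address depends on the exact bytes that arrived; with a canonical form it depends only on the decoded content, so `once` and `twice` are byte-identical.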

@expede
Member Author

expede commented Sep 21, 2022

> Does this have the potential to foul-up the canonical representation? Don't all the things in fct have to have a canonical representation as well?

They have to follow dag-json's encoding

  1. dag-json encoding MUST be used

> does arbitrarily shaped in this case mean 1) any type for key and/or value (but all keys have the same type and all values have the same type), or 2) it is a structure of any shape (e.g., keys are fields and each field may have a different type as its value)?

2! Any JSON object.

@cdata
Member

cdata commented Sep 21, 2022

Thank you, DAG-JSON was the clarification I was hoping for!

Once we have canonical (de)serialization in place, it opens the door to storing UCANs in many different representations without thinking too hard about it. We will approach the opportunity with an experimental mindset and discover what works best 🧫🔬
