application/did+dag+cbor not implementable? #551

Closed
msporny opened this issue Jan 17, 2021 · 51 comments
Labels: pr exists (There is an open PR to address this issue)

@msporny
Member

msporny commented Jan 17, 2021

I finally found some time to analyze the canonicalization algorithm in the DagCBOR section. That section currently states:

DagCBOR requires that there exist a single way of encoding any given object, and that encoded forms contain no superfluous data that may be ignored or lost in a round-trip decode/encode.

There are currently no rules for how arrays must/should be sorted, which would lead to non-deterministic output. This non-deterministic output would result in digital signatures that would fail to validate (as some implementation languages that use sets do not impose a deterministic order).

It also deviates in subtle, but possibly important ways from the WebAuthn specification (which specifies size constraints on payload, sorting rules for keys with complex codepoints, etc.):

https://fidoalliance.org/specs/fido-v2.0-ps-20190130/fido-client-to-authenticator-protocol-v2.0-ps-20190130.html#ctap2-canonical-cbor-encoding-form

There is also no clear guidance on how to deal with floats and doubles. If the group wants to ensure that this section passes the Candidate Recommendation phase, we will need a test suite that demonstrates that the canonicalization algorithm defined in the specification works. To put this feat in perspective, the RDF Dataset Normalization test suite (another directed acyclic graph canonicalization algorithm) contains over 62 tests for the latest canonicalization algorithm: https://json-ld.github.io/normalization/tests/index.html

The questions to the implementation community are:

  1. Are you planning to implement the DagCBOR representation format?
  2. Are you committing to provide a test suite that tests the effectiveness of the DagCBOR canonicalization algorithm?
  3. Would you object to DagCBOR being moved into an extension specification and registered in the DID Specification Registries as an extension?

We need to hear from @jonnycrunch wrt. his plans for this section and get his help to get the implementers he noted on record as supporting this representation format.

@msporny
Member Author

msporny commented Jan 17, 2021

On behalf of Digital Bazaar:

Are you planning to implement the DagCBOR representation format?

No.

Are you committing to provide a test suite that tests the effectiveness of the DagCBOR canonicalization algorithm?

No.

Would you object to DagCBOR being moved into an extension specification and registered in the DID Specification Registries as an extension?

No, we would not object.

@jonnycrunch
Contributor

While I appreciate attention to any potential security flaws, these aren't limitations of dagCBOR but limitations of the abstract data model and of a recent PR whose implications haven't been discussed with the group. BTW, this section was supposed to be in the CBOR section, not dagCBOR. I now take issue with your editorial license and believe you have made substantive changes that purposefully put the dagCBOR section at risk.

A CBOR array is ordered (just like an ordered set in Infra), and how it is ordered should be in the specification. I already raised this issue when we started down the direction of using Infra. I also raised it for floats and numbers, and it was determined that BigNum, Doubles and Floats would not be in scope for the spec. So, those are non-issues.

@jonnycrunch
Contributor

Here is another issue that I brought up when we started down the path of using the Infra ordered set, in which I raised potential issues with the precision of numbers. Again this is a limitation of the ADM, not dagCBOR.

While the ordering of an ordered set (or CBOR array) matters for deterministic output and digital signatures, we have to decide if it makes a semantically meaningful difference in the specification worthy of declaring an ordering rule.

So far as I understand the spec, we don't have any naked arrays of values where ordering has relevance and changes the semantics.

example where ordering matters:

coordinates = [ 37.22893685619454, -80.41454210726482 ]  

My take is that ordering of items in an ordered set is up to the DID document producer and has no impact on security.

However, if we decide that ordering semantically matters, then....

Since every occurrence of an ordered set in the DID document is either:

  • an object and has an id
  • an array of URLs

the simple addition would be:

  • if the ordered set contains URLs, order by string value
  • if the ordered set contains objects, order by the value of id or the relative DID URL
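
As a minimal sketch of the rule proposed above, and nothing more than that: the function name `order_set` is hypothetical, and the sketch assumes that items are either URL strings or objects (dicts) carrying an `id`. Python compares strings by code point, which gives the same order as comparing their UTF-8 bytes.

```python
def order_set(items):
    """Order an ordered set: URLs by string value, objects by their 'id'.

    Hypothetical helper for illustration; not taken from any specification.
    """
    def sort_key(item):
        if isinstance(item, str):
            return item
        if isinstance(item, dict) and "id" in item:
            return item["id"]
        # The proposed rule says nothing about items that have no id.
        raise ValueError("no ordering rule for an item without an id")

    return sorted(items, key=sort_key)


urls = ["did:example:123#key-2", "did:example:123#key-1"]
methods = [{"id": "did:example:123#b"}, {"id": "did:example:123#a"}]

ordered_urls = order_set(urls)
ordered_methods = order_set(methods)
```

Note that the sketch has to raise an error for any object that lacks an `id`, because the rule as stated does not cover that case.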

I personally don't think that ordering the set changes the semantics and leads to a security flaw, but welcome the discussion so we can close this.

@msporny Perhaps you can give some examples?

BTW, my understanding is that proof is not in scope for our WG, so WebAuthn doesn't have any relevance here, but again I welcome the discussion so that we are aligned.

@jonnycrunch
Contributor

and here is the conclusion from the group when this was discussed on our WG call and the group agreed that bigNum and Int float would NOT be needed for our spec with multiple thumbs up including your CTO.

@msporny
Member Author

msporny commented Jan 19, 2021

Again this is a limitation of the ADM, not dagCBOR.

No, the ADM doesn't specify a deterministic ordering because it doesn't define a lexical space (by design). The specification currently states (per group consensus):

NOTE: Ordering of values
As a result of the data model being defined using terminology from [INFRA], property values which can contain more than one item, such as maps and sets, are explicitly ordered. For the purposes of this specification, unless otherwise stated, ordering is not important and implementations are not expected to produce or consume deterministically ordered values.

It is the job of a canonicalization algorithm to do deterministic ordering and that ordering must be performed on a specific byte representation (aka a lexical space). It is the job of representations in DID Core to define byte representations (aka lexical spaces). Not all representations will choose the same byte representation; therefore, it is the job of each representation to define how it will achieve canonical ordering if that's required.

application/did+json doesn't do this, but would probably use some variation of JCS.

application/did+ld+json does this via the RDF Dataset Canonicalization algorithm.

application/did+dag+cbor does this via the canonicalization algorithm in the DagCBOR section.

Again, it is not the job of the ADM to do this. It is a representation specification's job to specify a canonicalization algorithm, if any. When it does that, it needs to specify how to get a deterministic ordering.

the ordering of an ordered set (or CBOR array) matters for deterministic output and digital signatures

Yes, it does -- that's what this issue is about. I'm glad we agree on that point.

Not having a deterministic order creates security flaws (such as digital signatures sporadically not verifying, causing DID Documents encoded in DagCBOR that are signed to be rejected by resolvers, resulting in further downstream security concerns). I'm fairly certain most systems would fail closed, but I'm concerned that some systems may not... we need more eyes on this if we are to reach consensus in the group that there isn't a security issue here.

Someone is going to have to write the tests that demonstrate that the DagCBOR canonicalization algorithm actually works.

we have to decide if it makes a semantically meaningful difference in the specification worthy of declaring an ordering rule.

The group has already decided that ordering is of no concern for the ADM. You could try to raise the issue again, but I expect we'll end up landing where we are today -- where canonical ordering is a representation concern.

@msporny
Member Author

msporny commented Jan 19, 2021

here is the conclusion from the group when this was discussed on our WG call and the group agreed that bigNum and Int float

Your DagCBOR PR included the following issue marker, which remains in the specification today:

ISSUE: How to represent Floating-point values that can exceed the range or the precision IEEE 754. See issue #361.

Would you like to submit a PR that removes that issue marker? I'm also happy to remove the text if you're done wrt. float/double canonicalization.

@msporny
Member Author

msporny commented Jan 19, 2021

Since every occurrence of an ordered set in the DID document is either: an object and has an id, an array of URLs

You cannot assume that this is true for all things that can exist in a DID Document (or a DAG) -- there will be DID Spec Registry extensions where nested objects don't have IDs and that breaks the DagCBOR canonicalization algorithm.

@msporny
Member Author

msporny commented Jan 19, 2021

BTW, my understanding is that proof is not in scope for our WG, so WebAuthn doesn't have any relevance here

WebAuthn's relevance is that it defines a canonicalization algorithm and it does a few things differently than yours does. I was including it so you could take a look at it and see if you missed anything extra. For example, their c14n algorithm states: "The representations of any floating-point values are not changed." ... but the DagCBOR algorithm is silent on what to do with non-double floating point values? Do you throw an error? Convert per some algorithm? Things like that matter when creating a deterministic canonical form.

@msporny Perhaps you can give some examples?

Take the verification method arrays and two implementations, for example:

Implementation A lists the following verification methods via a verification relationship: [C, A, B]. Implementation B expresses: [B, A, C]. Currently, the DagCBOR algorithm doesn't order either array, which means that it generates two different forms, making the algorithm non-deterministic. Since the algorithm is non-deterministic, digital signatures over application/did+dag+cbor are going to sporadically fail based on the insertion order for arrays.
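
To make the [C, A, B] versus [B, A, C] example concrete, here is a small sketch. It hand-encodes a CBOR array of short text strings (array and string lengths under 24 only) and hashes the result; the identifiers and the function names are made up for illustration, and SHA-256 stands in for whatever a real signature suite would hash.

```python
import hashlib


def encode_text(s):
    """CBOR text string (major type 3); this sketch handles short strings only."""
    data = s.encode("utf-8")
    if len(data) >= 24:
        raise ValueError("sketch only handles strings shorter than 24 bytes")
    return bytes([0x60 | len(data)]) + data


def encode_array(items):
    """CBOR array (major type 4), items written in the order given."""
    if len(items) >= 24:
        raise ValueError("sketch only handles arrays shorter than 24 items")
    return bytes([0x80 | len(items)]) + b"".join(encode_text(i) for i in items)


# The same three verification methods, in two different insertion orders.
impl_a = ["did:example:123#C", "did:example:123#A", "did:example:123#B"]
impl_b = ["did:example:123#B", "did:example:123#A", "did:example:123#C"]

digest_a = hashlib.sha256(encode_array(impl_a)).hexdigest()
digest_b = hashlib.sha256(encode_array(impl_b)).hexdigest()

# The byte strings differ, so any signature over one will not verify
# against the other.
print(digest_a == digest_b)
```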

@mikeal

mikeal commented Jan 19, 2021

There are currently no rules for how arrays must/should be sorted, which would lead to non-deterministic output. This non-deterministic output would result in digital signatures that would fail to validate (as some implementation languages that use sets do not impose a deterministic order).

What?

There is no sort rule for arrays because the sorting of the values is part of the value and changing the sort changes the value. Arrays with different sorts are different values. If you’re using an array to represent a Set or some other data-structure which needs to represent itself as being unsorted then you need to apply a consistent sort to the array before you serialize it.

It is the job of a canonicalization algorithm to do deterministic ordering and that ordering must be performed on a specific byte representation (aka a lexical space).

Just pick a sort (byte comparison for example) and put in the spec that these unordered structures need to sort to it before serializing to dag-cbor. Now you have a deterministic byte representation.
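
A rough sketch of that suggestion, assuming the unordered structure is a set of short strings and that "byte comparison" means comparing the encoded items bytewise. The helper names are hypothetical and the encoder only covers lengths under 24.

```python
def encode_text(s):
    """CBOR text string (major type 3); this sketch handles short strings only."""
    data = s.encode("utf-8")
    if len(data) >= 24:
        raise ValueError("sketch only handles strings shorter than 24 bytes")
    return bytes([0x60 | len(data)]) + data


def encode_set(items):
    """Encode an unordered collection as a CBOR array with a fixed item order.

    Items are encoded first and then sorted by plain byte comparison, so the
    output does not depend on the iteration order of the input.
    """
    encoded = sorted(encode_text(i) for i in items)
    if len(encoded) >= 24:
        raise ValueError("sketch only handles arrays shorter than 24 items")
    return bytes([0x80 | len(encoded)]) + b"".join(encoded)


from_set = encode_set({"C", "A", "B"})
from_list = encode_set(["B", "A", "C"])
```

Both calls yield the same bytes, which is the property a signature needs.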

Is it your expectation that the format would include an unordered set that matches the sorting conditions you happen to have here?

@mikeal

mikeal commented Jan 19, 2021

"The representations of any floating-point values are not changed." ... but the DagCBOR algorithm is silent on what to do with non-double floating point values? Do you throw an error? Convert per some algorithm? Things like that matter when creating a deterministic canonical form.

We’ve spent a considerable amount of time discussing the pitfalls of indeterministic float representations. The CBOR spec points to IEEE 754 with a few notes. We restrict that considerably more since IEEE 754 has many representations for special values.

From the spec:

Floating point values are always encoded in 64-bit, double-precision form, regardless of whether they can be represented as half (16) or single (32) precision.
IEEE 754 special values NaN, Infinity and -Infinity should not be accepted as they do not appear in the IPLD Data Model. Therefore, tokens 0xf97c00 (Infinity), 0xf97e00 (NaN) and 0xf9fc00 (-Infinity) and their 32-bit and 64-bit variants, should not appear, or be accepted in DAG-CBOR binary form.
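
The two float rules quoted above can be sketched as follows. This is an illustration under stated assumptions, not the IPLD reference code: `encode_float` and `decode_float` are hypothetical names, and the decoder only looks at a single standalone float item.

```python
import math
import struct


def encode_float(x):
    """Always emit the 64-bit form: head byte 0xfb plus a big-endian double."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("NaN, Infinity and -Infinity are not accepted")
    return b"\xfb" + struct.pack(">d", x)


def decode_float(data):
    """Accept only the 64-bit form; reject half (0xf9) and single (0xfa) floats."""
    if len(data) != 9 or data[0] != 0xFB:
        raise ValueError("only 64-bit floats (head byte 0xfb) are accepted")
    (x,) = struct.unpack(">d", data[1:])
    if not math.isfinite(x):
        raise ValueError("NaN, Infinity and -Infinity are not accepted")
    return x


encoded = encode_float(1.5)
decoded = decode_float(encoded)
```

Here 1.5 could be represented exactly as a half-precision float, but the sketch still writes nine bytes, so there is only one accepted encoding for the value.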

@msporny
Member Author

msporny commented Jan 20, 2021

If you’re using an array to represent a Set or some other data-structure which needs to represent itself as being unsorted then you need to apply a consistent sort to the array before you serialize it.

Yes, that's exactly the issue. The DagCBOR canonicalization algorithm defined in the DID Core specification does not apply a consistent sort before serialization for anything that is a set (most value ranges in the specification).

From the spec

Hmm, interesting -- first time I'm seeing that document. @jonnycrunch -- why didn't you link to the DagCBOR specification in your PR? Is there a reason you don't reference it? It seems like there have been sections of that spec that have been cut/pasted into the DID Core specification:

https://w3c.github.io/did-core/#dagcbor

There are also rules that are missing... like the one @mikeal mentions above for floats, and restrictions on back-to-back concatenated objects. Has anyone other than @jonnycrunch reviewed the delta between the IPLD DagCBOR algorithm and the one defined in the specification?

https://github.com/ipld/specs/blob/master/block-layer/codecs/dag-cbor.md#strictness

I will also note that it doesn't seem like all the implementations line up on strictness in the algorithms -- is there a proof of correctness or fairly exhaustive conformance test suite that DagCBOR implementations utilize?

https://github.com/ipld/specs/blob/master/block-layer/codecs/dag-cbor.md#implementations

Finally, and I doubt there is an issue here but I have to ask since this is a W3C Working Group and we have a strict Intellectual Property Release process -- who contributed to DagCBOR? Is there a clear history of contributions there? Is there an IPR process around IPFS code repositories? Who wrote those algorithms and would they be willing to sign a W3C Patent and Royalty Free IPR release? I thought @jonnycrunch was contributing his own content, but it looks like IPFS documents have been copied over from the IPFS community and we have to be very careful about IPR and copyright. @mikeal -- what's the IPR policy on the document you pointed to?

@rvagg

rvagg commented Jan 20, 2021

I will also note that it doesn't seem like all the implementations line up on strictness in the algorithms

We're getting there. We released the newest JavaScript version that holds to all the strictness requirements (inasmuch as it can given JavaScript's difficulty with float/int differentiation). The main outstanding item in the Go implementation(s) is that it will still deal in NaN, Infinity and -Infinity, but that addition to the strictness criteria was only a recent one, so it'll get there soon. Rust implementations are slowly converging too. We also have some strong recommendations about floats in particular, due to the many difficulties involved in determinism and why it might be best to avoid them entirely or find an alternative, but fixed, representation form (pairs of ints for example) if you really need floating point numbers in your serialized data: https://github.com/ipld/specs/blob/master/data-model-layer/data-model.md#float-kind
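
One possible reading of "pairs of ints" is an exact numerator/denominator pair. That choice is an assumption made for this sketch (a scaled integer or a mantissa/exponent pair would be other options), and the function names are hypothetical. It uses Python's built-in `float.as_integer_ratio`, which returns the exact rational value of a finite float.

```python
import math


def float_to_pair(x):
    """Represent a finite float exactly as an (integer, integer) pair."""
    if not math.isfinite(x):
        raise ValueError("only finite values have an integer-ratio form")
    return x.as_integer_ratio()


def pair_to_float(numerator, denominator):
    """Recover the float; the ratio is exact, so the division loses nothing."""
    return numerator / denominator


coordinates = [37.22893685619454, -80.41454210726482]
pairs = [float_to_pair(c) for c in coordinates]
restored = [pair_to_float(n, d) for n, d in pairs]
```

The pairs contain only integers, so they fall under the integer encoding rules rather than the float rules when serialized.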

For clarity on the point in OP "It also deviates in subtle, but possibly important ways from the WebAuthn specification" (which I think has mostly been resolved by including https://github.com/ipld/specs/blob/master/block-layer/codecs/dag-cbor.md#strictness in the discussion) https://fidoalliance.org/specs/fido-v2.0-ps-20190130/fido-client-to-authenticator-protocol-v2.0-ps-20190130.html#ctap2-canonical-cbor-encoding-form - these rules are a subset of the DAG-CBOR rules, we add a few more on top of this to inch a bit closer to ideal determinism (as well as recommending against floats entirely if possible), so we're well aligned on this since they both mostly emerge out of the original RFC 7049 suggestions.

@mikeal

mikeal commented Jan 20, 2021

what's the IPR policy on the document you pointed to

We use a “Permissive License Stack” which is just Apache-2+MIT for licensing and a patent pledge from us you can read about in that post. If there’s any additional concerns you have I can get you in touch with our counsel.

@mikeal

mikeal commented Jan 20, 2021

@jonnycrunch -- why didn't you link to the DagCBOR specification in your PR? Is there a reason you don't reference it? It seems like there have been sections of that spec that have been cut/pasted into the DID Core specification:

Keep in mind that we’ve been making steady progress on this spec for a while, as you can see in the commit history, so some of what I'm mentioning now wasn't there the last time @jonnycrunch had a look.

That is not to say the format is not stable. A lot of what has been going into the spec has been strictness we’ve already done in implementations.

It’s also worth noting that, in our primary implementations, we’ve moved from a model in which we’re using a standard CBOR parser and then applying strictness to a new native DagCBOR parser that is designed to parse the DagCBOR strict subset of CBOR. This process has uncovered a few final spec adjustments and it also shows how encoding and parsing our strict subset is actually substantially easier than encoding/parsing all of CBOR. Our libraries for DagCBOR are substantially smaller now that they are designed from the ground up for this strict subset.

@jonnycrunch
Contributor

The deterministic ordering rules that I proposed are supposed to be under the more general CBOR section and not dagCBOR.

see my original commit e39d48e as part of PR #420.

I believe that given we are working on a security document, determinism needs addressing in general and not just in dagCBOR.

That said, ordering of a set IS important and will change the signature, but I argue that is the feature that we want and it isn't a bug or a flaw and in fact improves security.

@msporny
Member Author

msporny commented Jan 24, 2021

@rvagg wrote:

We're getting there.

To provide some background to @rvagg and @mikeal on where the W3C DID Working Group is with respect to its charter and lifetime: the specification entered feature-freeze in July 2020. We are getting ready to enter the W3C Candidate Recommendation stage in the next week or so. The W3C Candidate Recommendation stage requires us to be "code complete" -- that is, we are asserting that no technical change that affects implementations will be made to the specification once we enter the CR phase. If we do find a technical change has to be made that affects an implementation, we have to re-do the CR phase. Each iteration takes around 2-3 months on average. We have ~7 months left in the WG before our charter expires.

With the above in play, @jonnycrunch has also objected to putting language into the specification that would allow us to remove application/did+dag+cbor if the group is unable to implement it. That means, if we don't get did+dag+cbor perfect the first time, we're going to be forced to go back into CR.

"We're getting there" is definitely not what we want to hear at this point.

What we want to hear is: "We have at least two finalized implementations of the DagCBOR canonicalization algorithm and a test suite that demonstrates the canonicalization algorithm works -- there are no changes we expect to make to implementations at this point. We are W3C Members and have signed the IPR commitment and are handing copyright and all potential patents over to W3C for the purposes of this specification. We are also volunteering to write all the application/did+dag+cbor tests for the DID WG. We are committing to provide two independent implementations of application/did+dag+cbor."

Instead, what I'm hearing is:

  • The DagCBOR canonicalization algorithm isn't finalized.
  • There are no existing test suites for the DagCBOR canonicalization algorithm demonstrating interoperability between implementations.
  • The copyright and patent commitment to W3C for the DagCBOR canonicalization algorithm is not in place (although I expect we can address this item with about a month's effort -- it takes time to get the lawyers together, they're very busy).
  • No one has volunteered to write the DagCBOR canonicalization tests in the W3C DID WG.
  • No one has volunteered to write all the application/did+dag+cbor tests in the W3C DID WG.
  • Only one person has volunteered to write an implementation for application/did+dag+cbor (we need at least two independent implementations)

The list above is what is expected of every feature in the specification -- finalized algorithms, complete normative language, clear intellectual property rights traceability, demonstration of implementability, demonstration of interoperability, commitment to do the work by multiple people.

Hopefully that gives the folks that want to see this part of the specification happen the list of things that are expected of them by the W3C Process and the DID Working Group.

@msporny
Member Author

msporny commented Jan 24, 2021

@mikeal wrote:

We use a “Permissive License Stack” which is just Apache-2+MIT for licensing and a patent pledge from us you can read about in that post. If there’s any additional concerns you have I can get you in touch with our counsel.

I'm a fan of the goals behind the PLS, but unfortunately, due to the non-standard nature of that license stack, lawyers will almost certainly have to be involved. The fact is that PL could have patents and owns the copyright on DagCBOR, and it's not a W3C Member or a member of the DID WG, so there is no easy path that I know of to reconcile this other than getting the lawyers to talk to each other, and that could take months.

/cc @iherman @brentzundel @burnburn for guidance here -- DagCBOR is licensed under the Permissive License Stack, which is a Protocol Labs-specific thing. There may or may not be patents related to it. Text from DagCBOR has been copied into the DID Core specification (I thought it was @jonnycrunch's text, but it turns out it's not entirely).

@msporny
Member Author

msporny commented Jan 24, 2021

@jonnycrunch wrote:

That said, ordering of a set IS important and will change the signature, but I argue that is the feature that we want and it isn't a bug or a flaw and in fact improves security.

It is a flaw.

The ordered sets that we use for many map entry values in the specification do not allow duplication... implementers are expected to use Set primitives in implementation languages instead of Arrays. Not all language Set primitives guarantee the preservation of insertion order and/or iteration over the Map. That, coupled with the canonicalization language for DagCBOR is going to result in non-determinism that will surprise implementers and harm interoperability.

@msporny
Member Author

msporny commented Jan 24, 2021

@jonnycrunch wrote:

The deterministic ordering rules that I proposed are supposed to be under the more general CBOR section and not dagCBOR.

To be clear, that would mean that these rules would apply to every CBOR representation of DID Documents:

  • Property names MUST be represented as text string (major type 3) and contain only UTF-8 strings.
  • Undefined Values of Required Properties as defined in the Data Model that are absent from the CBOR representation SHOULD be labeled with Primitive type (major type 7) with value 23 (Undefined value).
  • The keys in every map must be sorted lowest value to highest. Sorting is performed on the bytes of the representation of the keys. If two keys have different lengths, the shorter one sorts earlier. If two keys have the same length, the one with the lower value in (byte-wise) lexical order sorts earlier.
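
The map key rule in the last bullet above can be sketched like this. The sketch assumes all keys are text strings and compares their UTF-8 bytes rather than their full CBOR encodings; for text keys the two give the same order, because a longer payload never has a shorter encoding. The function name and the sample keys are chosen for illustration.

```python
def canonical_key_order(keys):
    """Sort map keys: shorter keys first, then bytewise lexical order."""
    def sort_key(key):
        raw = key.encode("utf-8")
        return (len(raw), raw)

    return sorted(keys, key=sort_key)


document_keys = ["service", "id", "authentication", "@context"]
ordered_keys = canonical_key_order(document_keys)
```

With this rule "id" sorts before "@context" because it is shorter, even though "@" has a lower byte value than "i".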

Each of those rules would prevent DID Document CBOR formats that wanted to:

  • Use integers as keys.
  • Canonicalize on something other than sorting map keys by string byte order.

That would certainly make something like a compact DID Document format in CBOR (using small integers to represent property names and some value classes) and CBOR-LD illegal.

I was puzzled by where it was in the specification and asked you if that was the intent during a call. You said that it was really meant to be more about DagCBOR canonicalization, so I moved it. If you had not said that, I was going to object to the text for the reasons above -- it closes the door on other types of canonicalization innovation.

@msporny msporny added the "pr exists" label Jan 24, 2021
@iherman
Member

iherman commented Jan 25, 2021

/cc @iherman @brentzundel @burnburn for guidance here -- DagCBOR is licensed under the Permissive License Stack, which is a Protocol Labs-specific thing. There may or may not be patents related to it. Text from DagCBOR has been copied into the DID Core specification (I thought it was @jonnycrunch's text, but it turns out it's not entirely).

Only reacting on the administrative steps: if the text which was put into the spec was indeed not the text of @jonnycrunch (with the attached IPR licensing of his company) but a text of @mikeal, then, I presume, @mikeal (or whoever holds the rights) will have to sign a licensing agreement, similarly to what is done when a substantive PR is submitted to the specification by a non-Working Group participant. Otherwise, the relevant text must be removed from the specification.

Cc @wseltzer (she can tell us how to do that administratively)

@jbenet

jbenet commented Jan 25, 2021

hey @msporny 👋 😄

I'm a fan of the goals behind the PLS, but unfortunately, due to the non-standard nature of that license stack, lawyers will almost certainly have to be involved. The fact is that PL could have patents and owns the copyright on DagCBOR, and it's not a W3C Member or a member of the DID WG, so there is no easy path that I know of to reconcile this other than getting the lawyers to talk to each other, and that could take months.

The licenses are not "non-standard". It's simply Apache2 + MIT Dual License. This is common. Use whichever license you prefer (guessing Apache2). Any potential patents are covered by the Apache2 patent clause. (we do not have any patents and do not plan on getting any -- if you need us to do so and assign it, we can look into it).

We are also happy to provide other licenses as you request them, or sign IPR licensing you and @iherman request. Point us to whatever you need us to do.

The mentioned Default Open Pledge (the non-standard part of the PLS) is not relevant here -- that's just a pledge that new/future software we write is by default licensed openly. It's a commitment to license future things. Here, the interest is in existing things that are already licensed, and can be licensed to W3C as you need them to be.

@jonnycrunch
Contributor

Again, to be clear: the deterministic canonical ordering language I added to CBOR comes from IETF RFC 7049, which gives spec authors some guidance, but it is up to specs and protocols to define what they mean by a canonical format. The fact that the IPLD spec uses language similar to RFC 7049 should come as no surprise, as that is how we get on the same page regarding a canonical deterministic representation.

I welcome any push back making sure non-working group contribution is appropriately attributed. However, in my original contribution I was careful to only reference "dagCBOR" as a type of CBOR that has additional constraints (mainly tag 42 in the IANA registry).

@msporny
Member Author

msporny commented Jan 25, 2021

@jonnycrunch wrote:

Again, to be clear: the deterministic canonical ordering language I added to CBOR comes from IETF RFC 7049, which gives spec authors some guidance, but it is up to specs and protocols to define what they mean by a canonical format.

Here is the actual text you added:

https://github.com/w3c/did-core/pull/282/files#diff-0eb547304658805aad788d320f10bf1f292797b5e6d745a3bf617584da017051R2442-R2445

Then you modified it to this text (which added things):

https://github.com/w3c/did-core/pull/420/files#diff-0eb547304658805aad788d320f10bf1f292797b5e6d745a3bf617584da017051R2845-R2850

Copy-pasted your final addition for convenience:

  • Property names MUST be represented as text string (major type 3) and contain only UTF-8 strings. <-- this is NOT from RFC7049
  • Undefined Values of Required Properties as defined in the Data Model that are absent from the CBOR representation SHOULD be labeled with Primitive type (major type 7) with value 23 (Undefined value). <-- this is NOT from RFC7049
  • Property names in each CBOR map MUST be unique. <-- this is NOT from RFC7049
  • Integer encoding MUST be as short as possible. <-- this is from RFC7049, but leaves out really important details from RFC7049
  • The expression of lengths in CBOR major types 2 through 5 MUST be as short as possible. <-- this is from RFC7049, but leaves out really important details from RFC7049
  • The keys in every map must be sorted lowest value to highest. Sorting is performed on the bytes of the representation of the keys. If two keys have different lengths, the shorter one sorts earlier. If two keys have the same length, the one with the lower value in (byte-wise) lexical order sorts earlier. <-- this is from RFC7049

In short, you copied some of it, modified some of it, and left out some of it -- changing how implementers would interpret the rules into something that wasn't RFC7049, and something that isn't DagCBOR. It's a new canonicalization scheme -- and again, as this issue is highlighting, the scheme is incomplete, does not work, and needs to be fixed in order to be implementable and testable during the W3C Candidate Recommendation phase.
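
For reference, the two "as short as possible" rules in the list quoted above (integer encoding, and lengths for major types 2 through 5) can be sketched as below. This is a minimal illustration of shortest-form CBOR heads with hypothetical helper names; it does not cover negative integers, tags, or major type 7.

```python
def encode_head(major_type, value):
    """Encode a CBOR head using the shortest form that can hold `value`."""
    if not 0 <= major_type <= 5:
        raise ValueError("sketch covers major types 0 through 5 only")
    if value < 0:
        raise ValueError("value must be non-negative")
    prefix = major_type << 5
    if value < 24:
        # Values 0..23 fit directly in the initial byte.
        return bytes([prefix | value])
    # Otherwise pick the smallest of the 1, 2, 4 and 8 byte arguments.
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([prefix | additional_info]) + value.to_bytes(size, "big")
    raise ValueError("value does not fit in 64 bits")


def encode_uint(n):
    """Unsigned integer (major type 0) in its shortest encoding."""
    return encode_head(0, n)


def encode_text(s):
    """Text string (major type 3) with its length in the shortest encoding."""
    data = s.encode("utf-8")
    return encode_head(3, len(data)) + data
```

An encoder that wrote 10 as `0x18 0x0a` would still produce valid CBOR, which is why a canonical form has to rule that out explicitly.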

    However, in my original contribution I was careful to only reference "dagCBOR" as a type of CBOR that has additional constraints (mainly tag 42 in the IANA registry).

    For a global standard, you have to either point to a stable specification that has undergone proper Working Group review (as determined by the DID WG and, ultimately, W3C Membership), or specify those additional constraints completely so that the algorithm is implementable by developers that were follow the instructions in the DagCBOR section such that when multiple implementers read the language, and implement what they're reading, they all end up with interoperable implementations that are objectively testable.

    I was under the impression that you were going to define the DagCBOR canonicalization algorithm in the specification in some way in order to make it normative. You can't just mention it in passing and point to a github repository because that repository could change at any moment and global standards are expected to be stable (for decades) once they're published.

    I'm struggling to understand how exactly the DagCBOR section is going to be implementable based on the specification text that exists in that section today?

    Potential paths forward that I see:

    1. Define the DagCBOR canonicalization algorithm in a way that is complete and normative in the DID Core specification.
    2. Define the DagCBOR canonicalization algorithm in a separate specification published by a Working Group that is recognized by IETF or W3C and has done the work to demonstrate its correctness and implementability, and then reference that specification from the DID Core specification.

    I assumed you were going for 1 above... since it's too late to do 2 at this point.

    Can you please explain how implementers are supposed to implement what's in the DagCBOR part of specification today, @jonnycrunch?

    @jonnycrunch
    Contributor

    jonnycrunch commented Jan 25, 2021

    Again, to be clear: the deterministic canonical ordering language I added to CBOR comes from IETF RFC 7049, which gives spec authors some guidance, but it is up to protocols to define what they mean by a canonical format.

    Yep, glad you finally read and understand what I wrote! This is NOT necessarily dagCBOR; it is a set of deterministic canonical constraints (not necessarily an algorithm) for our specification that belongs in the CBOR section, and I welcome improvements to make it complete and compatible with other CBOR sub-representations like CBOR-LD and dagCBOR.

    @msporny msporny changed the title Security flaws in DagCBOR? application/did+dag+cbor not implementable? Jan 26, 2021
    @msporny
    Member Author

    msporny commented Jan 26, 2021

    @jonnycrunch wrote:

    Yep, glad you finally read and understand what I wrote!

    Let me try and summarize my understanding of your position wrt. application/did+cbor and application/did+dag+cbor based on this thread of discussion:

    1. You would like a set of deterministic canonical constraints in the application/did+cbor section that is different from RFC7049 and different from DagCBOR?
    2. For the application/did+cbor section, the new constraints are provided solely by you, with the rest being partially copy-pasted from RFC7049?
    3. For the application/did+cbor section, you do not wish to specify the normative algorithm of producing or consuming such a deterministic canonical serialization?
    4. For the application/did+dag+cbor section, you do not wish to use the DagCBOR canonicalization algorithm specified in the DagCBOR specification nor do you wish to specify any normative algorithm of producing or consuming such a deterministic canonical serialization?

    Is the above a correct understanding of your current position?

    @jonnycrunch
    Contributor

    For TranSendX:

    Are you planning to implement the DagCBOR representation format?

    Yes

    Are you committing to provide a test suite that tests the effectiveness of the DagCBOR canonicalization algorithm?

    Yes, and it will be beautiful, nobody will have seen a more beautiful suite of tests!

    Would you object to DagCBOR being moved into an extension specification and registered in the DID Specification Registries as an extension?

    Yes

    First and foremost, we don't have a clear path to implementing the extension specification and/or governance of the DID-spec registries. Second, I am committed to making dagCBOR a first class citizen in the specification and towards that end making a deterministic canonical CBOR representation work and therefore our spec more secure.

    @csuwildcat
    Contributor

    With IPFS being such a critical component for so much of the decentralized app world, I would love to see harmony between DID Core and IPFS structures/protocols. I think we're missing an opportunity if we don't make that happen, if at all feasible. I also don't understand the licensing issues brought up - if it's dual MIT/Apache 2, there should not be a legal issue, and let me tell you, we have some rather **al lawyers on our end.

    @OR13
    Contributor

    OR13 commented Jan 28, 2021

    First and foremost, we don't have a clear path to implementing the extension specification and/or governance of the DID-spec registries. Second, I am committed to making dagCBOR a first class citizen in the specification and towards that end making a deterministic canonical CBOR representation work and therefore our spec more secure.

    having did+dag+cbor be "the example" of this would really help...

    dag cbor will never be "the canonical" representation for the ADM... because the ADM was designed specifically to allow representation in did+cbor, did+dag+cbor and did+ld+cbor, did+json, did+ld+json, did+indy+json.... they are all the same level of citizen... and they each might have a canonical representations....

    I feel like 1 person saying they are gonna solo support for a representation is a really bad sign.

    There should be a group of people who can be held accountable, and who have the time and skill needed to provide an excellent experience.

    to be clear, I think did+dag+cbor is implementable, and I really, really wish the effort being put into it was going into the registry process for all did method types... instead of just one.

    This thread is an example of the can of worms we opened when we created unbounded support for representations...

    We are not fixing this issue by scrambling to "put everything in did core"... in fact, that's very unhelpful, and is making things worse.

    instead we should be shipping support for some simple representations in the first version of did core, and establishing a clear registration process with good examples like did+dag+cbor for more experimental formats.

    The most helpful thing that could be done with did+dag+cbor would be to use it as the shiny example of how to register things in a way that isn't blocked by making changes to a spec that is going to be frozen in time...

    @jonnycrunch
    Contributor

    I feel like 1 person saying they are gonna solo support for a representation is a really bad sign.

    There should be a group of people who can be held accountable, and who have the time and skill needed to provide an excellent experience.

    @jbenet I'd like to think that I am not alone in wanting to see DID documents represented as dagCBOR.

    having did+dag+cbor be "the example" of this would really help...

    @OR13 Absent any additional help, I'm willing to work with you to make this work in the did-spec-registries

    @msporny msporny added the needs special call Needs a special topic call to make progress label Jan 28, 2021
    @ChristopherA
    Contributor

    As an FYI, Blockchain Commons is making more and more use of CBOR for cryptographic data formats (see https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-005-ur.md and other UR research in that repo), and maybe we will eventually include DIDs and VCs in that CBOR research. However, we are uncomfortable with DagCBOR; it feels like an encoding standard inside an encoding standard. We are much more interested in CBOR-LD if that eventually emerges.

    Unfortunately we can't make a commitment to support CBOR for DIDs at this time — we could participate if there are others or with sufficient research funding, but if we do it at all we'll likely do it in a non-conformant fashion after DID 1.0, as we don't need the same kind of broad interoperability that the LESS Identity folk need.

    -- Christopher Allen, Principal Architect & Executive Director, Blockchain Commons

    @jonnycrunch
    Contributor

    @ChristopherA thanks for sharing, interesting work! Do you see the value for the deterministically encoded CBOR in your work and in the DID spec in general ?

    @ChristopherA
    Contributor

    @jonnycrunch, Our requirements are mostly driven by the need to negotiate security between devices through use of AirGap QRs (and high latency low bandwidth TorGap).

    @jonnycrunch
    Contributor

    jonnycrunch commented Jan 28, 2021

    I'd like to point out that there is an update to the CBOR spec namely #rfc8949
    and perhaps the compromise is to normatively point to the Core Deterministic Encoding Requirements, which seems to satisfy everything that I was attempting to cover.

    Realize of course that my original contribution to this section is now >3 months and a normative encoding requirement now exists.
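To make the difference between the two RFCs concrete: RFC 7049 suggests sorting map keys by encoded length first, while the Core Deterministic Encoding Requirements in RFC 8949 sort the encoded keys in plain byte-wise lexicographic order. The sketch below compares the two orderings on the key set used as an example in RFC 8949. The dictionary `ENCODED_KEYS` and the two function names are hypothetical, and the keys are given pre-encoded as hex so no CBOR library is assumed.

```python
# Deterministic CBOR encodings of a few sample map keys, written as hex.
ENCODED_KEYS = {
    "10": bytes.fromhex("0a"),
    "100": bytes.fromhex("1864"),
    "-1": bytes.fromhex("20"),
    '"z"': bytes.fromhex("617a"),
    '"aa"': bytes.fromhex("626161"),
    "[100]": bytes.fromhex("811864"),
    "[-1]": bytes.fromhex("8120"),
    "false": bytes.fromhex("f4"),
}


def length_first_order(labels):
    """RFC 7049 style: shorter encoded key first, then byte-wise order."""
    return sorted(labels, key=lambda k: (len(ENCODED_KEYS[k]), ENCODED_KEYS[k]))


def bytewise_order(labels):
    """RFC 8949 core deterministic style: byte-wise lexicographic order only."""
    return sorted(labels, key=lambda k: ENCODED_KEYS[k])


if __name__ == "__main__":
    print("length-first:", length_first_order(ENCODED_KEYS))
    print("bytewise:    ", bytewise_order(ENCODED_KEYS))
```

The two orderings agree when all keys are short text strings, but they diverge once keys of different types or sizes are mixed, so a specification needs to state which rule it means.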

    @mikeal

    mikeal commented Jan 28, 2021

    I'd like to think that I am not alone in wanting to see DID documents represented as dagCBOR.

    we would very much like to see this happen :)

    many things are not useful to us in decentralized systems because they aren’t fully represented by content addressed data structures. these become significantly more useful once this representation exists, there’s a big limit on what we can do with DID’s beyond treating them like strings if we don’t have it.

    it’s not that there won’t be any integrations, but it’s always going to be “an integration” rather than something builtin if we don’t have a sufficient representation we can put in the data structures.

    @dlongley
    Contributor

    dlongley commented Jan 28, 2021

    I'm also in favor of seeing DID Documents represented in dagCBOR/bringing DIDs to the IPFS ecosystem. However, we need to be careful that we don't fail to get a DID core spec (!) finished because the timelines aren't compatible. I think we should decouple these things to allow DID Docs in dagCBOR to reach the maturity required without jeopardizing both efforts.

    @jbenet

    jbenet commented Jan 28, 2021

    @OR13

    There should be a group of people who can be held accountable, and who have the time and skill needed to provide an excellent experience.

    @jonnycrunch

    @jbenet I'd like to think that I am not alone in wanting to see DID documents represented as dagCBOR.

    • (preamble: please forgive my context gaps, i'm coming in and out of this thread while juggling many other things)
    • Yeah, we also want this, 💯 , and many applications we work with mixing DIDs, IPLD, and IPFS need this.
    • What do you need specifically from another group? Do you need support maintaining the one impl, or do you need another?
    • PL can commit to fund the work of people w/ the skills in other teams, who can take on this maintenance. Are there other groups who are interested, but limited by funding? (we can do a grant or contract to satisfy the DID WG's expectations)
    • PL may be able to commit the time of people w/ the skills in our own team, but we need to do some scoping first to understand the commitment better. (give us a couple days to discuss)
    • I think other co-s need this (like Ceramic, 3box, Textile, Fleek, MSFT ION) and may also be able to commit some time, but not sure.
    • @michaelsena are you guys interested in this? able to take this on? i'm reminded of the dag-jose work, but you might be too busy this Qtr

    because the timelines aren't compatible.

    @dlongley can you point us to the timeline expectations? (we can move fast to find/fund a group, or help directly if it happens to line up w/ our constraints)

    @mikeal

    mikeal commented Jan 28, 2021

    BTW, I should mention that we’re working on W3C membership right now.

    So we should be able to clear up any IPR concerns you have w/ DagCBOR.

    @mikeal

    mikeal commented Jan 28, 2021

    Speaking of Textile, I’m sure @carsonfarmer would like to see this happen as well :)

    @jbenet

    jbenet commented Jan 28, 2021

    @OR13 @jonnycrunch

    having did+dag+cbor be "the example" of this would really help...

    @OR13 Absent any additional help, I'm willing to work with you to make this work in the did-spec-registries

    • i also think this can be a good solution IFF it works.
    • (everyone here knows this but obligatory warning) beware that tons of specs start with the intention to take in upgrades/evolve later, and most of those fall short of intention. protocol stacks tend to ossify painfully fast 🙄

    @carsonfarmer

    carsonfarmer commented Jan 28, 2021

    Thanks for pulling me in here @mikeal, yes @textileio and @carsonfarmer are indeed interested in seeing this happen as well. In particular, we are interested in working with @oed and @michaelsena re: our joint dag-jose work/grant. This proposal, coupled with some of the JOSE work our teams have been collaborating on, opens up the door to MUCH broader use of these standards in the ETH, IPFS, and other DWeb communities. This is 100% a win win situation if we can make this happen.

    @OR13
    Contributor

    OR13 commented Jan 29, 2021

    I propose we do the following:

    1. remove did+dag+cbor from did core.
    2. add did+dag+cbor to did spec registries and point it to a "representation spec"
    3. create the did+dag+cbor "representation spec" in some neutral territory and assign code owners (I propose DIF).

    We can then continue to define how did+dag+cbor will function without holding up the did core wg, or trying to rush normative references into the cbor section of did core which has seen essentially 0 contribution aside from @jonnycrunch .

    I know there is a developer community behind IPFS, I am part of it... developers don't care about standards, I get that... but many of us have customers that do care about standards compliance.... who know the difference between an open standard, and an MIT / Apache 2.0 codebase on github.

    We need to make sure we don't alienate the folks that want to see IPFS become an Open Standard, such work will take longer than DID Core has... which means we need to be respectful of that and plan accordingly.

    @iherman
    Member

    iherman commented Jan 29, 2021

    The issue was discussed in a meeting on 2021-01-28

    List of resolutions:

    • Resolution No. 1: The DID Working Group will not define a canonical form for the Abstract Data Model.
    View the transcript

    2. CBOR sections

    See github issue #585, #551.

    See github pull request #552.

    Manu Sporny: Let's talk about the CBOR section and the DagCBOR section. Jonathan, can you give an overview on those sections now?

    Jonathan Holt: On our call on Tuesday, we're working on a security document. We need to have deterministic encoding of the DID document, especially if the method will be signing and having a deterministic ordering is important.
    … My DID method relies on that and is on dagCBOR. The deterministic encoding of dagCBOR is really important.
    … I mentioned, I'm by no means a CBOR expert, I do have a lot of reliance on deterministically encoded CBOR in our method. The challenges of writing the requirement are the RFC 7049 gives some guidance but it's up to protocols to clearly state what they mean to be canonical or deterministic. I gave some guidelines. At first I was over zealous to add every possible combination.

    Jonathan Holt: #586

    Jonathan Holt: Including 64-bit integers and floats. That language is now in the dagCBOR section but should be in the CBOR section, so here's a new PR to fix it.
    … To get us onto a fresh discussion on deterministically encoded CBOR. So we can dice out, what does that mean, is it canonical, is it deterministic, we can decide how to move forward.

    Manu Sporny: Thanks for that overview, Jonathan. There are numerous concerns around deterministic canonical form for CBOR. Just so everyone is on the same page for deterministic canonical form. Typically when you digitally sign things you want to have them in a deterministic canonical form.
    … There are other ways to do signing where you, for example, base64 encode anything and just sign that. The issue with that, however, is that if there are any blank space/white space/formatting changes that changes the signature. There are other technologies that use canonicalization with JSON-LD/JCS, and so on.
    … When we say deterministic canonicalization form we are transforming the input in some way to ensure that the output is always the same (there's one way to express it) regardless of the input.
    … With that being said, I personally don't think the group is chartered to do canonical forms and new signature formats and things of that nature. It could be argued that the group is supposed to do things of that nature. I would note we have 2 weeks left and this is something we should have figured out well before now. So, the concerns are: are we chartered to do this work, and do we have time?
    … The other concern is the current canonical form in the spec makes it impossible for other CBOR flavors to exist; the current mechanism applies to all CBOR encodings. So, for example, any CBOR flavors that don't use strings for properties are illegal. To be fair, Jonathan has said that isn't his intent and he doesn't want that to happen. But fundamentally that's squaring a circle.
    … Defining something for all of CBOR without limiting what certain flavors can do is a problem. We'd be in a greenfield exercise. The last item is the dagCBOR section, which does require canonical form.
    … The dagCBOR spec is an external spec by Protocol Labs and that text there needs to be the minimum necessary without relying on an external spec.
    … These are the issues we're grappling with in the spec in its current form.
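The point about formatting changes breaking signatures can be sketched in a few lines of Python. This is an illustration only: SHA-256 stands in for the input to a signature, the sorted-key compact JSON form below is a simplification and is not JCS or the JSON-LD canonicalization algorithm, and the `did:example` values and function names are made up for the example.

```python
import hashlib
import json

# Two serializations of the same data: different whitespace and key order.
doc_a = '{"id": "did:example:123", "controller": "did:example:456"}'
doc_b = '{\n  "controller": "did:example:456",\n  "id": "did:example:123"\n}'


def digest_raw(text: str) -> str:
    """Hash the bytes exactly as received."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonicalize(text: str) -> bytes:
    """Re-serialize with sorted keys and no insignificant whitespace."""
    value = json.loads(text)
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest_canonical(text: str) -> str:
    """Hash the canonicalized form instead of the raw bytes."""
    return hashlib.sha256(canonicalize(text)).hexdigest()


if __name__ == "__main__":
    print(digest_raw(doc_a) == digest_raw(doc_b))              # raw bytes differ
    print(digest_canonical(doc_a) == digest_canonical(doc_b))  # canonical forms match
```

Hashing the raw strings gives two different digests, while hashing the canonicalized bytes gives one, which is the property a signature over a canonical form relies on.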

    Jonathan Holt: I think the digital signatures are not in scope for the charter. I agree with that. Data modeling is. How we get to data modeling to ordering is relevant for us to sign.
    … Regarding the reference to external libraries. I think that should be resolved pretty quickly. The reference in the canonical ordering borrows from RFC 7049, and the updated one RFC 8949.
    … There should be no surprise that there's language on deterministic ordering to ensure we're all doing the same thing.

    Orie Steele: I think I agree with most of what jonathan said. We have an ADM and serializations of that ADM in various different forms. If we're limiting ourselves to just JSON forms, there are multiples in JSON alone and the same applies to CBOR. Thinking of a canonical representation of an ADM, I'd like to dispel the idea that that is possible. I don't believe it is. If it were, we'd have a holy war and all the representations would fight to "be it".
    … We moved away from that -- and it feels like people don't know what the ADM has done. There's an ADM and serializations, each with a mimetype, there are an unbounded number of these, an undefined process for adding more of these, and no one has done the work to define these things.

    Jonathan Holt: +1 to Orie, thanks!

    Justin Richer: You don't sign the ADM. You sign the representation. Signatures need to define how to sign the representation. This isn't our fight.

    Manu Sporny: +1 to justin_r

    Orie Steele: The people who proposed the ADM never finished the work to solve the registration problem and now jonathan is encountering that. It should be trivial to register the mime type, we should say, here's where you reference the external spec that makes it trivial to implement, and this should not be hard. There's tension over what goes in DID core and what goes in the registries. DID core will get frozen, and you should put things you're

    Dave Longley: still working on in the registries and you can update it when you want.

    Justin Richer: Representations need to define how to get in and out. Sometimes that has inherent ordering (like a JSON array for an infra set)

    Orie Steele: If you can't create a new mime type after DID core is done then the ADM was a mistake.

    Justin Richer: -1 to canonicalization of the ADM

    Justin Richer: (without hearing the actual argument)

    Orie Steele: -1 to canonicalization of the ADM

    Brent Zundel: -1 to ADM canonicalization

    Manu Sporny: To propose two questions: Do we want to specify a canonical form/rules for the ADM or the information model. I expect everyone to say no to that, no one signed up to do that.
    … Second question is: Do we want to specify a canonical form for any of the representations? Does this WG want to do that and will we all pitch in to do that work? Then the question becomes do we think we can solve that problem in two weeks? We are supposed to be frozen in two weeks.

    Justin Richer: -1 to specifying canonical forms for representations

    Manu Sporny: -1 to canonicalization of the ADM

    Jonathan Holt: I don't think canonicalization of the ADM makes sense to me, but certainly what you're signing is a representation in a particular format. Getting a one way in and one way out -- as suggested by the RFC ... our protocol should say how to sign the CBOR and get into a particular format.

    Manu Sporny: -1 to specifying canonical forms for representations in DID Core

    Manu Sporny: (DID Representations outside of DID Core can do whatever they want)

    Orie Steele: I would propose that "did+dag+cbor" support should be out of scope.... in the same way that "did+xml" should be out of scope... based on where we are now.

    Justin Richer: signature methods specify canonicalization of data structures if they need it

    Jonathan Holt: Also from the perspective of the order here, the conversations we had ... ordering matters and it matters for signatures, but what I didn't highlight -- is that it's up to the DID Doc producer to put it in the right order.
    … The way the author puts them in order doesn't matter, but it matters what order it is in when signing it.

    Justin Richer: I agree w/jonathan, the order matters WHEN SIGNING -- which is why it's up to the signature method to specify, including verification. Maybe I need to keep the original byte stream to validate, like with JWS. Maybe I don't.

    Dave Longley: in response to manu's question, -1 for canonicalising the ADM, I don't understand what that would mean
    … -1 to specifying canonical forms for representations
    … if a given representation wants to do that they can do so but I don't think there's support.. this WG would be fine with someone doing that but I don't think we could get it done into DID core
    … a lot of these problems could be solved by addressing the problem Orie mentioned

    Orie Steele: agree, -1 to specifying canonical forms of JSON, CBOR... not what I signed up for

    Dave Longley: getting text in the spec that says here is how you can add more representations, and into the registries
    … that gets these issues out of the way of DID core
    … I support being able to put DID docs in dagCBOR
    … it's great that jonathan is working on that. I have no idea if the timelines are going to match up with what we need to freeze for DID Core
    … I want to make sure we don't lose the entire DID Core spec because of this extra piece that I am in support of
    … It would be great if DID docs were in ipfs
    … I'd be interested in using that
    … but we need to be careful that we don't end up losing the entire DID Core spec because of these issues
    … if we can find a way to enable people to continue their work and make sure other people can create representations and help the DID ecosystem flourish we should do that

    Dave Longley: The other thing I wanted to say that I forgot ... was that we must not block other flavors of CBOR in the future.

    Markus Sabadello: Moving the other representations in to the DID spec registries -- I wanted to do that, would that be ok with jonathan? We have registered properties, parameters, DID methods, so on. If we have a process for representations, that would be ok with that.

    Jonathan Holt: If we can flesh out the governance, I may be ok with that. It's just dangling out there right now.

    Orie Steele: agree with markus... we could fix registration of new representations... and it will suck if we don't.

    Ivan Herman: Just for my understanding, as far as I understood, the only reason we're talking about canonicalization here, is for the purpose of signature. If that is the case, and we're not defining signature for the time being. We don't say how you would sign the JSON representation, and if we don't talk about signature, then there is no reason to have canonicalization in the document.
    … I wanted to have that on record, someone also said this in IRC.

    Jonathan Holt: lossless encoding/decoding

    Manu Sporny: Seeing some of the feedback in IRC and where the discussion seems to be headed. Two proposals I'd like to emote in IRC to look at before we take them up.

    Orie Steele: The second part of Manu's proposal isn't clear enough to me, if we can be clear about the registration process and that representations are free to define canonical forms, etc. that would help.

    Manu Sporny: I think everyone wants the process to be more detailed. I thought we agreed to not put registration processes in DID core. Because those are hard to change. I thought consensus was that the registration processes would go in the DID registries document. I'd be fine with specifying how to define representations in that doc, doing it in DID core would be a problem.

    Dave Longley: I think DID core just needs to say you can define more representations via the registries process (see doc over here).

    Drummond Reed: I think if we want to put the process over in the registries, I think that's what we want to do. I totally agree that we need to document it and I want to help work on it and that's where it belongs.

    Ivan Herman: +1 to drummond, that is what W3C would probably go towards

    Drummond Reed: I agree with what Dave Longley just said that DID core just needs to say go look at the registries doc for the process.

    Orie Steele: I recall -- Drummond and Manu are correct that the consensus is that the DID spec registries would define the process and that's where the work needs to get done. And it just hasn't happened. And so that's why it's hard to see how it will work.
    … I think this would help address jonathan's concerns.
    … With DID core defining JSON/JSON-LD/etc. then there's a feeling of second class citizenry. I don't know how to address that feeling, but it seems impractical to address that by stuffing everything into the registries.

    Jonathan Holt: I think I can defer to Mike Jones and Justin Richer on this. Unlike JSON which isn't as strict, CBOR facilitates more strictness in the RFCs to facilitate this problem with base64 encoding with JWT for instance. It's natively supported in the RFC. It's specified that protocols should consider deterministic encoding of the representation.
    … Unlike in other representations.

    Orie Steele: imo, this issue has almost nothing to do with canonicalization or cbor... we could be talking about yaml or xml... same problem.

    Jonathan Holt: I'm also reading that other RFCs such as for COSE, the RFC punts that back up to CBOR RFC 7049 and the updated one. It's saying why it's a bad idea ... it battles with the JOSE spec. There's a lot of language, and I wish I had expertise as Jim Schaad, and Carsten, to get some weigh in for the implications of not addressing this right now.

    Michael Jones: With respect to COSE, because there isn't a standard canonical CBOR, is what COSE does, when it wants to sign something it just puts it in a binary string and encapsulates it. It's kind of the equivalent of what JOSE with base64. COSE side steps this by representing it as a binary string.
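The "sign the bytes, not the data model" approach described here can be sketched as follows. This is not COSE or JOSE: HMAC-SHA256 from the Python standard library stands in for a real signature algorithm, JSON stands in for CBOR, and `KEY`, `wrap`, and `unwrap` are hypothetical names. The point is only that the verifier checks the exact byte string it received, so no canonical form is needed.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # placeholder secret for the sketch


def sign_bytes(payload: bytes) -> bytes:
    """Compute a tag over the payload bytes exactly as given."""
    return hmac.new(KEY, payload, hashlib.sha256).digest()


def wrap(document: dict) -> dict:
    """Serialize once, then treat the result as an opaque byte string."""
    payload = json.dumps(document).encode("utf-8")
    return {"payload": payload, "tag": sign_bytes(payload)}


def unwrap(envelope: dict) -> dict:
    """Verify the received bytes first, and only then decode them."""
    expected = sign_bytes(envelope["payload"])
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("signature check failed")
    return json.loads(envelope["payload"])
```

The trade-off is that the original byte string has to be kept and transmitted unchanged; re-serializing the decoded document, even with identical content, may no longer verify.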

    Manu Sporny: I'm going to put in the poll, but before doing that. Just real quick. On the CBOR language, that is being referred to. It does not guarantee a canonical form. It was never meant to be that -- that's why it says "These are things you might want to keep in mind". It says "If you want a canonical form, you might want to try and do at least these things" But it's up to other specs to do that and as Mike says other specs just print out a binary string and sign it.
    … I'm thinking that you think that text does more than it does. It only applies to binary serializations that are deserialized -- it does not apply to production and consumption and round trips through ADMs. It just says if you have a binary string and you want two implementations to pull in and understand in the same way -- that's all they got to.

    Jonathan Holt: It is possible, and I'd like to tease out, what parts of it do you have problems with. I'd love to address those concerns.

    Proposed resolution: The DID Working Group will not define a canonical form for the Abstract Data Model. (Manu Sporny)

    Orie Steele: +1

    Brent Zundel: +1

    Manu Sporny: +1

    Ivan Herman: 1

    Dave Longley: +1

    Amy Guy: +1

    Jonathan Holt: +1

    Drummond Reed: +1

    Shigeya Suzuki: +1

    Markus Sabadello: +1

    Ted Thibodeau Jr.: +1

    Resolution #1: The DID Working Group will not define a canonical form for the Abstract Data Model.

    Michael Jones: +1

    Adrian Gropper: +1

    Jonathan Holt: You know I'm going to object. I'm really harping on this canonicalization, it makes it so much easier if we have a canonical representation in CBOR. I think the ADM, it's just too abstract. So having a concise binary object representation helps facilitate the lossless encoding and decoding into other formats. It behooves us to tackle this, as it opens the door.
    … Carsten has a lot of libraries for going back and forth because CBOR is so extensible.

    Ivan Herman: I could say similar things about other formats. The reasons why I started work on doing various types of constraint languages, e.g., for json schema and for JSON-LD -- and I've put them into the registry repo right now... part of that to be discussed. Having your work put there would be what I would expect to happen. That can be done once the CR is published because this is not something absolutely necessary to go ahead with the CR.
    … Yes, I believe you, I'm not an expert, it's very useful for testing, implementations, all kinds of different things, yes we should produce these things for the various representations we put into core, but it doesn't have to be in core itself.

    Brent Zundel: I'm getting pretty concerned that we're getting close to things that are officially out of scope for our group. It could be argued that explaining a deterministic algorithm for signatures could be out of scope because it's too close to signatures. If we're not past the point of our scope we're very close to it.

    Proposed resolution: The DID Working Group will not define a canonical form for any representation in DID Core. Representations that want to define a canonical form as a DID Specification Registry extension are free to do so. The DID WG will define the registration process in the DID Specification Registries and provide an example of at least one registration in DID Specification Registries. (Manu Sporny)

    Orie Steele: +1

    Dave Longley: +1

    Drummond Reed: +1

    Manu Sporny: +1

    Adrian Gropper: +1

    Brent Zundel: +1

    Amy Guy: +1

    Ted Thibodeau Jr.: +1

    Shigeya Suzuki: +1

    Michael Jones: +1

    Jonathan Holt: -.5

    Ivan Herman: I have a question on the proposal. I thought what we'd do in the registries, is not only the canonical forms, but also any kind of additional representations.

    Manu Sporny: Yes, that is correct, do you feel that the proposal doesn't say that?
    … We could put all the representations in the registries to address the second class citizen concern.

    Ted Thibodeau Jr.: Representations, with or without a canonical form, may be added to the DID Specification Registry as extensions.

    Ivan Herman: If I want to have a yaml representation of the model, I should be able to do that in the registry. That, for me, is not clearly in the proposal.

    Manu Sporny: Yes. That's the intent.

    Jonathan Holt: How about this compromise, only the core model is in the DID core spec, and the representations are all in the registries.

    Ted Thibodeau Jr.: I think jonathan, that's roughly the intent at this time. Part of the pushback against you right now is that you have acknowledged that you're not an expert on the thing you want in the spec and we're up against tight timelines right now. Without the expertise to write the PR for what you want to add, I don't see that as possible.
    … I think we'll put all the representations in the registry -- and it doesn't say anything about any representations being in DID core right now.

    Manu Sporny: I think we should take up another proposal to clarify what's going on.

    Ivan Herman: +1

    Michael Jones: This talk of all the representations being in the registry doesn't match what we've actually done in the spec. The JSON and JSON-LD and the dagCBOR representations are all defined in the core spec, not in any registries. I propose we don't change that and don't make any resolutions so it appears that's not true.

    Manu Sporny: I would like the group to focus on getting one proposal passed at a time.
    … Does anyone have a proposal they'd like to float for moving all the representations into the registries? I will float one if not.

    Jonathan Holt: hence, second class

    Amy Guy: +1 selfissued

    Michael Jones: Yeah, specifications are specifications and registries are registries. Registries are lists of things. Specs have normative text. Talking about moving large blocks of text into a registry is nonsensical.

    Brent Zundel: +1 to selfissued

    Orie Steele: +1 to =selfissued

    Dave Longley: I put a proposal in IRC. Can we solve the second class citizen issue by being clear in the core spec
    … by saying there are representations in this spec, but they are not any more important than any other representation in the registries, go look there for others, which will have links to the specs where those are defined

    Drummond Reed: +1 to Dave's suggestion

    Amy Guy: +1 to dlongley too... the only difference between them is that some were ready in time for CR and others were ready later

    Manu Sporny: I don't think that would address the issue, Dave. But let's run proposals.

    Drummond Reed: +1 to Amy

    Proposed resolution: Move the existing representations in the DID Core specification into their own specifications and register each representation in the DID Specification Registries. (Manu Sporny)

    Michael Jones: -1

    Amy Guy: -1

    Drummond Reed: -1

    Brent Zundel: -1

    Dave Longley: -1

    Manu Sporny: -1

    Orie Steele: -1

    Ivan Herman: -1

    Jonathan Holt: +1

    Markus Sabadello: -1

    Adrian Gropper: -1

    Brent Zundel: It is clear from this result that we do not need to run the counter proposal

    Manu Sporny: Do we need to run the opposite proposal? Where we say we're going to keep the core representations in the spec?

    Brent Zundel: the representations in the spec will stay there

    Proposed resolution: The DID Working Group will not define a canonical form for any representation in DID Core. Representations, with or without a canonical form, may be added to the DID Specification Registry as extensions. The DID WG will define the registration process in the DID Specification Registries and provide an example of at least one registration in DID Specification Registries. (Manu Sporny)

    Ivan Herman: +1

    Manu Sporny: +1

    Orie Steele: +1

    Drummond Reed: +1

    Amy Guy: +1

    Shigeya Suzuki: +1

    Adrian Gropper: +1

    Ted Thibodeau Jr.: +1

    Markus Sabadello: +1

    Dave Longley: +1

    Michael Jones: -1

    Jonathan Holt: -1

    Brent Zundel: +1

    Amy Guy: +1 selfissued that's how I interpreted this

    Michael Jones: This is very strangely worded. You make a representation in a specification. You might also list that specification in a registry. You don't add a representation directly to a registry. A registry is a list not a spec.

    Proposed resolution: The DID Working Group will not define a canonical form for any representation in DID Core. Representations, with or without a canonical form, may be registered in the DID Specification Registry as extensions. The DID WG will define the registration process in the DID Specification Registries and provide an example of at least one registration in DID Specification Registries. (Manu Sporny)

    Ivan Herman: +1

    Manu Sporny: +1

    Drummond Reed: +1

    Ted Thibodeau Jr.: +1

    Dave Longley: +1

    Amy Guy: +1

    Brent Zundel: +1

    Michael Jones: +1

    Shigeya Suzuki: +1

    Orie Steele: +1

    Jonathan Holt: I haven't seen this in any protocol/place where some representation isn't able to handle this, the deterministic section in CBOR says it's up to authors. We are supposed to clearly state how to handle a representation. Not kicking the can down the road into some registry process.

    Ted Thibodeau Jr.: there's no protocol here

    Drummond Reed: My understanding is that everything that is defined in DID core is listed in the registry. Everything in the registry is official. It doesn't really matter whether a representation is in DID core or outside of DID core. All are siblings, all are in the registry.

    Manu Sporny: That's correct.

    Ivan Herman: That's correct.

    Dave Longley: That's correct.

    Jonathan Holt: -0.5

    Amy Guy: +100 to fleshing out governance of registries!

    Jonathan Holt: It's a fair compromise, I think we need to flesh out the governance of the registry -- in which case it will be seamless, but it's punting it and I don't like that.

    Orie Steele: yes please, help by making PRs!

    Brent Zundel: Thanks for coming, thanks to scribe, thanks for the input.

    Drummond Reed: Thanks Dave for scribing


    @iherman
    Member

    iherman commented Jan 29, 2021

    @OR13

    1. create the did+dag+cbor "representation spec" in some neutral territory and assign code owners (I propose DIF).

    that can be done, but we can also decide to publish a Working Group Note on this representation spec. A note can be published at any time by this WG before its charter expires.

    @dlongley
    Contributor

    dlongley commented Jan 29, 2021

    @jbenet,

    We'll be entering CR in a week or two which is how unrealistically tight the timeline is here. See @msporny's comment. I don't think there's any need to try and squeeze things in before then as using the DID spec registries gets us the same desirable outcome, in my view, with a much lower risk of mistakes. More focus should be put on getting the DID spec registries governance process into a shape that would address any of @jonnycrunch's remaining concerns. See @OR13's comment for a potential path forward (and @iherman's other option as well).

    @msporny
    Member Author

    msporny commented Jan 29, 2021

    @jbenet wrote:

    can you point us to the timeline expectations?

    Everything needs to be done by February 9th... which is a completely unworkable timeline for everyone. To be clear - the DID specification has been ongoing for 4+ years, we've been feature-frozen since July 2020, the Working Group has had repeated warnings about the deadline in two weeks... so it's not like this date is a surprise to anyone in the Working Group. Everyone was expected to get their features in a long time ago and have everything locked down by February 9th.

    I think everyone that's new to this thread needs to take a deep breath and understand what you're asking the Working Group to do and what their position is on this topic. Here is the last meeting we had on the CBOR section and the DagCBOR section. If you're going to insist that the core specification needs to define application/did+dag+cbor, you will need to understand what you're asking for (and that the Working Group has proposed an alternative workable path forward):

    https://www.w3.org/2019/did-wg/Meetings/Minutes/2021-01-28-did-topic#section2

    Also, to be crystal clear: No one is trying to prevent DagCBOR from happening. Many of us in the group are fans of IPFS -- I'm the spec editor for multibase and multihash at IETF. I get it, we all like PL and IPFS and the permanent Web. However, this is not the way to engage with the group -- I have no idea if @jonnycrunch represents the IPFS community, but these actions come across as an anti-social attack on Working Group consensus.

    The Working Group has proposed a clean way for DagCBOR to happen while not endangering the entire DID global standard. The meeting transcription above shows that the Working Group that participated in the discussion, except for @jonnycrunch, has signaled that they believe this is the best path for DagCBOR at this point... and it prevents no one from achieving their goals.

    The proposed path forward is for a separate DagCBOR representation specification to be published so that everyone can slow down, take their time, and do the work necessary to put together a solid DagCBOR representation specification. That representation will then be registered with all the other representations (JSON, JSON-LD, CBOR, etc.) on an equal footing.

    Would the proposed path in the previous paragraph work for the IPFS community?

    @jonnycrunch
    Contributor

    as an anti-social attack on Working Group consensus

    That isn't very nice. As a member of W3C I am participating in the process of developing a standard. Sometimes we have conflicts and differing opinions. That is expected. So, I respectfully disagree with your defaming characterization.

    @dlongley
    Contributor

    @jonnycrunch,

    That isn't very nice. As a member of W3C I am participating in the process of developing a standard. Sometimes we have conflicts and differing opinions. That is expected. So, I respectfully disagree with your defaming characterization.

    I don't think there was any ill intent in your objections. I think there are frustrations in the WG that the realities of the WG timeline aren't resulting in us clearly and quickly moving forward in a direction that won't jeopardize all of the work. If we can acknowledge that the WG has these deadlines and that we:

    1. Don't want to produce a DID Document DagCBOR representation that is flawed due to haste.
    2. Don't want to accidentally limit other CBOR representations due to haste.
    3. Want a good DID spec registries process to enable continued innovation and interop.

    Then I think we can see that these things all support a common and helpful path forward. A good DID spec registries process enables us to add more interoperable tech to the DID ecosystem without requiring everything to be on the same timeline as DID core. There are already a number of other things that we could argue about trying to fit into DID core at the last minute that we don't have to -- because we have put them into the registries and linked off to other specs that can evolve on a timeline that better suits their needs.

    @msporny
    Member Author

    msporny commented Jan 29, 2021

    @jonnycrunch wrote:

    I don't think there was any ill intent in your objections.

    +1, similarly, I don't think you're acting in bad faith, @jonnycrunch. To be clear, my entire statement was: "these actions come across as an anti-social attack on Working Group consensus." -- repeatedly being the only dissenting vote while providing no alternate path that achieves a higher level of consensus is frowned upon. Objecting to Editorial modifications to the specification while not stating the grounds of your objection is frowned upon. Asserting process violations against Editor actions when there are no process violations is frowned upon.

    While I believe you're doing what you feel is right, it is simultaneously endangering the 4+ years of work that many in the WG have put in; that is not being looked upon favorably. You are, perhaps unconsciously, putting your needs above the repeatedly stated needs of the group.

    Your actions are misguided given your goals; you are (probably unknowingly) acting against your own interests and the interests of the IPFS community. We are trying to help you achieve your goals, yet you keep insisting on solutions that the WG has said are unworkable. At this point, multiple people in the WG have raised concerns with the way the DagCBOR issue is being pushed on the group; this is a problem. A non-trivial number of us are stating very clearly that it is a problem; and we are proposing a concrete solution that should work for everyone. If you disagree with that solution (where you were the only dissenting opinion in the WG), then providing an alternate solution that would gain a higher level of consensus would be the appropriate course of action at this point.

    @w3c locked and limited conversation to collaborators Jan 29, 2021
    @brentzundel
    Member

    The chairs have decided to lock this conversation.

    We will address this issue during our working group call next Tuesday. Please do not add any additional comments here.

    @msporny removed the "needs special call" label Feb 2, 2021
    @msporny
    Member Author

    msporny commented Feb 11, 2021

    The DID WG made the following consensus-based resolution last week:

    DID WG Resolution: The DagCBOR representation will be moved into its own specification and registered in the DID Spec Registries.

    https://www.w3.org/2019/did-wg/Meetings/Minutes/2021-02-02-did#resolution1

    PR #593 implemented the Working Group resolution. The DagCBOR specification has been decoupled from the DID Core specification and is free to progress at its own speed.

    This issue has been addressed wrt. the DID Core specification. Closing.

    @msporny closed this as completed Feb 11, 2021