Concerns regarding testability of DID Resolution and Dereferencing #549
Comments
On behalf of Digital Bazaar:
We are not planning to implement software that will pass the normative statements related to Resolution and Dereferencing at this time. We prefer that such work is done in a future DID Resolution WG. We do have DID Resolution software that we have authored called did-io.
We are not committing to writing any Resolution or Dereferencing tests at this time.
We would be supportive of moving the sections into a separate document as a NOTE as that will help signal to W3C Membership that a future WG for DID Resolution would be a logical next step and de-risk the Candidate Recommendation process. |
I've never understood the link between normative statements and testability, as terms like SHOULD, IMO, adequately express obligations that may be untestable, but nevertheless are defined with enough clarity for those obligations to be fulfilled, potentially by human-mediated processes that are fundamentally untestable. This is especially clear when we have already agreed that DID Core will define an interface ONLY for resolution and dereferencing, while leaving the implementation details to a later spec. If we can't even define the contract for resolution and dereferencing, then we have, at best, only an illusion of interoperability defined in DID Core. |
I agree with @jandrieu (and that is also a comment on #550). The 'at risk' flag refers to the possibility of putting the whole of §8 in a note, which would make the core document fairly difficult to understand and, I am afraid, would harm acceptance. I would think that, if we have difficulty getting implementations, the alternative is to mark this section as non-normative, but keep it in the document. It may sound like just a nuance, but I think it is more than that: the reader of this specification would have a much stronger incentive to follow the statements in §8 and, eventually, implementations will follow. A separate Note that looks independent at first glance carries less weight, imho. |
B.t.w. if there are (min 2) implementations that declare that they implement the section as described in some way or other (the language API is not defined, the whole definition is abstract anyway) then we may accept that as a CR exit criterion per se. There is no obligation to back up everything with explicit, externally executable tests. The goal of the CR is not to test implementations. The goal of the CR is to prove that the specification is (a) consistent and clear and (b) that all features are implementable (or used when it comes to, say, a vocabulary). |
Let me try and highlight why machine-testability of statements is an important consideration below. Over the past decade or so, many of the WGs that I have participated in try to make sure that at least the MUST statements have tests in the test suite and the SHOULD/MAY statements are written in a way that is testable by a machine. Humans make terrible spec enforcers for at least the following reasons: 1) the ones that know what they're doing are in high demand, 2) some of them are not consistent when language is vague, 3) people are clever and will get around human tests if they want to, and 4) the ones that are involved now tend to move on to other things/retire/die/etc. This is based on deployment experience where non-testable statements tend to be completely ignored by implementers (because there is no negative outcome for doing so); in effect, implementers do nothing about them, and the real interoperability ends up being around the tests that are written for the test suite. Making sure normative statements are testable is the easiest path to ensuring interoperability. You can get there through other means, but depending on the good will of implementers is rarely a path to interoperability.
Doing so is asking for trouble. As a concrete example... I feel extremely uneasy when reviewing DID Methods for inclusion into the DID Spec Registries -- the DID WG has put this responsibility largely on @OR13's and my shoulders and I know I'm not doing enough to push back on bad DID Methods. I'm not pushing back hard because the human-enforced rules for getting into the registry are vague, and meeting the minimum bar is really easy. For example, we don't say "You can write total and complete garbage in your Privacy Considerations section, you can lie, you can say things that are fantastically dangerous from a privacy perspective, there is no mention of needing to do a 3rd party privacy audit, etc." -- so, that's what human-enforceability gives you... a protection that is small, vague, and largely unenforceable. |
@iherman wrote:
I disagree, I don't expect it would harm acceptance any more than marking the entire section as non-normative would. I will also note that having a section that is non-normative that has 40+ normative statements in it is confusing to implementers -- if we're going to do that, we should downgrade all language to be non-normative (which would be worse than publishing a NOTE that contains the normative language and provides guidance).
While you're technically correct, I'd rather we stay away from spec lawyering and have a higher bar for what our expectations are wrt. implementability of the specification... and now to quote from W3C Process (for the benefit of those that don't know it by heart): https://www.w3.org/2020/Process-20200915/#transition-cr
My expectation is that we will demonstrate that each feature of the current specification is implemented by pointing to tests for each normative statement. Where we can't point to a test for a normative statement, we'll point to people following the normative requirement (e.g., point to at least two DID Methods that contain Privacy Considerations sections and @OR13 and @msporny asking people to put those sections in their DID Method before we'll allow the PR to go through to register their DID Method). I expect any other normative statement that can't do the things above will have a hard time meeting the "must document how adequate implementation experience will be demonstrated" bar in the W3C Process. |
Just for the sake of argument :-) We have, in our specification, five different verification relationships defined normatively as properties. We can (and will) have tests for all of those, in the sense of having tests that check their values, constraints, etc. However, I believe we will also have to show that all those terms are “meaningful”, i.e., that they are used in real implementations (and not only toy implementations, obviously) out there, and that for at least two implementations it makes a difference whether a DID Document uses, say, That “usage” is not a matter of (mechanical) tests. It will be based, at the end of the day, on human reporting. And that is all right, because we are not testing the implementation, we are not creating some sort of certification flag for implementations, i.e., there is no incentive for humans to cheat the system. All implementations have a common goal here: to make sure that the specification is o.k. This is all to say that a purely test-centric view may not cover all the needs for this spec. Whether this is true for the content of §8 is not clear to me, but we may need some flexibility in how we define our exit criteria. |
I believe we can do this for all core properties EXCEPT FOR
I wish this were true -- everyone has a pet feature that they want in the specification. Some of these pet features don't have multiple working implementations behind them. I would prefer that the group requested demonstrations of working code for every feature by the end of CR as exit criteria. I understand that not everyone wants to set the bar that high. |
.... then they lose. It is up to us to set the bar even higher: we can require not two, but three or four implementations to use a feature when it comes to human response. But that is irrelevant as far as this issue is concerned. My only reservation is that the "at risk" note is formulated in a way that, if requirements are not fulfilled, then the whole of chapter §8 is removed from the spec. I would like to be a bit milder and keep the door open for leaving §8 in the spec, albeit as a non-normative section. At this moment, that is all. |
I plan on writing tests for "resolveRepresentation". I consider it a huge failure to have a spec that recognizes 3+ representation formats and then fails to explain (with tests) how they actually function.... Sure, "complex" resolution topics can be punted to a future WG, but not the basics associated with why we created an abstract data model.... if we don't test those, we should not have an abstract data model and 3 representations... it is trivial to provide tests for resolution, you have:
normative statements about the relation between the input and fields in the output. In the case of "dereferencing"... it's also trivial to test.... here is a DENO package manager that uses DID Documents, and service and relativeRef... https://github.com/OR13/deno-did-pm What we have in the DID Core spec should be well covered by tests, or moved to another document. |
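To make the "relation between the input and fields in the output" point concrete, here is a minimal sketch of what such a test could look like. Everything in it is an assumption for illustration: the toy `resolve_representation` function, the in-memory `REGISTRY`, and the `check_input_output_relation` helper are hypothetical, and the property and error names (`accept`, `contentType`, `invalidDid`, `notFound`, `representationNotSupported`) follow the DID Core draft as I understand it rather than any agreed test suite.

```python
import json

# Hypothetical in-memory registry standing in for a real DID method.
REGISTRY = {
    "did:example:123": {"id": "did:example:123"},
}

# Representations this toy resolver can produce (an assumption).
SUPPORTED = {"application/did+json"}


def resolve_representation(did, resolution_options):
    """Toy stand-in for the abstract resolveRepresentation function.

    Returns (didResolutionMetadata, didDocumentStream, didDocumentMetadata).
    """
    accept = resolution_options.get("accept", "application/did+json")
    if not did.startswith("did:"):
        return {"error": "invalidDid"}, b"", {}
    if did not in REGISTRY:
        return {"error": "notFound"}, b"", {}
    if accept not in SUPPORTED:
        return {"error": "representationNotSupported"}, b"", {}
    stream = json.dumps(REGISTRY[did]).encode("utf-8")
    return {"contentType": accept}, stream, {}


def check_input_output_relation(resolver, did, accept):
    """Check, for one call, how the output fields relate to the input."""
    metadata, stream, _doc_metadata = resolver(did, {"accept": accept})
    if "error" in metadata:
        # On failure the stream is expected to be empty.
        return stream == b""
    # On success the reported media type should match what was requested,
    # and the document's id should be the DID that was resolved.
    same_type = metadata.get("contentType") == accept
    same_id = json.loads(stream)["id"] == did
    return same_type and same_id
```

A test suite along these lines would call `check_input_output_relation` against each implementation's resolver for a known DID, an unknown DID, and an unsupported media type. Note that the check only works once the function has a callable form, which is the concern raised in the next comment.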
Yes, for a concrete function/API. We don't have a concrete API -- we have abstract functions. You can't test abstract functions unless you make them concrete. If the group wants to define a concrete interface, that's fine, but some might argue that doing so is going too far. We had specifically mentioned that implementation of the abstract functions is up to another specification. Now we're saying that we're going to define the abstract functions (and, again -- people might object to that). I know that I wasn't expecting us to write concrete tests for resolution/dereferencing -- I'm not going to object, but this is exactly what I was concerned about happening when we started going down this slippery slope. We're at a point now where we need someone to write the tests... and we need two entities to write DID resolver implementations. It's basically going to come down to that. If we get that, we're going to have to change the spec to normatively state that these functions are no longer abstract -- they're concrete, and I'm not sure that the group agreed to go that far.
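As a rough illustration of the gap between an abstract function and a testable one, the sketch below pins the abstract `resolve` function to one possible concrete binding and runs the same checks against two independent implementations. The `DidResolver` protocol, both resolver classes, and the `conforms` helper are hypothetical names invented for this example; nothing here is defined by DID Core, and a real binding might look quite different.

```python
from typing import Any, Dict, Optional, Protocol, Tuple, runtime_checkable

# (didResolutionMetadata, didDocument or None, didDocumentMetadata)
ResolveResult = Tuple[Dict[str, Any], Optional[Dict[str, Any]], Dict[str, Any]]


@runtime_checkable
class DidResolver(Protocol):
    """One hypothetical concrete binding of the abstract resolve function."""

    def resolve(self, did: str, resolution_options: Dict[str, Any]) -> ResolveResult:
        ...


class StaticResolver:
    """Hypothetical implementation A: serves documents from a dict."""

    def __init__(self, documents: Dict[str, Dict[str, Any]]) -> None:
        self._documents = documents

    def resolve(self, did, resolution_options):
        doc = self._documents.get(did)
        if doc is None:
            return {"error": "notFound"}, None, {}
        return {}, doc, {}


class EchoResolver:
    """Hypothetical implementation B: synthesizes a minimal document."""

    def resolve(self, did, resolution_options):
        if not did.startswith("did:echo:"):
            return {"error": "notFound"}, None, {}
        return {}, {"id": did}, {}


def conforms(resolver: Any, known_did: str, unknown_did: str) -> bool:
    """Run the same checks against any implementation of the binding."""
    if not isinstance(resolver, DidResolver):
        return False  # object does not expose a resolve method at all
    ok_meta, ok_doc, _ = resolver.resolve(known_did, {})
    bad_meta, bad_doc, _ = resolver.resolve(unknown_did, {})
    success_ok = "error" not in ok_meta and ok_doc is not None and ok_doc["id"] == known_did
    failure_ok = bad_meta.get("error") == "notFound" and bad_doc is None
    return success_ok and failure_ok
```

The trade-off is the one described above: the moment `conforms` exists, the signature of `resolve` is no longer abstract, so a shared test suite implicitly standardizes a concrete interface.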
Agreed -- we need a volunteer to write those tests and take it through the CR process. We need two organizations that are going to volunteer to write implementations. I hesitate to put the burden on @OR13 given everything else that's on his plate. If this is not done and we don't have at least two interoperable implementations (and we don't have the warning), we will have to go through CR again (and we can only do that 2-3 times before the axe comes down on the group). |
PR #550 has been merged, which addresses this issue by marking resolution at risk and providing multiple ways in which the WG could address this concern during CR. If we want to test the section, we have to specify a concrete interface that can be tested. I'll put in a PR to do that as well. |
PR #601 has been merged, closing. |
A few items of concern:
At this point, we need one or more people to step forward to write the tests for the DID Resolution and Dereferencing section and we need at least two companies that will commit to implement DID resolvers that will pass the tests. If we can't get that sort of commitment (I don't even know how we'd write the tests with our current test suite -- maybe @OR13 has some suggestions), we should mark the section at risk and say that the Resolution and Dereferencing section may be moved into a separate NOTE published by the group.
My recollection is that the group desired to know how the interfaces for resolution and dereferencing would work. I believe the group has achieved that with the current specification and we could safely move Resolution and Dereferencing into a NOTE for publication. This would set the stage for a future DID Resolution WG without having any substantive negative impact on the understanding of how Resolution and Dereferencing are meant to work. Publication as a NOTE would have the added benefit of relieving a good chunk of CR implementation pressure.
So, things we need to know from the group in the next week or two:
It would be good to hear responses from @peacekeeper and @jricher, as well as implementers.