Can we nuance our mental model on DID control slightly? #233
This would impact issue #122 as well - in fact, it would force the selection of 1b and 2a. |
There is always a controller. So 1b doesn't work. At the bare minimum, whoever created the DID is the controller. It does not imply they are, in any way, a controller of the DID Subject. All it means is that they controlled the initial DID Document, and presumably--depending on the method--retain the ability to further modify the document. In the malware use case, I believe a better way to model that is that the initial reporter generates a DID and issues a credential saying that malware with a given hash has been given such and such a DID, perhaps with other corroborating claims in that credential. No control relationship needs to be established between the DID Controller and the DID Subject. But no matter what the relationship between Subject and Controller is, there is ALWAYS a Controller, whether or not there is a controller property. Alternatively, the malware discoverer could just issue a credential with that hash, without using DIDs. I'm not sure DIDs buy the use case much. |
@jandrieu : Let me challenge "there is always a controller" a slightly different way. If a DID doc is created (said differently, if any metadata is associated with a DID), then I agree that there is always a controller, at least at the outset. Whoever creates the DID doc (chooses the metadata) is the controller. They can make decisions about whether to continue being the controller, or to disclaim control over the content by removing control methods. Furthermore, I agree that all our thinking up until now has assumed that DIDs and DID docs are inseparable concepts, because <assumption>of course we want metadata</assumption>. But what I'm suggesting is a scenario where an identifier needs to be created (perhaps better said, it gets discovered) without a DID doc--zero metadata--from the get-go. Nobody is allowed to define any metadata about the identifier -- even at the outset. The need here is a pure identifier that has the decentralization characteristic of DIDs, but not the resolution characteristic. It's almost like a hashlink, except it makes no claim about location (or any other properties), only about existence. What is known about malware (metadata) could vary in thousands of different ways, and be stored in databases all over the world, and nobody intends to be an authoritative source for any of it. They just want to agree that they're talking about the same thing. Full stop. And since the mechanism for generating the identifier derives it from the subject, there can never be any controller making decisions, by definition. Uncontrollable things exist, and we identify them. Since there is no resolution, there is no controller. That's what the controller controls--the DID doc/resolution, n'est-ce pas? Now, you could say, "No. We must have a DID doc. 99.999% of DIDs are worthless without it. For the weird corner case you're bringing up, if it really exists at all, just keep the convention and live with the weirdness." 
That would mean that we analyze the first person to create the guaranteed-empty DID doc for malware X as its "controller" for the purposes of the DID ecosystem. But two researchers could discover it independently, with no way of proving who was first. So we have a theoretical controller role that is unavoidably ambiguous, just because we want to keep the concept of controller. If we instead say, "Yep, there's cases where something exists but its metadata is not controlled, and DIDs can point to them. In such cases, it becomes impossible to create a DID doc (because if you do that, by definition you're exercising control), but it's still a sort of DID because it's a decentralized identifier" then we get to broaden the conceptual tent of DIDs a bit. |
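To make the "derived from the subject" idea concrete, here is a minimal sketch. It assumes the identifier is simply a SHA-256 digest of the sample's bytes, and `did:example-hash` is a made-up method name used only for illustration; nothing in the thread specifies either choice.

```python
import hashlib


def content_identifier(content: bytes) -> str:
    """Derive an identifier purely from the bytes of the subject."""
    digest = hashlib.sha256(content).hexdigest()
    # "did:example-hash" is a hypothetical method name, not a registered one.
    return f"did:example-hash:{digest}"


sample = b"bytes of some malware sample"

# Two researchers who never coordinate still arrive at the same identifier,
# and neither holds any key that would let them change what it refers to.
researcher_a = content_identifier(sample)
researcher_b = content_identifier(sample)
print(researcher_a == researcher_b)  # True
```

In this sketch there is no step at which anyone chooses metadata, which is the property the comment above is arguing about.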
It is not clear to me yet how this interacts with methods - perhaps some methods are capable of representing passive, persistent objects, and other methods are not - because the owner of a set of keys may always be able to update "the DID document". On the other hand - they can't update the did itself once it is minted - and that is what matters. So the controller (of the registration) might only be able to update information about the subject's registration record. Another thing that I worry about is that a In other words, for methods where the This is especially apparent in the context of "verification methods" (i.e. #190 ) when combined with the new abstract-data-model/registry approach. The ADM/Registry model forces a "union of all possible realities" model and results in very complicated modeling. For example, we will need to put this sort of information in the registry:
or the ADM/Registry needs structure like
use of an What this suggests to me - in answer to @dhh1128 's question
Is that it is far from clear whether or not DIDs are suitable as generic identifiers for self-certified content. Perhaps DIDs are always and only statements by actors, about people, organizations, and things - which means
Alternatively we could move the semantics partially into methods. Perhaps we could have did-core define a set of "classes" of DIDs - each with its own ADM/Registry/
The most radical suggestion would be to step out of the battle altogether, and give DID-documents a sort of "sovereignty" and let them announce what they are and how to process them using some sort of attribute that identified and advertised feature and property sets. The proposed attribute would let the creator of the DID-document assert things like
and on and on - at the discretion of the environment and suitable to the needs of adopters. We could even say "if a DID-document says nothing, then it is assumed to follow the rules in did-core" and provide a fallback Abstract Data Model that clearly defines what it ought to be. |
@jandrieu re: 1b - I believe 1b is specific to the controller attribute in the DID-Doc, not the qualitative ability to control the DID-doc, simply the explicit representation of it in the DID-doc. If there is always a controller, then the hypothesis that starts this is not possible. DIDs can not represent immutable content, they can only represent loci of control - and as such they can not really refer to things - they can only refer to the controller's name for things. In other words - "The Moon" can not be the subject of a DID I create, "What Eric Thinks of as The Moon" is a proper scope, but "The Moon" is not. |
Quoting Joe from issue #122 :
So if my theoretical DID method exists, in which it's impossible to place any metadata in the DID Document, or to create one at all, that would imply that there cannot be a controller, because nobody can perform the function that satisfies the definition. This begs the question, of course, is the DID method that I posited allowed to exist? I can think of many use cases for it. It's highly decentralized (would score great on many rubrics), but by its lack of resolution support, it is definitely an odd duck. |
What you are talking about is not a DID. It's just an identifier. Obviously, there is still a discussion going on about what constitutes meta-data. And, to my mind, I want ALL meta-data out of the DID Document. What needs to be in the DID Document is the cryptographic material for secure interaction (everything else is meta). In some cases, that material can be deterministically derived from the DID itself, like with did:key, in which case resolving the DID is how you transform the raw DID into the DID Document. I think a big part of what's happening right now is people wanting to do EVERYTHING with DIDs, and I agree DIDs can refer to ANY subject. But that doesn't mean they are the right tool for every single identifier use case, nor is it appropriate to pollute the core spec to support convenience features. They can be addressed in DID-AWESOME instead of DID-CORE. If your identifier is most appropriately generated by hashing the object, GREAT. Just use that as an identifier. No DID required. The fundamental topological shift in DIDs over other forms of identifiers, including cryptographically verifiable ones like public keys, is the level of indirection between the DID and the cryptographic material, allowing for appropriate maintenance like rotation without invalidating the DID and auditing of transitions in material over the lifetime of the DID. Without that level of indirection, which is the fundamental link between DIDs and DID Documents, you don't have DIDs, you just have an identifier. |
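The indirection described above can be sketched with a toy in-memory record. This is only an illustration under stated assumptions: `DidRecord`, `rotate`, and `resolve` are hypothetical names, the keys are placeholder strings, and the returned dictionary is a simplified stand-in for a DID Document rather than the structure any method defines.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DidRecord:
    """Toy registry entry: a stable identifier plus replaceable key material."""
    did: str
    current_key: str
    retired_keys: List[str] = field(default_factory=list)

    def rotate(self, new_key: str) -> None:
        # Keep the old key in the audit trail, then swap in the new one.
        self.retired_keys.append(self.current_key)
        self.current_key = new_key


def resolve(record: DidRecord) -> Dict[str, object]:
    # Simplified stand-in for a DID Document: just the current key material.
    return {"id": record.did, "keys": [record.current_key]}


record = DidRecord(did="did:example:123", current_key="public-key-1")
before = resolve(record)
record.rotate("public-key-2")
after = resolve(record)

print(before["id"] == after["id"])  # True: the identifier survives rotation
print(after["keys"])                # ['public-key-2']
print(record.retired_keys)          # ['public-key-1']
```

By contrast, if the identifier were the public key itself, rotating the key would necessarily change the identifier, which is the distinction the comment draws.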
@ewelton wrote
That's all it ever could be. The singular notion of "The Moon" doesn't exist. That is just what English speaking people, aka Eric, sometimes use to refer to the Earth's natural satellite. Other people use other terms. This is the fundamental shift that VCs guarantee. All you can ever say are statements that "some issuer asserts some 'fact'", which is exactly the structure above. This is epistemologically rigorous. Imagining that "The Moon" is, in absolute knowable truth, the subject of a given DID is not. In order for such a statement to exist, we would first have to rigorously understand what "The Moon" really means to you. Then what it really means to me. Then we might be able to convince ourselves that we are talking about the same thing. It's the same with DIDs. The only way to know if the subject is what you think it is (unless you are the controller) is to gather enough assertions about that DID to convince you of what the Subject is. And EVEN then, all you have done is convince yourself. Reality is fundamentally unknowable. All we can do is invest resources convincing ourselves of enough shared agreement to interact reasonably. So, this isn't about a search for Truth with a capital "T". That's a fool's errand. Rather, DIDs are a rigorous mechanism to establish cryptographically secured interactions with an arbitrary Subject. Figuring out what that Subject is or is not happens at another layer, including the mechanisms that embody what it means to "interact" with the Subject. |
@jandrieu I believe there is more to it than 'just an identifier' - it is more than a UUID, because it is linked to the thing itself. It is suitable only to 'hashable' objects, and not physical objects. You can't hash a tree, you can't hash the moon - and, you can argue that you can not refer directly to "the moon" - there is a huge tradition in philosophical semantics about exactly this - and DIDs, in a sense are taking a deep philosophical stance. So far - what seems like it works is this: In a sense, it does not matter where this falls - just as long as it falls somewhere and leads to clear and precise (and simple) language. So "the subject is the king of england" for example, would not be quite optimal "actor-x's name for the king of england is did:123" would be the right way to say it. |
Yes. That's what DIDs always say. But since we ALSO don't know who the Controller is, the statement "Controller's name for a thing is XYZ" is rigorously restatable as "A thing is the subject of DID XYZ" The assertion that DID XYZ refers to the King of England goes in a VC if you want it to be rigorous, in which case you get the lovely construct that "Issuer ABC says DID XYZ is the King of England". |
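The "Issuer ABC says DID XYZ is ..." construct can be sketched as plain data. This is a rough illustration only: `Assertion` and `claims_about` are hypothetical names, the DIDs are placeholders, and the signatures or proofs that a real verifiable credential would carry are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Assertion:
    issuer: str   # who is making the statement
    subject: str  # the DID the statement is about
    claim: str    # what the issuer asserts


def claims_about(assertions: List[Assertion], did: str) -> Dict[str, List[str]]:
    """Group every claim about one DID by the issuer who made it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for assertion in assertions:
        if assertion.subject == did:
            grouped[assertion.issuer].append(assertion.claim)
    return dict(grouped)


assertions = [
    Assertion("did:example:ABC", "did:example:XYZ", "is the King of England"),
    Assertion("did:example:DEF", "did:example:XYZ", "is the King of Great Britain"),
    Assertion("did:example:ABC", "did:example:QRS", "is a malware sample"),
]

view = claims_about(assertions, "did:example:XYZ")
print(view)
```

Nothing here decides which issuer is right; a reader only learns who asserted what about the same identifier.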
@jandrieu Nah, I don't quite agree with that. I would agree that saying "A thing is the subject of DID XYZ", while technically rigorous, leads to exactly the sort of miscommunication the community has been having. I'm not sure I follow the VC comment. Who is ABC and how is ABC related to the construct? What I'm trying to get to is making it clear, in everyday language, so that it is always apparent that A thing might have dozens of DIDs, because DIDs are "scoped" by controllers - and DIDs can not always serve as points of coordination in a discussion. What we want for the VC case, and what is being discussed here, is that - given the limitations of DIDs and the incorrect statements about their scope for the last few years - is a new form of identifier that can be shared by communities, and around which we can clearly say "The controller of DID XYZ says DID XYZ refers to N" and "The controller of DID ABC says DID ABC refers to N" and then let DID ABC and DID XYZ rest happy that they are talking about the same N, so that they can have fruitful discussions about attributes of N, such as "cn=King of England" vs. "cn=King of Great Britain" |
Okay. I think Joe's given a succinct articulation of a position on the proper scope of DIDs. Thank you, Joe. I love the crispness. I would like to ask for two things to resolve this issue:
Before we poll the group, however, I would like to offer an alternative formulation to Joe's. I don't know if I can be as crisp as he was, but I'm going to try. Going into this, let me acknowledge that the following is heresy, according to the spec; I'm only articulating it because I wonder if we're missing an opportunity here, if we could let go of tightly held notions a little. Here's the alternative worldview: Lots of identifier schemes already exist. They have various properties. DIDs are unique in that they accomplish ALL of the following goals simultaneously:
UUIDs accomplish goal 1, but not goal 2 or 3. A given UUID can mean anything, to anybody. Fred can create it, and Jill can repurpose it. They can argue about who's right, or whether they're both right. There is no strong binding to anything in particular. Most decentralized identifiers (e.g., the names of newly born children) are similar. IP addresses accomplish goal 2, and sometimes goal 3, but not goal 1. Most centralized systems (twitter handles, phone numbers, domain names) are similar. DIDs accomplish goal 1 in lots of clever ways that I won't go into here. DIDs accomplish goal 2 in one of these ways: a. They use cryptography to bind the identifier to a controller. The controller then defines what the identifier refers to. This was the original use case for DIDs, and the one we've thought about the most. b. They define some other intrinsic property that is objectively observable, that derives the value of the identifier, such that it is impossible for the binding to be ambiguous. A DID that identifies each element in the periodic table by its atomic number would eliminate ambiguity without having cryptographic control, while still remaining decentralized, and while still being enough of a DID to be processed by DID handlers. Notice that in this formulation, cryptographic control is a means to an end (eliminating ambiguity), not an end in and of itself. Notice also that cryptographic control is just a special case of the other approach (objectively observable property that makes the binding unambiguous). I think that's the crux of the difference between this worldview and the other one. DIDs accomplish goal 3 through the use of the DID method extension mechanism. Now that I've articulated an alternate worldview, here's the argument I'd offer in its favor: Although the world needs control-based binding for DIDs in the worst way, it also needs the other kind of binding (which I might call inherent binding). 
Both bindings are worthy of the moniker "decentralized identifier." UUIDs are not a good alternative because they lack the solution for ambiguity. URLs are not a good alternative because they lack decentralization of domain names. If we force the conception of DIDs to be narrow, we're setting ourselves up for a situation where another type of decentralized identifier comes along that has just as much claim to the word "decentralized", but that thinks about control differently. Result = muddiness and doubt about adoption. If we bring this ugly stepchild into the DID tent and let it take a bath, I suspect it will turn out to be cute and a good family member, in time. I don't think it would take much more than 2 or 3 paragraphs to talk about "uncontrolled decentralized identifiers" in the spec; they're way simpler than the controlled variant. |
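As a rough sketch of the "inherent binding" case, the periodic-table example could look like the following. The method name `did:example-element` and the function `element_identifier` are invented for illustration; the only outside fact assumed is that atomic numbers 1 through 118 are currently assigned.

```python
def element_identifier(atomic_number: int) -> str:
    """Derive an identifier from an objectively observable property."""
    if not 1 <= atomic_number <= 118:
        raise ValueError(f"no known element with atomic number {atomic_number}")
    # "did:example-element" is a hypothetical method name, not a registered one.
    return f"did:example-element:{atomic_number}"


# Anyone can compute the identifier for carbon; no keys or registry are involved.
print(element_identifier(6))  # did:example-element:6
```

The binding between identifier and referent comes from the derivation rule alone, which is what distinguishes it from the control-based variant.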
Tagging a few people who may have opinions about this interesting conversation: @peacekeeper @dlongley @msporny @burnburn @brentzundel @talltree . Please bring in others as appropriate. |
@jandrieu I think I passed over #233 (comment) while I was writing my response. And to @dhh1128 's
I think this is exactly right it was what I was trying to capture with, what we want
In other words - there is a missing piece to the puzzle. DIDs are not necessarily up to the task, unless there is some tweaking to the spec - some core, fundamental tweaking and clarity. So far, attempts to discuss the missing puzzle piece get blocked by discussion of controllers, subjects, and very obtuse technical issues. Those discussions have cut off the forest and the larger view has been lost. I like the idea of "bringing it into the DID tent, and giving it a bath" |
DIDs don't solve #2. In fact, I don't think #2 is possible in any construction. We can only clarify the DID, and when we refer to the DID we can use an unambiguous string of characters. However, any statements can get attached to that identifier, by any author, and there is no way to know--at the DID level--which statement is "correct". Even if one of the statements is signed by the Controller, you can't be certain that it is "correct". Heck, you can't even prove the controller is the Subject. What you are bumping up against is essentially Goedel's incompleteness theorem. You can't disambiguate everything. There will always be statements that cannot be proven, no matter how convoluted our schemes may be. All we can do is anchor assertions by specific issuers to understand (and document) what they are willing to assert about a Subject, as identified by a DID. Statements about the same DID can be taken to be intended as statements about the same Subject, but even then the statements themselves may be wrong. Content-based hashes of arbitrary content are NOT DIDs because they cannot be resolved directly to some form of cryptographic material. You could, of course, create an IPFS DID Document and have a DID method that uses its content-based address, but that hash is of the DID Document, not of the resource. IMO, if we are going to get closure on this spec, we need to stop trying to add everything that seems like it might be convenient, and we need to stop trying to construct crazy edge cases--ESPECIALLY if you have no use cases for it (as you put it @dhh). Maybe others with more experience in standards development can chime in. I know that VCs almost didn't get done because of mid-process shifts to support ZKPs. The consensus was that was a good thing. But it still risked not finishing within the required deadline. Kitchen sink engineering a solution that solves everyone's problems is, IMO, an anti-pattern in a standardization process. 
We need to be here locking down the simplest feature set for maximum interoperability to do the fundamental thing that DIDs do: enable cryptographically robust management of identifiers without reliance on central registry entities to keep track of who controls what. EVERYTHING else is superfluous and deserves a critical evaluation about whether or not we can remove it and still achieve the fundamental requirement of this work. EVERY add-on is another lengthy drawn out debate, additional implementation complexity, and yet another point of confusion for anyone who wants to adopt the tech. So, let's stop with the add-ons and start focusing on what we can do to minimize the complexity rather than exploring how we can extend DIDs to do extra magic. If DIDs can do that magic, it is perfectly fine to add that at another layer or in the next iteration of the spec. |
@jandrieu would you then be backing this 1 - DIDs can not be used to identify digital content in a shared namespace does that seem right? |
Um... no. DIDs can identify ANYTHING. I've said this before, so I'm surprised you'd suggest I'd back that set of statements. |
My particular point here is that there are mathematical guarantees we can affirm with DIDs. That's what the cryptography gives us. Anything more than that which we can mathematically guarantee should be achieved at another level. |
@jandrieu ok, so it seems like we're stuck. It may not be possible to discuss DIDs. Either a DID subject refers to ANYTHING and NOT a name for a thing scoped by a controller. But I have a feeling that if I say that it represents ANYTHING then you will say that it is scoped by a controller. I am getting dizzy. If I were trying to describe DIDs to clients and customers (which I have stopped doing, by the way) I would need to be able to say something - if I say that "the subject is the King of England" to them without clarifying that there is a controller involved, they get the wrong idea. So I try to say "the subject is scoped by the controller" and then you say "no, I am surprised you said that" - I really am totally at a loss. A DID subject is both scoped by a controller and not scoped by a controller, and it is sometimes anything and sometimes restricted. I just don't get which set of constraints is in play - other than just not what anyone else is saying. |
Perhaps you read #2 a bit too fast? I'm not interested in proving the correctness of arbitrary statements about an identifier. I agree that anybody can claim any attributes they want about anything, and that it's not useful/desirable for DIDs to facilitate that. In fact, the example scheme I proposed explicitly precludes the association of any statements with the identifier other than existence/scope of reference (the subject). I'm saying that it's a defining characteristic of DIDs that they prove the correctness of exactly one type of statement, which is an assertion about scope of reference -- and I'm claiming that is a generalization of the variant you like, which is scope as proved by cryptographic evidence. Control is only interesting as a mechanism of achieving the real goal, which is knowing with confidence what you're talking about. Your own verbiage "Even if one of the statements is signed by the Controller" presupposes that it's possible to ascertain truth about this subtopic; signing is just the mechanism for proving that the scope of reference is what the Controller, not some other entity, asserts. I think this is exactly what you meant when you said the DID subject can't be the moon, but can be what the controller thinks of as the moon. While it is true that eliminating all ambiguity is impossible, and on a philosophical level, we can't even prove that we exist rather than being figments of one another's imaginations, I am very surprised to hear anybody claim that DIDs don't provide practical clarity about what the referent is. Elsewhere you have claimed that the referent is whatever the controller wants it to be. That's an unambiguous binding. Yes, it can change. Yes, the controller can do a lousy or inconsistent job of definition. 
But the fact remains that whatever scope of reference is embodied in the controller's choices constitutes exactly and uncontroversially the referent for a DID at a point in time, if the binding is based on cryptographic control.
I agree that bringing this up and tackling it is a tradeoff. Eric is not alone in believing that if we don't broaden our conception, important use cases are lost. But that could be the right answer, and I would accept it if it's the will of the community (even though I continue to disagree with your other argument). So I, too, am curious to hear how other people would weigh it. |
@ewelton I don't think we are stuck. We are just dealing with the fundamentals of what is knowable and what is provable. As such, we bump into issues of epistemology and Goedel's incompleteness theorem. There are bounds on what we can know and bounds on what we can prove. Any technology that purports to exceed those bounds should be considered with the same skepticism as claims of a perpetual motion machine. That said, it is a different issue how we talk to regular folks. In the same way that it is hard to explain why perpetual motion machines will never work, it will be hard to explain the boundaries of what is knowable and provable. |
@jandrieu I understand how you frame it and why you say what you are saying. But there are practical solutions to the problem @dhh1128 raised. More importantly, we just need to pick one and move forward. What you are saying is true, but I feel you are simply missing the point of what we are saying, and are convinced that this is because we fail to appreciate your point. The subject of a DID has no semantics - and, importantly, if the hash is cryptographically bound to the genesis key pair, then it CAN NOT serve the role of identifying digital content in a self-certifying manner. Instead, it can only be the "name" of a record that contains the target identifier. What we are exploring is a way to augment that environment - to make self-certifying content identifiers first-class citizens. This exploration is not about mathematical provability or Cantor's Paradise. In terms of did methods - we are starting to see 'strange methods' like did:key - which, one might argue, have a different relationship with 'controllerhood' than do blockchain-resident did methods with long running did-documents that can evolve over time and can engage in complex expressions of verification methods and service_endpoints. The option on the table is to recognize some of those differences - and instead of rage against them, decide if that variation can be co-opted and exploited. In a sense it does not matter which is chosen - as long as it is chosen soon, and precisely. There is a strong argument for disallowing this sort of "content-hash" immutable element - like |
An object in a decentralized network needs an identifier. The DID name itself "Decentralized Identifier" suggests that there should be room to include a solution in the DID spec. |
... and, for the record, semantic objects should never be governed in a decentralized network. That is why schema.org, etc. are open-access and free of governance. If semantics are governed they simply won't be adopted. |
I may be missing the point. I certainly don't understand what @dhh is trying to get at with disambiguating. But I also don't understand your previous comments. We can talk about DIDs and your suggestion that I would support those three items you listed made it seem like you didn't understand my point. If you do, great. I really don't understand how #2 is accomplished, in any identification architecture.
@dhh later expands that to
I'm still not following. The referent is not scoped by the DID. Rather, a link to a certain set of cryptographic material is provided by a DID Method after resolution. That's it. That's what DIDs do. Resolve a DID and you'll get some cryptographic material that can be used to interact securely with "The Subject", whatever/whoever that is. Maybe it is the controller. Maybe it is not. It isn't well scoped at all. It can even change over time. It is completely ambiguous what it refers to. The only DID that resolves ambiguity is this hypothetical did:immutable. Which doesn't seem like a DID at all to me. So, yes, you can change the definition of DIDs to add something like did:immutable. But you can't say DIDs have a primary function of removing ambiguity--and then use that to justify an argument FOR did:immutable--because no other DIDs do that. Don't get me wrong: immutable ids are cool. iid:[hashtype][hash] seems like a reasonable thing to standardize. github.com/w3c-ccg/multihash seems like it's half-way there. I just don't think that's a DID in any sense that this community has been working on. Maybe I am missing something. In any case, I'm definitely not following the logic on how did:immutable and its kin are anything like other DIDs. Also... I'm not raging. I'm just disagreeing. DIDs are a thing. They aren't everything. They don't solve all the identifier problems. They are not the right identifier for every kind of thing that might need an identifier. They are a particular type of identifier that might be useful for certain things. Their key distinction is the ability to find the current authoritative cryptographic material for interacting with the Subject of the DID. Before DIDs, there was not a particularly good way to find such material, not in any definitive way, without reliance on a third party. PGP's web of trust was the best prior art in this area. DIDs are a huge advancement in the usability of cryptography for a large number of use cases. 
It would be great if we could just focus on getting this fundamental innovation in the books, so we can turn our attention to building the amazing services on top of DIDs that so many of us are excited about. |
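For readers wondering what an `iid:[hashtype][hash]` string might look like, here is one possible sketch. It assumes a multihash-style prefix (function code `0x12` for sha2-256 followed by the digest length in bytes) and hex encoding; the `iid:` scheme itself is only the hypothetical floated in the comment above, and `immutable_id` is a made-up helper name.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def immutable_id(content: bytes) -> str:
    """Build a hypothetical iid: string as hash type, digest length, digest."""
    digest = hashlib.sha256(content).digest()
    # Prefix the digest with its hash-function code and its length (32 bytes).
    multihash = bytes([SHA2_256_CODE, len(digest)]) + digest
    # Hex is an arbitrary encoding choice for this sketch.
    return "iid:" + multihash.hex()


identifier = immutable_id(b"abc")
print(identifier[:8])  # iid:1220
```

There is no resolution step and no key material in this sketch, which is the reason given above for treating it as something other than a DID.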
The name DID should really be DEI (Decentralized Entity Identifier). DID suggests that you can identify anything in a decentralized network. If an object identifier cannot be accommodated, the name DID is misleading which is a shame. We would also have to build out an entirely new standard for a DOI (Decentralized Object Identifier) which of course can be done. In an ideal world of DIDs for everything in a decentralized network, you would have Which way are we going to go? |
@jandrieu : I think we are talking past each other because we are talking about different manifestations of ambiguity -- and it might be because of my own clumsy language. If so, I apologize. Let me try again. And let me step away from DIDs for a minute; maybe a different context will help. Suppose, one day, that Alice invents a brand new word: "habapookajar." She's at a party, and she applies it as an adjective to a person wearing expensive Italian clothes. Those who overhear her are pretty sure it means something sort of like "sophisticated" -- but they're not quite sure. Her meaning is ambiguous. Even if they ask Alice what she means, there's no guarantee she'll tell them the truth, or be able to give them a definition that perfectly embodies her intentions. This is ambiguity, and I believe we're in alignment in suggesting that it's fundamentally unresolvable. Let's call that "type 1" ambiguity for a moment. But at least we know who's the definitive authority on the meaning: Alice. Whatever she says it means, we have to accept. There's no ambiguity about that, right? Or is there? Suppose there's another party a week later, and Bob is overheard using this word. Someone asks him if he got it from Alice, and he says "No, I invented it. Who's Alice?" Although all ambiguity has things in common, this new ambiguity feels like it's worth putting into a second bucket. Let's call it "type 2" ambiguity. This is not ambiguity about what the word means; it's ambiguity about how to approach learning the word's meaning; we don't even know where to start. No identity systems can resolve type 1 ambiguity. A centralized system resolves type 2 ambiguity because the system is the acknowledged authority on the question of what the identifier refers to. That doesn't make the identifier's meaning perfectly clear (nothing can) -- but it removes any ambiguity about how to learn more. 
But type 2 ambiguity has always been a big problem in decentralized systems, because there is no such authority. Part of the genius of DIDs is that they solve this problem. That's a hugely valuable innovation. We've explained that innovation in terms of cryptographic control, and if we choose to, we can continue to explain it that way. We can say that the problem is proving control, and the solution is cryptography. But what I'm suggesting is that we can define the problem in a slightly more general way, and that this might have nice consequences. It would be a tradeoff, as you say. Old problem statement: How do I prove control of the identifier? New problem statement: How do I eliminate type 2 ambiguity? I admit that this new formulation is a departure from the official party line. The arguments in favor of it that I'd offer are:
I don't think these three arguments are a slam dunk argument in favor of what I'm proposing. But I'm hoping that at least my worldview and my comments about ambiguity make better sense? |
Spelt out, the two options are |
Thanks, @ewelton . That is a sound argument but, going back to my original argument of DIDs for everything in a decentralised network which allows us to move into a synergistic future with better naming conventions and smarter identifiers, I'm keen to keep investigating. @kdenhartog - Are you able to answer Eric's first question ... 1.) How is control surrender enforced? @mitfik - Are you able to answer Eric's second question ... 2.) How does resolution work (e.g. what is the relationship w/ an underlying registry)? Let's hammer that out before coming back to method naming. Just so I don't have to scroll back later on, can someone also give me a definitive answer on whether a DID method type should depict a function or a target? Thanks. |
We've said a lot of words here. I have tried to keep this brief (and failed). HOWEVER, I am responding with a different illustration of what I see as the defining mismatch between content-based identifiers and DIDs. This thread has shifted my sense of how we communicate what a DID is. Regardless of whether we adopt this new kind of DID as something we, as a standards effort, want to incorporate, we should definitely update the language in the spec so the mismatch can be minimized for future readers. People have a hard time understanding how DIDs do what they do, which is vital for judging whether they are appropriate for a given reader's needs. However the technical questions resolve, we definitely have a documentation problem. Here's what clicked for me as I was trying to understand how we are talking past each other. DIDs are a framework for cryptographically proving control over an identifier without relying on a trusted third party. This is what's new. This is what's different. This proposal to "nuance" our mental model abandons that and would create a new class of DID which is essentially uninteroperable with other DIDs. I'll call these CIDs for content identifiers, which have all the characteristics described by others. As I've stated many times, they sound awesome. They will be useful. It makes sense to standardize a way to use them. Consider the use cases document: First, two of the first four essential characteristics of DIDs are not met by CIDs:
#3 is not met because the hash provides NO way to demonstrate control. It only demonstrates knowledge of the associated content. #4 is not met because there is no derivable meta-data about the identifier. A CID has no mechanism to lead you to additional details that would allow the core functionality that defines DIDs. In particular, there is no way to bootstrap a control framework just from a hash. Maybe I'm missing something on #4, but to my understanding, revealed knowledge cannot establish control in the way that secret knowledge can. If you must reveal the knowledge to satisfy the cryptography, as you do with hashes, you cannot prove anything cryptographically without ceding equivalent control to the recipient of the proof. It leaks control and therefore isn't suitable as a control framework. Second, of the 13 actions enabled by DIDs, only the first two are supported by CIDs:
CIDs can't be used to Authenticate, Sign, Resolve, Dereference, Verify Signature, Rotate, Modify Service Endpoint, Forward / Migrate, Recover, Audit, or Deactivate. Third, the reason DIDs are useful in decentralized identity is precisely because of the ability to demonstrate control. Not because they identify only a particular class of thing or because they can disambiguate anything. (FWIW, even @dhh's second definition of disambiguate wrt Alice's definition is unknowable and unprovable. Because people other than Alice can use the DID as a subject without getting confirmation from Alice that they are using it in the way that she means it. And even if they did, there is still the risk of semantic drift as Alice's sense of what she means evolves over time.) The way DIDs bootstrap digital identity, in the most typical use case where Subject==Holder==Controller (whether or not the issuer is identified by DID) is as follows: Two stages. First, you get the credential.
Second, you use the credential.
At this point, the Verifier knows that the current presenter of the VC has proven control over the same secret information as the subject, and therefore, with a specific level of assurance they can accept that the current presenter is one of the following:
We always have to allow for #3. That's the weakness in the system. However, the entirety of modern cryptography has this weakness, which is why keys MUST be kept secret if they are to have any use whatsoever. It is the ability to perform this proof of control that ties the issuance of a VC to its presentation so that a Verifier can have some proof that the party presenting the credential is, in fact, the entity given that credential, which to the best knowledge of the issuer was believed to be the subject of that credential. You could, of course, use a third party to demonstrate proof of control. You just ask Facebook who they believe is the current presenter. They'll use their own authentication approach then present their result. The whole point of DIDs is to enable this sort of bootstrapping of verifiability WITHOUT relying on the likes of Facebook. That's what makes DIDs unique and valuable. CIDs can't be used in this fashion. As such, they just don't do--CAN'T DO--the fundamental thing that DIDs were created to do. Yes, we can attempt to interpret the "decentralized" part of the DID name in the hope of supporting all the kinds of identifiers that can be rigorously created without a trusted third party, but, when we can't even agree on the meaning of the word "decentralized", that seems like a particular kind of madness. No offense to @dhh1128 @pknowl @ewelton or any other proponents of this idea. It's just that shoehorning an incompatible, non-interoperable notion of DIDs because of lexical similarity with an ill-defined term just doesn't stack up for me. That said, I do like CIDs. They have been implemented as URNs in several forms from urn:hash to urn:sha. The particular variation proposed here might deserve its own namespace, such as urn:cid or perhaps if it builds on multihash, urn:multihash. However, since
I can't help but come to the conclusion that CIDs are not DIDs. If it doesn't look like a duck and doesn't quack like a duck, it's probably not a duck. It might be a bird. It might taste delightful when prepared in the Peking style, but it still probably isn't a duck. |
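The point above about revealed versus secret knowledge can be illustrated with a minimal sketch. It assumes a content identifier is simply the hex SHA-256 digest of the content; the names `content_id` and `proves_knowledge` are hypothetical and not taken from any DID method or spec.

```python
import hashlib


def content_id(content: bytes) -> str:
    """Hypothetical content identifier: the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(content).hexdigest()


def proves_knowledge(identifier: str, revealed: bytes) -> bool:
    """The only 'proof' a bare hash supports: reveal the preimage."""
    return content_id(revealed) == identifier


content = b"example content"
identifier = content_id(content)

# The prover convinces the verifier by handing over the content itself.
prover_ok = proves_knowledge(identifier, content)

# The verifier now holds the same bytes and can pass the identical check
# in front of anyone else, so nothing distinguishes it from the prover.
verifier_copy = bytes(content)
verifier_ok = proves_knowledge(identifier, verifier_copy)
```

Both checks succeed, which is the sense in which a hash demonstrates knowledge of the content rather than control of the identifier. A key-based proof differs because the secret never has to leave the prover.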
There is some precedent for this. The DNS RFCs specifically exclude the .onion root domain (and a few others) from fully complying with the DNS standard. See specifically https://tools.ietf.org/html/rfc7686 -- Christopher Allen |
I'm sorry, is the proposal here to have a For example
Is that what you're suggesting @pknowl? |
@jandrieu I want to clarify - I am not a proponent of adding content-based identifiers into the current model of DIDs. This is because of the two reasons I enumerated - lack of solution to resolution, and no way to fully "surrender control" - and "reproducing" simple urns but calling them DIDs is silly - and besides did:o:sha:123 doesn't assist resolution at all, because it is missing location information. One of the mistakes made in the DID model is the strange handling of resolution - DIDs contain some location information but rely on a bunch of secret hidden magic to make them resolvable. Resolution is critical, and leaving it out of scope is just part of what I consider "a long series of mistakes" beginning around mid 2019. Current DIDs have become defined the way you define them as the result of evolution of the community. DIDs were more open to flexibility and interpretation in the past. Alternative approaches to DIDs lost out in the sea of privacy, control, and decentralization voices - and that is fine. The rubric idea became myopically focused on decentralization, so we lost most of the structure for navigating the alternatives. The use cases became focused on what I consider a niche world. The collapse of semantic flexibility meant we got onto the road of "the one true DID." So, to be clear - I believe that there are legitimate use cases for these sorts of "non-controlled" and "verifiable" content-based identifiers. And I believe that 1 year ago would have been a great time to sweep them into DIDland so that we could build them into the resolution infrastructure. And I believe that the flexible semantics we had 1 year ago gave a very clean path to model this larger landscape inclusively and to the benefit of the global community. However, as of today, DIDs are more focused - they are a much more specific thing, and that means that a spec will be produced and we'll get some nifty tools out. 
It also means that I think that getting these sorts of capabilities into the DID landscape, for the goals @pknowl identifies, might not be viable today - the window has closed and it is time to work with the DIDs we have, not the DIDs we want. Maybe there is a way to shoehorn them into the authoritative model of DIDness, but it will take a cleverer person than me to do it. Don't get me wrong - there has been a lot of great work and thought behind DIDs-of-today - but DIDs are neither revealed truth nor natural law, they are the result of a negotiated specification that reflects the loudest and most energetic voices. Since those have focused on privacy paternalism, control, anti-correlation, and a particular interpretation of decentralization - that is what we have. I am excited to see a lot of the work that is going on, but these DIDs are just not that relevant to my use cases - there are alternatives which I can use today to deliver "improved-sovereignty" and "improved government and business processes" through the use of non-DID grounded credentials and capabilities. When DIDs are mature and in broad adoption, it will be easy to incorporate them into my world and further improve sovereignty - and I am looking forward to that. What makes DIDs strong for some people makes them weak for others - and that is normal. What is most important is that the spec stabilizes and is released. There is always room for adaptation in the next round of specs, and via alternative specs - so I support this effort to the extent that it does not derail or retard the delivery of a clear specification - whatever it winds up saying. |
Many thanks for pointing me to that link, @ChristopherA . Very much appreciated. @jandrieu - For our purposes, we're not interested in location, we just need to know that the content is immutable. Perhaps resolution characteristics and MIME-type would be held in the associated DID document. I would expect the
For example, if a non-governed object were moved from Drive A to Drive B, the identifier should remain the same even though the location has changed. @mitfik will certainly have some deeper insight into requirements and resolution. |
@ewelton - I'm also acutely aware that if we get the naming convention right at this stage for non-governed objects, the Semantics side of the model would remain stable despite the release of future versions of the DID specification. This is just as much about sustainability to the network going forward as it is to non-governed objects requiring a stable identifier under the DID umbrella. |
Actually, the precedent of allowing for some “special purpose domains” that do not need to fully adhere to the DNS RFCs is described more fully in Section 3 of RFC 6761. https://tools.ietf.org/html/rfc6761#section-3 The .onion domain RFC https://tools.ietf.org/html/rfc7686 describes more why this top level domain meets the I’d like to suggest that we support a similar carve out (like in RFC 6761) for how to register a “special purpose method”, but specifically do not add to our agenda to tackle specifying the nature of any such method. This allows the For could begin with registering those methods that don’t support full CRUD by marking them as “special purpose method” in the registry, and each method only has to show why it qualifies as such a method. -- Christopher Allen |
@ChristopherA That does seem like a particularly useful way of sorting out some of the "stranger" methods, and perhaps keeping the door open a crack for at least playing around with novel ideas. If some of those ideas catch hold, they could make it into a future version of the spec itself - but they do not have to challenge the progress achieved by focusing DIDs, and they do not need to distract by requiring additions to the use cases. +1 ! |
I agree with your comment. Just wanted to point out that there's an interesting difference between |
I've gone quiet on this long thread that I started, but I wanted to say thank you to all the smart people who chimed in. Re. the final pair of comments from @kdenhartog and @peacekeeper : yes to the distinction Markus was trying to highlight. When you have a property that is objectively observable as the basis of an identifier, and everybody knows what property to look for, then you have the interesting phenomenon that multiple observers will automatically be led to agree on the identifier for the object -- even for new objects not yet discovered. This has some very desirable benefits in a decentralized ecosystem. Perhaps Joe is right that this doesn't belong inside the DID umbrella; I'm content to let consensus rule, but just wanted to make the strongest case I could for it. As the original opener of the issue, I am happy enough with the ensuing discussion to let it be closed now. But we can also keep it open longer if procedure or the preferences of others pushes us that way. |
I think for those who would like to update the mental model in ways that have been discussed in this thread, a concrete next step would be to:
|
@peacekeeper A DID using this method-to-be-named would still have a definition of the Create operation, no? It's just that the Create operation in the DID method spec would describe the special way in which DIDs using this method are created. RE naming, I thought the original proposal was for DIDs using this method to use the multihash format. If so, why not just call it |
@talltree I'm keen to name this method type The other argument for sticking with the "O" method type is that there will be a huge number of these identifiers woven into the fabric of the decentralized network. 50% of all identifiers (i.e. anything non-governed within the data capture side of the model) will contain this method type. To help people digest, adopt and ultimately scale this new identifier type, users could simply refer to them as "DID-Os". |
+1 to I think this is another interesting aspect in this thread. Almost all DID methods I am aware of don't restrict what is being identified. This one seems to have such a restriction, i.e. it can only identify what can be hashed. |
@peacekeeper I suppose the method name should reflect how the community sees the DID space evolving. I, for one, hope that the argument for the development of We have a rare opportunity to name the object identifier correctly right off the bat whilst hinting at an elegant DID syntax evolution for the future. Why wait for governed identifiers to align to the methodology. If the identifier name is set to If I'm missing something and |
@mitfik has just messaged me saying that he has a feeling that a non-governed object identifier may need to contain more than just a simple 'multihash'. On that note, I propose that the community hold off on a casting vote until the tech guys have had a chance to further investigate what identifier characteristics should be included. |
This is critical as I see it, because it is the presence of a controller that defines the semantic space within which the identified exists. I see that as a key strength of controlled DIDs. When you and I talk about the same thing using different DIDs, the only way we can coordinate is by presenting evidence from attached and found information - external claims, credentials, and the like which are linked to the controlled document. That is very valuable, however.... The reason these were of interest was that, like This is useful, for example, when pointing to a credential schema or context or other primitive from which one scaffolds deterministic processing in a decentralized data economy - it provides an "open authority" without simply using DIDs to create "a new root of central authority." I find the concept of a Bitcoin Anchored Semantic every bit as Centrally Controlled as schema.org. Hashlinks give us a lot of the power needed - and in particular they give us the thing that is missing from simply using I also remain concerned about the maintenance of hidden control - the 'create' method would effectively be a 'register' method - but register it in what infrastructure? - which gets, again, to resolution. And it is the infrastructure of the registry which defines the possibility of true "surrender of control" vs. "good samaritan waiving" - I think it makes sense to wait to name this concept until those elements are clear:
if we can not do these, then we have defined something equivalent to regular DIDs with a claim "this DID that I control is about urn:multihash:1234" - and those DIDs are fine, but they can not be the foundation for scaffolding semantic processing on a decentralized data economy - for that we need a decentralized identifier with broader capabilities than DIDs. |
I'd say there are probably a few things we could take from this thread as additions to the DID core spec. Some of the arguments against this method have pointed to a few things that are left as tribal knowledge, and I'm wondering if we could get normative, testable statements for them. For example, I felt one of @jandrieu's points was pretty strong: on creation of a DID it SHOULD (could be upgraded to MUST) be possible to prove limited control of the identifier via a cryptographic mechanism. Another one I've been toying around with is the idea of a minimum number of possible namespace entries. E.g. the method-specific identifier must be able to identify at least 2^80 unique identifiers. I'm not sure this really adds much enforcement to the idea of the identifier not needing an authority to authorize access to the namespace. I also like @ewelton's point about adding at least non-normative statements, and normative statements if possible, around surrendering control, because I feel that was part of the crux of what makes this possible. @peacekeeper do you have any ideas around other things that might be worth adding for this? |
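The 2^80 namespace idea above could be made testable along these lines. This is only a sketch under assumptions: it treats a method-specific identifier as a fixed-length string over a fixed alphabet, and the function name `meets_minimum_namespace` and the constant `MIN_NAMESPACE_BITS` are hypothetical, not drawn from the DID core spec.

```python
# Threshold floated in the comment above: at least 2^80 unique identifiers.
MIN_NAMESPACE_BITS = 80


def meets_minimum_namespace(alphabet_size: int, length: int,
                            min_bits: int = MIN_NAMESPACE_BITS) -> bool:
    """Return True if fixed-length strings over the alphabet can name
    at least 2**min_bits distinct identifiers (exact integer arithmetic)."""
    return alphabet_size ** length >= 2 ** min_bits


# A hex-encoded SHA-256 digest: 16 symbols, 64 characters (2^256 values).
sha256_hex_ok = meets_minimum_namespace(16, 64)

# A 10-digit decimal serial number: only 10^10 values, far below 2^80.
serial_ok = meets_minimum_namespace(10, 10)
```

Such a check only bounds the size of the namespace. As the comment notes, it would not by itself show that no authority is needed to allocate identifiers within it.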
It's surrender at the point of creation by the intrinsic nature of the method. In other words, control of the knowledge is all that's necessary to create the method. Representation and proof of control is unnecessary after creation, just as it's unnecessary after all keys have been revoked in all other methods. |
I hope not, that makes the method name even more likely to centralize around a naming authority. |
It looks like the author of this issue feels satisfied by the discussion that occurred. Next steps for this can go one of two ways (potentially both) I would guess. @mitfik @pknowl and I can draft a strawman did method to explore what these immutable, surrender control on creation dids would look like, or we can begin to propose language to constrain what did methods are possible. Any opinions on which way to go? |
Thanks, @kdenhartog . I believe this is now in the capable hands of @mitfik and a couple others in the HCF tech group to start working on a strawman/draft spec. The workload has suddenly gone through the roof at this end which is why this stream has slowed down. That said, I think we have everything we need for now. |
I propose we close this issue then since the did method can be shared via the did method registry. Any objections? |
No activity since marked pending close, closing. |
PR #213 has generated an interesting comment stream, and I think some useful clarity. I am happy to have multiple smart people agree in writing to the concept that a DID can identify anything, because this flexibility seemed to have been excluded by some verbiage I was hearing.
Now I'd like to explore a subtlety around the concept of control. I will frame this in terms of a use case that I'm familiar with in cybersecurity and malware research, but I think you'll quickly see how it might apply to use cases brought up by others.
Malware researchers typically identify malware (viruses, worms, infected or malicious files) by a sha256 hash. The first time a particular sample is seen in the wild, a researcher hashes the sample and goes to virustotal.com or some similar site to see if anybody else has seen it before. If no, the sample is uploaded to the site's DB for all the world to look at. If it is already known, then the researcher has just made a second (or a third, or a tenth) independent discovery.
Now, suppose I wrote a DID method that was all about identifying malware with DIDs. The logical identifier format would be did:mymethod:hash-of-sample. With me so far?
Okay, now what are the control semantics?
What I have heard so far is that DIDs are always created by a controller, who can then (even in the genesis DID doc) choose to retain control or give it away (e.g., by specifying no control after the creation transaction). This makes sense for many situations.
However, that doesn't quite fit this scenario, because A) the researcher who reports the malware is never, at any time, in a "control" relationship with the sample's identifier, and would not want to be considered so; B) the identifier cannot have control semantics, even at its genesis transaction, because its derivation mechanism disallows it; C) the identifier doesn't have a DID doc. What's being identified here is content that exists, that is explicitly uncontrollable to begin with. Anybody who discovers the content will discover the same identifier. Two researchers could register the same content on two different systems of record and both would be equally valid and not in conflict.
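As a rough sketch of the scenario above, the following shows two researchers independently deriving the same identifier from the same sample. The method name `mymethod` comes from the example in this issue; the helper `did_for_sample` and the choice of hex encoding for the digest are assumptions made for illustration.

```python
import hashlib


def did_for_sample(sample: bytes, method: str = "mymethod") -> str:
    """Derive a hypothetical did:<method>:<sha256-hex> identifier from content."""
    digest = hashlib.sha256(sample).hexdigest()
    return f"did:{method}:{digest}"


sample = b"bytes of a malware sample seen in the wild"

# Two researchers hash the same bytes on different machines, at different
# times, with no coordination and no registration step.
did_from_researcher_a = did_for_sample(sample)
did_from_researcher_b = did_for_sample(bytes(sample))

same_identifier = did_from_researcher_a == did_from_researcher_b
```

No key material or creation transaction appears anywhere in the derivation, which is why neither researcher ends up in a control relationship with the identifier.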
So my question is this:
Would we be comfortable saying that DIDs can be used to identify such things, too? And if yes (which I hope is an easy answer), are we willing to not describe such a scenario as "the controller creates the DID" but rather "the DID identifies something inherently uncontrollable, so it never has a controller, even during creation; rather, it has a discoverer" (or something to that effect)?