(WIP) records + merkledag specs #7

Merged
merged 7 commits into from
Jun 27, 2015

jbenet
Member

@jbenet jbenet commented May 29, 2015

includes also keychain types.

this is all very much WIP

### Serialized Format


(TODO remove this? use only protobuf?)
Member

I think we are okay using only protobuf. since we version tag the repos and the clients and protocols, we can write in logic to handle multicodec OR protobuf later if for some reason we have to switch away from protobuf due to it being proven to be stealing money from the poor, or some other terrible thing.

Member Author

i'm increasingly dissatisfied by the annoyances of protobuf's shortcomings. have to trick it into doing things, like self description or streams.

@jbenet are you familiar with EDN? It handles self-description pretty elegantly, and has a few binary serialization formats.

Member Author

@greglook not yet! but it's made by @richhickey so it's likely exactly what i want. thanks for the pointer!

Member Author

@greglook ah no, this is a text format. it's not optimized for binary rep. (unless i'm missing a page)

@jbenet see Datomic/fressian and cognitect/transit-format for binary representations. The latter is not strictly EDN, but includes the same core ideas. Also this thread for a comparison among them.

Member Author

great, thanks @greglook! though the farther this moves away from strict 1:1 mapping to JSON, the more of an adoption hurdle it is

@whyrusleeping
Member

@jbenet is there anything on telling which of two records is the 'most valid'?

var ProofOfWork = "proof-of-work"

// ProofOfStorage proves certain data is possessed by prover.
var ProofOfStorage = "proof-of-storage"
Contributor

The distinction between Proof of Storage and Proof of Retrievability wasn't immediately obvious to me. After searching a bit I feel like I now have a better idea for how Proof of Storage works, but linking out to whatever you consider canonical docs or useful review articles for each proof type would be nice.

Member

I also wonder where these proofs are used

Member Author

> but linking out to whatever you consider canonical docs or useful review articles for each proof type would be nice.

Yeah this is super raw and early. almost removed this section.

> I also wonder where these proofs are used

  • this part (the proof types) is very very WIP. just a thought.
  • the idea is to have "proof objects" that we can point to, and have their type tell us how to process them.
  • a proof of storage could potentially be used in a provider record, to prevent spam. and they would be used in other things, like filecoin.
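
A minimal sketch of the "proof objects" idea from the bullets above: an object carries a type, and the type selects how it gets processed. Only the `ProofOfWork` and `ProofOfStorage` names come from the draft; the `Proof` struct, the `verifiers` registry, and the toy hash-prefix check are assumptions made up for illustration, not anything the spec defines.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Type names as they appear in the draft.
var ProofOfWork = "proof-of-work"
var ProofOfStorage = "proof-of-storage"

// Proof is a hypothetical proof object: Type tells the reader how to
// interpret Challenge and Data.
type Proof struct {
	Type      string
	Challenge []byte
	Data      []byte
}

// Verifier checks one kind of proof.
type Verifier func(p Proof) error

// verifiers maps a proof type to its processing logic (hypothetical registry).
// ProofOfStorage is deliberately left unregistered in this sketch.
var verifiers = map[string]Verifier{
	ProofOfWork: verifyWork,
}

// verifyWork is a toy check, not a real scheme: sha256(Challenge || Data)
// must start with a zero byte.
func verifyWork(p Proof) error {
	sum := sha256.Sum256(append(append([]byte{}, p.Challenge...), p.Data...))
	if sum[0] != 0 {
		return errors.New("insufficient work")
	}
	return nil
}

// Verify dispatches on the proof's type.
func Verify(p Proof) error {
	v, ok := verifiers[p.Type]
	if !ok {
		return fmt.Errorf("unknown proof type %q", p.Type)
	}
	return v(p)
}

// Solve brute-forces a nonce that satisfies the toy work check.
func Solve(challenge []byte) Proof {
	for n := uint32(0); ; n++ {
		nonce := []byte{byte(n), byte(n >> 8), byte(n >> 16), byte(n >> 24)}
		p := Proof{Type: ProofOfWork, Challenge: challenge, Data: nonce}
		if verifyWork(p) == nil {
			return p
		}
	}
}

func main() {
	p := Solve([]byte("provider-record"))
	fmt.Println("work proof accepted:", Verify(p) == nil)
	fmt.Println("storage proof:", Verify(Proof{Type: ProofOfStorage}))
}
```

A provider record could then point at such an object, and a node would refuse records whose proof type it cannot process or whose proof fails.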

@jbenet jbenet mentioned this pull request May 29, 2015
// Order is a function that sorts two records based on validity.
// This means that one record should be preferred over the other.
// there must be a total order. if return is 0, then a == b.
// Return value is -1, 0, 1.
Contributor

Why restrict the return values? I'd recommend consumers just use > 0 instead of == 1 when acting on this information.
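
A small sketch of what this suggestion looks like in practice. The `Record` type here is a hypothetical stand-in with a sequence number, and the direction of the comparison (positive means the first argument is preferred) is an assumption; the draft only says the result is -1, 0, or 1. The consumer tests the sign rather than comparing against 1, so it keeps working if an `Order` implementation ever returns other positive values.

```go
package main

import (
	"bytes"
	"fmt"
)

// Record is a hypothetical stand-in; the real record layout is still
// under discussion in this PR.
type Record struct {
	Seq   uint64
	Value []byte
}

// Order returns a positive number if a should be preferred over b,
// a negative number if b should be preferred, and 0 only if a == b.
// Ties on Seq are broken by Value so that the order stays total.
func Order(a, b Record) int {
	switch {
	case a.Seq > b.Seq:
		return 1
	case a.Seq < b.Seq:
		return -1
	}
	return bytes.Compare(a.Value, b.Value)
}

// Best returns the most preferred record. It only looks at the sign of
// Order's result. rs must be non-empty.
func Best(rs []Record) Record {
	best := rs[0]
	for _, r := range rs[1:] {
		if Order(r, best) > 0 {
			best = r
		}
	}
	return best
}

func main() {
	rs := []Record{
		{Seq: 1, Value: []byte("a")},
		{Seq: 3, Value: []byte("b")},
		{Seq: 2, Value: []byte("c")},
	}
	fmt.Println(Best(rs).Seq)
}
```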

@jbenet
Member Author

jbenet commented May 29, 2015

this should help describe what i mean in the keychain stuff:

@jbenet
Member Author

jbenet commented May 29, 2015

(yay diagrams. if anyone knows a good web based programmatic + drag and drop diagram tool, lmk. (most of them suck))

@whyrusleeping
Member

@jbenet for the records, you're thinking this, right? https://gist.github.com/whyrusleeping/8f2c206ac2fbc952fea2

@whyrusleeping
Member

@jbenet re: a drawing tool, i use http://draw.io, it's pretty nice

@jbenet
Member Author

jbenet commented Jun 3, 2015

type Record struct {
  Scheme Link // link to the validity scheme
  Signature Link // link to a cryptographic signature over the rest of record
  Value Data // an opaque value
Member

Here value is data, and above it is a link. Above we have data for the validity information, but here there is no place for that to go. Something is wonky here. If i have my way, it would look like:

{
    "Data": "validity data, to be interpreted after parsing the scheme link",
    "Links":[
        {
            "Name":"Scheme",
            "Hash": "pointer to schema definition object (really just a placeholder)"
        },
        {
            "Name":"Signature",
            "Hash": "pointer to cryptographic signature"
        },
        {
            "Name":"Value",
            "Hash": "pointer to records actual content"
        }
    ]
}

Member

and below, where we have the date time thing, that info would just be another link.

I was mistaken. All data associated with the validity should be put in the Data segment of the record node. (at least in my view of actually implementing this thing)

Member

Another thing, is the signature separate from the validity data? I'm okay with that, just want to make the distinction clear.

Contributor

On Wed, Jun 03, 2015 at 06:46:24PM -0700, Jeromy Johnson wrote:

> +type Record struct {
> +  Scheme Link // link to the validity scheme
> +  Signature Link // link to a cryptographic signature over the rest of record
> +  Value Data // an opaque value
>
> Another thing, is the signature separate from the validity data? I'm
> okay with that, just want to make the distinction clear.

Just to keep the vocabulary consistent, @jbenet was putting the
signature under “correctness” not “validity” (since it's a
done-right-at-craft-time issue [1]), although they both fall under the
“validity scheme” [2].

I agree with [3] that it makes sense to allow a given validity scheme
to place correctness/validity information in additional links and/or
the data block as it sees fit, so long as it doesn't clobber the
base-Record "Scheme" or "Value" names (and do those have to be
title-cased?).

As it's currently written up, the linking for a signature is going to
be weird, since what you really want signed is the record itself.
The whole “signable part” business reminds me of OpenPGP with its
“unhashed subpacket data” [4]. I played around with an OpenPGP
implementation while trying to rotate my main GnuPG key [5], and the
whole “unhashed data” business is a pain in the neck. Can we just
make signature objects first-class citizens and allow (require?) folks
to push signatures instead of records? Then the signature could link
to the record, which would carry the validity information and a link
to the record payload.

Member

question 3, is there actually going to be anything in the 'schema object' ? will it even be an object? what should it contain?

wking added a commit to wking/ipfs-specs that referenced this pull request Jun 23, 2015
…blockstore

Copy the AllKeys() method from the blockstore [1] to the datastore
[2].  You can't implement it efficiently using the existing datastore
interface, so I don't know how you'd add it on top of a generic
datastore that lacked such a method.  We don't want to encourage
drilling down through layers and using what should be internal
implementation details.

The previous paragraph explained why we need an AllKeys() method in
the datastore.  We also need to expose AllKeys() in the blockstore
interface, so we can build garbage collection and similar logic on top
of the blockstore, without having to drill down to the datastore layer
to write those tools (see, for example, the IRC discussion from [3]
through [4]).

Also remove the Key argument from the blockstore's Put().  The backing
datastore need not be content-addressable, but I think we want to
require content-addressability for the block store.  However,
multihash gives us some choices for the hash function and digest size,
so the blockstore's Put does accept those (and then it computes the
hash internally).

Besides requiring content-addressability, I'd also require the
blockstore to only store serialized Merkle objects.  That makes
deserializing the content easier, and we've worked hard to make Merkle
objects sufficiently general that they should suffice for any data we
want to put into the blockstore.

I've also tried to clarify that the exchange-server doesn't have the
potentially expensive AllKeys() method by explicitly listing the
methods it does have.  We probably also want to extend the Get(Key)
response with optional "will send" and cancel information.  See the
optimistic transmission graphic in [5] for more on this.

[1]: https://gist.github.com/jbenet/d1fedddfef85f0c4efd5#file-modules-go-L162
[2]: https://gist.github.com/jbenet/d1fedddfef85f0c4efd5#file-modules-go-L122
[3]: https://botbot.me/freenode/ipfs/2015-06-23/?msg=42683298&page=4
[4]: https://botbot.me/freenode/ipfs/2015-06-23/?msg=42688156&page=4
[5]: ipfs#7 (comment)

License: MIT
Signed-off-by: W. Trevor King <[email protected]>
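
A hedged sketch of the blockstore shape this commit message describes: `Put` takes no key and derives it from the content, and `AllKeys` is exposed so that tools such as garbage collection can be built on top. The in-memory `Blockstore` type is hypothetical, and hex-encoded SHA-256 stands in for a multihash; the commit has `Put` accept the hash function and digest size, which this sketch fixes for brevity.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Key identifies a block by its content hash (hex SHA-256 as a stand-in
// for a multihash).
type Key string

// Blockstore is a hypothetical in-memory, content-addressed store.
type Blockstore struct {
	blocks map[Key][]byte
}

func NewBlockstore() *Blockstore {
	return &Blockstore{blocks: make(map[Key][]byte)}
}

// Put stores data and returns the key computed from its content.
// Callers cannot choose the key, which keeps the store content-addressed.
func (bs *Blockstore) Put(data []byte) Key {
	sum := sha256.Sum256(data)
	k := Key(hex.EncodeToString(sum[:]))
	bs.blocks[k] = append([]byte(nil), data...)
	return k
}

// Get returns the block stored under k, if present.
func (bs *Blockstore) Get(k Key) ([]byte, bool) {
	b, ok := bs.blocks[k]
	return b, ok
}

// AllKeys lists every stored key in sorted order. It may be expensive
// on a real backend, which is why the exchange-server would not offer it.
func (bs *Blockstore) AllKeys() []Key {
	keys := make([]Key, 0, len(bs.blocks))
	for k := range bs.blocks {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	return keys
}

func main() {
	bs := NewBlockstore()
	k := bs.Put([]byte("hello"))
	bs.Put([]byte("hello")) // same content, same key
	fmt.Println(k, len(bs.AllKeys()))
}
```
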
wking added a commit to wking/ipfs-specs that referenced this pull request Jun 24, 2015
Currently go-ipfs has (in routing/dht/providers.go) in-memory handling
for the provider listing.  That works well enough, but it seems like
we'd want to store this sort of thing in the generic record store to
avoid duplicate record-store-like code.  The problem with records like
this is that they're keyed off the multihash for the provided object,
but the records themselves will be created and signed by multiple
providing nodes.  That means we can't store a single signature as the
record-store entry (which provider would sign it?).  This commit adds
a record-list object that addresses this case.

The record-list object has Merkle-links to signatures where the link
names are the IDs for the publishing nodes (e.g. the providers for the
provider-list case, or wanters in wantlists, etc.).  Each linked
signature would have a signed payload with data containing the claim.
For example:

  I, <publisher-ID>, can provide <multihash-of-provided-object>.

or

  I, <publisher-ID>, would like to hear about changes to
  <multihash-of-wanted-object>.

or some more easily machine-parsable version of similar claims.

The Merkle chain of the record would be:

  record
    <publisher-ID>: <signature-ID>
  ↓
  signature
    Key: <provider-ID>
    Signee: <providing-claim-ID>
    …
  ↓
  <providing-claim-id>
    Data: <claim-payload>

Nodes that had reason to trust each other (e.g. to not forge providing
claims or to properly curate a provider-list) wouldn't have to fetch
and verify the signed data.  Nodes that had no reason to trust each
other (e.g. fetching a provider list from an untrusted node or
receiving a new providing-claim from an untrusted node) should acquire
and check the signature before using the providing-claim for lookup or
adding it to a provider-list.  Since most nodes won't trust each
other, I expect the signature, providing-claim, public-key, etc.,
packets would be passed around in one optimistic-transmit block [1],
and probably be optimistically hosted on the same nodes that host the
provider-list itself for that purpose.

[1]: ipfs#7 (comment)

License: MIT
Signed-off-by: W. Trevor King <[email protected]>
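
The record → signature → claim chain from the commit message above could look roughly like this in Go. Everything here is an illustrative assumption: the `Claim`, `Signature`, and `RecordList` types, the use of Ed25519, and the use of the hex-encoded public key as the publisher ID. Objects are also held in memory instead of being Merkle-linked, so this shows only the verification step an untrusting node would run.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sort"
)

// Claim is the signed payload, e.g. "I, <publisher-ID>, can provide <multihash>".
type Claim struct {
	Data string
}

// Signature ties a public key to a claim.
type Signature struct {
	Key    ed25519.PublicKey
	Signee Claim
	Sig    []byte
}

// RecordList maps publisher IDs to their signatures.
type RecordList map[string]Signature

// NewPublisher creates a key pair; the ID is the hex-encoded public key.
func NewPublisher() (string, ed25519.PrivateKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return "", nil, err
	}
	return hex.EncodeToString(pub), priv, nil
}

// SignClaim signs the claim payload with the publisher's private key.
func SignClaim(priv ed25519.PrivateKey, c Claim) Signature {
	pub := priv.Public().(ed25519.PublicKey)
	return Signature{Key: pub, Signee: c, Sig: ed25519.Sign(priv, []byte(c.Data))}
}

// Verified returns, in sorted order, the publisher IDs whose entry has a
// key matching the ID and a signature that checks out against the claim.
func Verified(rl RecordList) []string {
	var ids []string
	for id, s := range rl {
		if len(s.Key) != ed25519.PublicKeySize || hex.EncodeToString(s.Key) != id {
			continue
		}
		if !ed25519.Verify(s.Key, []byte(s.Signee.Data), s.Sig) {
			continue
		}
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	id, priv, err := NewPublisher()
	if err != nil {
		panic(err)
	}
	rl := RecordList{id: SignClaim(priv, Claim{Data: "I can provide <multihash>"})}
	fmt.Println(len(Verified(rl)))
}
```

Nodes that already trust the curator of a list could skip `Verified` entirely, which matches the trusted/untrusted split described in the commit message.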
@jbenet
Member Author

jbenet commented Jun 27, 2015

I'm going to merge this and continue improving.

jbenet added a commit that referenced this pull request Jun 27, 2015
(WIP) records + merkledag specs
@jbenet jbenet merged commit e0280e3 into master Jun 27, 2015
@jbenet jbenet removed the codereview label Jun 27, 2015
@jbenet jbenet deleted the iprs branch June 27, 2015 09:51
@daviddias daviddias added the IPRS label Mar 14, 2016