Skip to content
This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

adds support for raw key instead of hashed key #64

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

rach-id
Copy link
Member

@rach-id rach-id commented Nov 28, 2021

Fixes #40.
It adds a keySize field to all SMT related structs and uses it to access values in smt.values map.

Maybe we should add more checks on the keySize so that the treeHasher, SparseMerkleProof, SparseMerkleTree etc, have all the same key size.

Or, we can keep the keySize only inside the treeHasher and no checks will be required:
Pros
Simpler codebase

Cons
keySize is more related to the trees themselves than the hasher...

Also, other checks would be needed when doing operations like Get(key) checking if the key has the correct size.

Update:
The keySize is now part of the MapStore and it is being passed around as parameter wherever needed.

@rach-id rach-id marked this pull request as draft November 28, 2021 11:38
@rach-id rach-id changed the title WIP add support for raw key instead of hashed key WIP adds support for raw key instead of hashed key Nov 28, 2021
@rach-id
Copy link
Member Author

rach-id commented Nov 28, 2021

@adlerjohn Please take a look and let me know if this is what is needed so I continue fixing the other tests etc.
Currently, only the TestSparseMerkleTreeUpdateBasic is working

@codecov-commenter
Copy link

codecov-commenter commented Dec 19, 2021

Codecov Report

Merging #64 (a0cc481) into master (5f13c0f) will increase coverage by 3.54%.
The diff coverage is 95.29%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
+ Coverage   85.62%   89.16%   +3.54%     
==========================================
  Files           6        6              
  Lines         466      480      +14     
==========================================
+ Hits          399      428      +29     
+ Misses         39       26      -13     
+ Partials       28       26       -2     
Impacted Files Coverage Δ
deepsubtree.go 71.92% <90.90%> (+10.81%) ⬆️
proofs.go 98.01% <92.59%> (-1.99%) ⬇️
smt.go 85.90% <94.73%> (+5.11%) ⬆️
mapstore.go 94.11% <100.00%> (+5.22%) ⬆️
treehasher.go 100.00% <100.00%> (ø)
utils.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6e634fe...a0cc481. Read the comment docs.

@rach-id
Copy link
Member Author

rach-id commented Jan 4, 2022

Questions:

  1. Should the ErrWrongKeySize be also a struct ? So the error message also contains the wrong key. Or no need for that ?
  2. Should we update the fuzz test or we no longer need it ? since it will always be a constant size key... What about generating tons of constant sized strings ?

@rach-id rach-id marked this pull request as ready for review January 4, 2022 13:23
@rach-id rach-id changed the title WIP adds support for raw key instead of hashed key adds support for raw key instead of hashed key Jan 4, 2022
@liamsi liamsi requested review from adlerjohn and musalbas January 4, 2022 20:12
@adlerjohn adlerjohn added the enhancement New feature or request label Jan 4, 2022
Copy link
Member

@tzdybal tzdybal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments in the code.

In general, I think that default behavior of the library should be preserved (enable users to use arbitrary keys), and raw keys support should be actually added (instead of replacing current way of operation).

This can be added as an Option, that changes the behavior of treeHasher.path method to no-op.

.gitignore Outdated Show resolved Hide resolved
helpers_test.go Outdated Show resolved Hide resolved
_, ok := sm.m[string(key)]
if ok {
delete(sm.m, string(key))
return nil
}
return &InvalidKeyError{Key: key}
}

func (sm *SimpleMap) checkKeySize(key []byte) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For SparseMerkleTree.values this function implies that all keys in SMT must have the same length. This is fine if user is going to hash keys before adding to SMT, but in more general case it's a very strong limitation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the SMT is by construction assumed to be perfect. Allowing different key sizes...doesn't really make sense? How would that even work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it should only support a fixed size of keys, no ?
The checkKeySize is not exported (If I am not wrong), it is only used internaly. What it does is verify the length of the provided key if it matches the key size or return an error otherwise. I put it in a function to avoid rewriting the same code over and over again.

Do you mean we can have any key length ? I guess that wasn't the case even before this PR, since it was using a specific hashing algorithm to hash the keys and the depth of the tree was gotten via hasher.size()...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree, that SMT require fixed key size internally. My only concern is that, previously our SMT public interface was a black-box/high-level/general-purpose-container type of thing (see README.md with _, _ = tree.Update([]byte("foo"), []byte("bar"))). This change is a complete pivot, where de-facto part of implementation is exposed to the user, and user has to be aware of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, it is a breaking change. However, it will impact the interface only at the time of creation of the map stores and verification of proofs.
For addBranch, it was a mistake from my side to put the keySize as a parameter since it should be checked against the dsmst's map store and not against the one passed in params (will make the change shortly).

Previously, I think when verifying a proof, we were passing a hasher, from which we extracted the keySize. So, we were verifying a proof against a certain hasher and a keySize. Except that the latter wasn't exposed since it was extracted from the hasher. Now that the key size doesn't relate to the hasher, it's alright to pass it as parameter.

What do you think ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual function signatures of Get, Set, Update, Delete, etc are not changed, but there is an extra "contract" added (fixed length keys). While updating SMT dependency, user has to potentially update all invocations of those methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaah, got you now 👍 makes sense

@rach-id
Copy link
Member Author

rach-id commented Jan 4, 2022

I left some comments in the code.

In general, I think that default behavior of the library should be preserved (enable users to use arbitrary keys), and raw keys support should be actually added (instead of replacing current way of operation).

This can be added as an Option, that changes the behavior of treeHasher.path method to no-op.

Thanks for the comments, will address them tomorrow.
For the functionality, I think the issue states replacing that behaviour, it's a breaking change, I believe.
Additionally, I think that the tree hasher should no longer be passed to these structures, as it is only hashing everything internally... What can be done is using a default hasher that can be changed using options, for example, but no longer be necessary.
What do you think ?

.gitignore Outdated Show resolved Hide resolved
smt.go Outdated
// This is a non-membership proof that involves showing a different leaf.
// Add the leaf data to the proof.
nonMembershipLeafData = leafData
}
}

proof := SparseMerkleProof{
return SparseMerkleProof{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change idiomatic Go? I prefer the old version personally since it's clearer what's being returned without having to parse two concepts at once. The compiler will optimize away the variable anyways, if performance was the concern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got this and similar ones flagged by a linter... Apparently, I am using a more exhaustive linter, maybe we should agree on a linter config for all the repos, and respect all the errors it is showing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. How about revert these changes for now, then we can open a separate issue (potentially also to be implemented across the org) on linting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. Should I create an issue ? or just leave it for now ?

mapstore.go Show resolved Hide resolved
@tzdybal
Copy link
Member

tzdybal commented Jan 5, 2022

Just for clarification: I totally agree, that raw keys can be very useful, I understand the logic behind this change. But IMHO it's a low hanging fruit, to be able to use SMT with both raw (advanced use, turned on by option) and hashed keys (by default).
This way, library can be more general-purpose, more developer/user friendly.

@rach-id
Copy link
Member Author

rach-id commented Jan 5, 2022

@tzdybal sounds good to have both, yes. I can start digging into it as soon as I get the green light 👍

@adlerjohn
Copy link
Member

Having the option for both modes isn't bad IMO. Having the option for raw keys is a must, beyond that extra features are gravy on top.

@rach-id
Copy link
Member Author

rach-id commented Jan 5, 2022

@adlerjohn should I get going with the implementation ? in // with the explorer stuff ?
Which one should be the default behaviour ? raw keys or hashed keys ?

Also, what do you think about these ?

Questions:

  1. Should the ErrWrongKeySize be also a struct ? So the error message also contains the wrong key. Or no need for that ?
  2. Should we update the fuzz test or we no longer need it ? since it will always be a constant size key... What about generating tons of constant sized strings ?
  3. I think that the tree hasher should no longer be passed to these structures, as it is only hashing everything internally... What can be done is using a default hasher that can be changed using options, for example, but no longer be necessary.

@rach-id
Copy link
Member Author

rach-id commented Jan 5, 2022

Stopping work on this for now. Will pick it up later.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use raw key as path instead of hash(key)
4 participants