-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rehashing support for inclusion + consistency proofs. #305
Conversation
f81a0cd
to
19a877a
Compare
c7e149d
to
badad88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've still got to have a better look at
"server/proof_fetcher_test.go" (just glanced into it, haven't really looked into the test tables or TestTreeX tests).
The other files I think I've reviewed properly, though.
return []NodeFetch{}, fmt.Errorf("invalid params ts: %d index: %d, bitlen:%d", treeSize, index, maxBitLen) | ||
func CalcInclusionProofNodeAddresses(snapshot, index, treeSize int64, maxBitLen int) ([]NodeFetch, error) { | ||
if snapshot > treeSize || index >= snapshot || index < 0 || snapshot < 1 || maxBitLen <= 0 { | ||
return []NodeFetch{}, fmt.Errorf("invalid params s: %d index: %d ts: %d, bitlen:%d", snapshot, index, treeSize, maxBitLen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: s/[]NodeFetch{}/nil
Ditto for others occurrences.
For consistency with my previous nits. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return []NodeFetch{}, fmt.Errorf("invalid params prior: %d treesize: %d, bitlen:%d", previousTreeSize, treeSize, maxBitLen) | ||
func CalcConsistencyProofNodeAddresses(snapshot1, snapshot2, treeSize int64, maxBitLen int) ([]NodeFetch, error) { | ||
if snapshot1 > snapshot2 || snapshot1 > treeSize || snapshot2 > treeSize || snapshot1 < 1 || snapshot2 < 1 || maxBitLen <= 0 { | ||
return []NodeFetch{}, fmt.Errorf("invalid params s1: %d s2: %d tss: %d, bitlen:%d", snapshot1, snapshot2, treeSize, maxBitLen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/tss/ts/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// tree state at the snapshot size differs from the size we've stored it at. The calculations | ||
// also need to take into account missing levels, see the tree diagrams in this file. | ||
// If called with snapshot equal to the tree size returns empty. Otherwise, assuming no errors, | ||
// the output of this should always be exactly one node. Either a copy of one of the nodes in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the output of this should always be exactly one node". I think it should say "at least one node".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After rehashing it's one node. Added a clarification.
// Nothing to do | ||
return []NodeFetch{}, nil | ||
} else if snapshot > treeSize { | ||
return fetches, fmt.Errorf("recomputePastSnapshot: %d does not exist for tree of size %d", snapshot, treeSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return nil, fmt.Errorf("...")
Yes, "fetches" is empty here, but it seems a bit strange to return a named variable together with an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// This is the index of the last node that actually exists in the underlying tree | ||
lastNodeAtLevel := treeSize - 1 | ||
|
||
// Work up towards, the root we may find the node we need without needing to rehash if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "Work up towards, the root we may find..." sounds a bit strange. Should it be: "Work up towards the root. We may find ... "?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah typo fixed.
return trillian.Proof{LeafIndex: leafIndex, ProofNode: r.proof}, r.proofError | ||
} | ||
|
||
// dedupAndFetchNodes() removes duplicates from the set of fetches and then passes the result to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I understood that it removes duplicates and returns a "deduped" slice, which would make the index matching at line 27 break. We should probably mention that it dedups the query, but returns duplicates in the resulting slice.
var rehashTests = []rehashTest{ | ||
{ | ||
desc: "no rehash", | ||
index: int64(126), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling to see how the index matches the leaves / tree size. Could you add a comment to clarify? (Ditto for others.)
Maybe I should take a break :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just testing that the right value gets copied in. May not be actually required as I can't remember why it's there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could remove them it would be better. They definitely confused me. (Still do tbh.)
|
||
func TestRehasher(t *testing.T) { | ||
for _, rehashTest := range rehashTests { | ||
r := newRehasher() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would prefer to run the tests against fetchNodesAndBuildProof() instead of directly using the rehasher. We're reaching inside private functions anyway, but the boundary of proof_fetcher.go seems to be fetchNodesAndBuildProof.
Ditto for TestDedupFetcher / dedupAndFetchNodes below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rehasher is notionally separate API and I did consider pulling it out but it's not a massive amount of code. Most tests do call fetchNodesAndBuildProof. The dedup stuff has gone now.
nodes, err := dedupAndFetchNodes(tx, 37, dedupTest.input) | ||
|
||
if err == nil && dedupTest.storageError != nil { | ||
t.Fatalf("%s: got nil, want error: %v", dedupTest.desc, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.Errorf and continue? Or maybe do the checks inside a switch? It doesn't seem like a mismatch here should stop other test cases.
Ditto for the Fatal below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure because it's just going to dump hex strings that don't match. I think printing a bunch of these is no extra help but willing to be convinced otherwise.
if err != nil && dedupTest.storageError == nil { | ||
t.Fatalf("%s: got error: %v, want nil", dedupTest.desc, err) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If we get a (correct) error response we're still going ahead and checking got != want, but instead we should check that the errors match and stop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was removed with the dedup code.
// are valid. There must be at least one fetch. All fetches must have the same rehash state and if | ||
// there is only one fetch then it must not be a rehash. If all checks pass then the fetches | ||
// represent one node after rehashing is completed. | ||
func checkRecomputation(fetches []NodeFetch) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. SGTM.
nodes = append(nodes, node) | ||
} | ||
|
||
for i, node := range nodes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit strange. Having a package check its own logic is one thing, assuming the storage implementation might break in some random manner is another. If we can't trust queries by ID we're in a rough place.
If you feel this adds a real benefit, please push back. I wanted to try to make my case once more, but I'll stop now.
var n2n3n4 = &trillian.Node{NodeHash: th.HashChildren(h4, th.HashChildren(h3, h2))} | ||
var n4n5 = &trillian.Node{NodeHash: th.HashChildren(h5, h4)} | ||
|
||
var rehashTests = []rehashTest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(still a nit) IMO, the smaller the scope the better. If those matter to a single test, having it inside the test both tells me that and moves them closer to where they're used.
var rehashTests = []rehashTest{ | ||
{ | ||
desc: "no rehash", | ||
index: int64(126), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could remove them it would be better. They definitely confused me. (Still do tbh.)
for ts := 2; ts <= 32; ts++ { | ||
mt := treeAtSize(ts) | ||
r := testonly.NewMultiFakeNodeReaderFromLeaves([]testonly.LeafBatch{ | ||
{TreeRevision: 3, Leaves: expandLeaves(0, ts-1), ExpectedRoot: expectedRootAtSize(mt)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Extract "3" to a treeRevision variable and use it below, so it doesn't look like a magic number at other lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
for s := 2; s <= ts; s++ { | ||
for l := 0; l < s; l++ { | ||
fetches, err := merkle.CalcInclusionProofNodeAddresses(int64(s), int64(l), int64(ts), 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Extract maxBitLen := 64
}) | ||
|
||
for s := 2; s <= ts; s++ { | ||
for l := 0; l < s; l++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I realize Go has this thing with short variable names, but I would much prefer "s" to be "snapshot", "l" to be "leaf", "r" to be "nodeReader" and so on. It's a bit hard to parse the CalcInclusionProof call below without jumping around a bit.
|
||
for s := 2; s <= 32; s++ { | ||
for l := 0; l < s; l++ { | ||
fetches, err := merkle.CalcInclusionProofNodeAddresses(int64(s), int64(l), 32, 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Same readability comments as above (longer names, extracting magic numbers to named vars).
|
||
for s1 := 2; s1 < ts; s1++ { | ||
for s2 := s1 + 1; s2 < ts; s2++ { | ||
fetches, err := merkle.CalcConsistencyProofNodeAddresses(int64(s1), int64(s2), int64(ts), 64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: As above (readability)
(This one is actually easier to parse than the others.)
if err != nil { | ||
panic(err) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Remove vertical whitespace?
(Same for the following functions.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Still think this looks ugly.
Please address the remaining comments, but feel free to merge afterwards. They're mostly nits anyway. |
e88a01a
to
6450708
Compare
Rebased to pick up recent changes before starting rework. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved (barring merge conflicts)
As storage API has changed.
Fix issues masked by integration tests not reporting errors when this code was being developed.
1608b91
to
feee108
Compare
This allows us to serve proofs at arbitrary tree sizes. Snapshot recomputation annotates portions of the path that require rehashing and this is then used to recalculate a subtree node when storage returns the nodes. Add some proof tests at the node / path calculation level to improve coverage. If proofs are at STH corresponding sizes the new code paths are not taken so probably safe!
Sorry this PR is a bit big but it was proving tricky to separate out.