Add rehashing support for inclusion + consistency proofs. #305

Martin2112 · 2017-01-19T10:07:40Z

This allows us to serve proofs at arbitrary tree sizes. Snapshot recomputation annotates the portions of the path that require rehashing, and the annotation is then used to recalculate a subtree node when storage returns the nodes. Also adds some proof tests at the node / path calculation level to improve coverage. If proofs are requested at sizes that correspond to an STH, the new code paths are not taken, so this is probably safe!

Sorry this PR is a bit big but it was proving tricky to separate out.
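
The rehashing step described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `fetchedNode`, `collapseRehashRuns` and the local `hashChildren` are hypothetical names, and the hash is assumed to be the RFC 6962 interior node hash. The fold order (new node on the left, accumulated hash on the right) follows the `n2n3n4` fixture quoted later in this review.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// fetchedNode pairs a node hash with the rehash annotation produced by the
// path calculation. Both field names are assumptions made for this sketch.
type fetchedNode struct {
	hash   []byte
	rehash bool
}

// hashChildren is assumed to be the RFC 6962 interior node hash:
// SHA-256(0x01 || left || right).
func hashChildren(left, right []byte) []byte {
	h := sha256.New()
	h.Write([]byte{0x01})
	h.Write(left)
	h.Write(right)
	return h.Sum(nil)
}

// collapseRehashRuns folds every consecutive run of rehash-annotated nodes
// into a single hash and passes all other nodes through unchanged.
func collapseRehashRuns(nodes []fetchedNode) [][]byte {
	var proof [][]byte
	var pending []byte
	rehashing := false
	// flush emits the hash accumulated so far, if a run is in progress.
	flush := func() {
		if rehashing {
			proof = append(proof, pending)
			pending, rehashing = nil, false
		}
	}
	for _, n := range nodes {
		switch {
		case n.rehash && !rehashing:
			// First node of a run: start accumulating.
			pending, rehashing = n.hash, true
		case n.rehash:
			// Later node of a run: it becomes the left child.
			pending = hashChildren(n.hash, pending)
		default:
			flush()
			proof = append(proof, n.hash)
		}
	}
	flush()
	return proof
}

func main() {
	nodes := []fetchedNode{
		{hash: []byte("h2"), rehash: true},
		{hash: []byte("h3"), rehash: true},
		{hash: []byte("h4"), rehash: true},
		{hash: []byte("h5"), rehash: false},
	}
	// Three rehash nodes collapse to one entry, so two hashes are printed.
	for _, h := range collapseRehashRuns(nodes) {
		fmt.Printf("%x\n", h)
	}
}
```

When no node carries the rehash annotation, the input passes through untouched, which matches the remark that proofs at STH sizes do not exercise the new paths.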

codingllama

I've still got to have a better look at "server/proof_fetcher_test.go" (just glanced into it, haven't really looked into the test tables or TestTreeX tests).

The other files I think I've reviewed properly, though.

codingllama · 2017-01-20T13:01:07Z

merkle/merkle_path.go

-		return []NodeFetch{}, fmt.Errorf("invalid params ts: %d index: %d, bitlen:%d", treeSize, index, maxBitLen)
+func CalcInclusionProofNodeAddresses(snapshot, index, treeSize int64, maxBitLen int) ([]NodeFetch, error) {
+	if snapshot > treeSize || index >= snapshot || index < 0 || snapshot < 1 || maxBitLen <= 0 {
+		return []NodeFetch{}, fmt.Errorf("invalid params s: %d index: %d ts: %d, bitlen:%d", snapshot, index, treeSize, maxBitLen)


Suggestion: s/[]NodeFetch{}/nil

Ditto for other occurrences.

For consistency with my previous nits. :)
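
A minimal sketch of the suggested shape, assuming a stand-in `NodeFetch` type and a hypothetical `calcAddresses` function rather than the real merkle package code. A nil slice behaves like an empty one for `len` and `range`, so callers should not need to change.

```go
package main

import "fmt"

// NodeFetch is a stand-in for the real type; the field is an assumption.
type NodeFetch struct {
	Rehash bool
}

// calcAddresses is hypothetical. It only mirrors the parameter check from
// the diff and returns nil, not an empty slice, alongside the error.
func calcAddresses(snapshot, index, treeSize int64) ([]NodeFetch, error) {
	if snapshot > treeSize || index >= snapshot || index < 0 || snapshot < 1 {
		return nil, fmt.Errorf("invalid params s: %d index: %d ts: %d", snapshot, index, treeSize)
	}
	// Placeholder result; the real path calculation is not reproduced here.
	return []NodeFetch{{Rehash: false}}, nil
}

func main() {
	fetches, err := calcAddresses(5, 7, 4)
	// A nil slice is safe to measure and to range over.
	fmt.Println(fetches == nil, len(fetches), err != nil)
	for range fetches {
		fmt.Println("never reached for a nil slice")
	}
}
```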

codingllama · 2017-01-20T13:03:49Z

merkle/merkle_path.go

-		return []NodeFetch{}, fmt.Errorf("invalid params prior: %d treesize: %d, bitlen:%d", previousTreeSize, treeSize, maxBitLen)
+func CalcConsistencyProofNodeAddresses(snapshot1, snapshot2, treeSize int64, maxBitLen int) ([]NodeFetch, error) {
+	if snapshot1 > snapshot2 || snapshot1 > treeSize || snapshot2 > treeSize || snapshot1 < 1 || snapshot2 < 1 || maxBitLen <= 0 {
+		return []NodeFetch{}, fmt.Errorf("invalid params s1: %d s2: %d tss: %d, bitlen:%d", snapshot1, snapshot2, treeSize, maxBitLen)


nit: s/tss/ts/

codingllama · 2017-01-20T15:18:56Z

merkle/merkle_path.go

+// tree state at the snapshot size differs from the size we've stored it at. The calculations
+// also need to take into account missing levels, see the tree diagrams in this file.
+// If called with snapshot equal to the tree size returns empty. Otherwise, assuming no errors,
+// the output of this should always be exactly one node. Either a copy of one of the nodes in


"the output of this should always be exactly one node". I think it should say "at least one node".

After rehashing it's one node. Added a clarification.

codingllama · 2017-01-20T15:20:27Z

merkle/merkle_path.go

+		// Nothing to do
+		return []NodeFetch{}, nil
+	} else if snapshot > treeSize {
+		return fetches, fmt.Errorf("recomputePastSnapshot: %d does not exist for tree of size %d", snapshot, treeSize)


nit: return nil, fmt.Errorf("...")

Yes, "fetches" is empty here, but it seems a bit strange to return a named variable together with an error.

codingllama · 2017-01-20T15:22:33Z

merkle/merkle_path.go

+	// This is the index of the last node that actually exists in the underlying tree
+	lastNodeAtLevel := treeSize - 1
+
+	// Work up towards, the root we may find the node we need without needing to rehash if


nit: "Work up towards, the root we may find..." sounds a bit strange. Should it be: "Work up towards the root. We may find ... "?

Yeah typo fixed.

codingllama · 2017-01-20T16:49:11Z

server/proof_fetcher.go

+	return trillian.Proof{LeafIndex: leafIndex, ProofNode: r.proof}, r.proofError
+}
+
+// dedupAndFetchNodes() removes duplicates from the set of fetches and then passes the result to


At first I understood that it removes duplicates and returns a "deduped" slice, which would make the index matching at line 27 break. We should probably mention that it dedups the query, but returns duplicates in the resulting slice.
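
To make the behaviour in question concrete, here is a sketch of "dedup the query, keep duplicates in the result". The names `fetchFn` and `dedupAndFetch` and the string IDs are assumptions for illustration; the real function works on storage node IDs and a transaction.

```go
package main

import "fmt"

// fetchFn is a hypothetical stand-in for a storage read that takes unique
// IDs and returns the matching values keyed by ID.
type fetchFn func(ids []string) (map[string]string, error)

// dedupAndFetch queries each distinct ID once, then expands the answer so
// that the result has one entry per requested ID, in request order.
func dedupAndFetch(ids []string, fetch fetchFn) ([]string, error) {
	seen := make(map[string]bool, len(ids))
	unique := make([]string, 0, len(ids))
	for _, id := range ids {
		if !seen[id] {
			seen[id] = true
			unique = append(unique, id)
		}
	}
	byID, err := fetch(unique)
	if err != nil {
		return nil, err
	}
	// Index i of the result always corresponds to index i of the request.
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		v, ok := byID[id]
		if !ok {
			return nil, fmt.Errorf("storage did not return node %q", id)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	fake := func(ids []string) (map[string]string, error) {
		m := make(map[string]string, len(ids))
		for _, id := range ids {
			m[id] = "hash-" + id
		}
		return m, nil
	}
	got, err := dedupAndFetch([]string{"a", "b", "a"}, fake)
	fmt.Println(got, err)
}
```

Because the output length equals the input length, index matching against the original fetch list stays valid.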

codingllama · 2017-01-20T16:51:59Z

server/proof_fetcher_test.go

+var rehashTests = []rehashTest{
+	{
+		desc:    "no rehash",
+		index:   int64(126),


I'm struggling to see how the index matches the leaves / tree size. Could you add a comment to clarify? (Ditto for others.)

Maybe I should take a break :)

It's just testing that the right value gets copied in. May not be actually required as I can't remember why it's there.

If we could remove them it would be better. They definitely confused me. (Still do tbh.)

codingllama · 2017-01-20T16:56:24Z

server/proof_fetcher_test.go

+
+func TestRehasher(t *testing.T) {
+	for _, rehashTest := range rehashTests {
+		r := newRehasher()


nit: I would prefer to run the tests against fetchNodesAndBuildProof() instead of directly using the rehasher. We're reaching inside private functions anyway, but the boundary of proof_fetcher.go seems to be fetchNodesAndBuildProof.

Ditto for TestDedupFetcher / dedupAndFetchNodes below.

The rehasher is notionally a separate API, and I did consider pulling it out, but it's not a massive amount of code. Most tests do call fetchNodesAndBuildProof. The dedup stuff has gone now.

codingllama · 2017-01-20T16:58:25Z

server/proof_fetcher_test.go

+		nodes, err := dedupAndFetchNodes(tx, 37, dedupTest.input)
+
+		if err == nil && dedupTest.storageError != nil {
+			t.Fatalf("%s: got nil, want error: %v", dedupTest.desc, err)


t.Errorf and continue? Or maybe do the checks inside a switch? It doesn't seem like a mismatch here should stop other test cases.

Ditto for the Fatal below.

Wasn't sure because it's just going to dump hex strings that don't match. I think printing a bunch of these is no extra help but willing to be convinced otherwise.

codingllama · 2017-01-20T17:02:30Z

server/proof_fetcher_test.go

+		if err != nil && dedupTest.storageError == nil {
+			t.Fatalf("%s: got error: %v, want nil", dedupTest.desc, err)
+		}
+


nit: If we get a (correct) error response we're still going ahead and checking got != want, but instead we should check that the errors match and stop.

This test was removed with the dedup code.

codingllama · 2017-01-23T18:11:05Z

merkle/merkle_path.go

+// are valid. There must be at least one fetch. All fetches must have the same rehash state and if
+// there is only one fetch then it must not be a rehash. If all checks pass then the fetches
+// represent one node after rehashing is completed.
+func checkRecomputation(fetches []NodeFetch) error {


Fair enough. SGTM.
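
Read literally, the doc comment above describes three checks. A sketch of them, assuming a stand-in `NodeFetch` with a `Rehash` flag (the body of the real function is not shown in this hunk, so this is only one possible reading):

```go
package main

import (
	"errors"
	"fmt"
)

// NodeFetch is a stand-in for the real type; the field names are assumptions.
type NodeFetch struct {
	NodeID string
	Rehash bool
}

// checkRecomputation applies the rules from the doc comment: at least one
// fetch, a lone fetch must not be a rehash, and all fetches share one state.
func checkRecomputation(fetches []NodeFetch) error {
	switch len(fetches) {
	case 0:
		return errors.New("recomputation must produce at least one fetch")
	case 1:
		if fetches[0].Rehash {
			return errors.New("a single fetch must not be marked as a rehash")
		}
		return nil
	}
	for i, f := range fetches {
		if f.Rehash != fetches[0].Rehash {
			return fmt.Errorf("fetch %d has rehash=%t, want %t", i, f.Rehash, fetches[0].Rehash)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkRecomputation(nil))
	fmt.Println(checkRecomputation([]NodeFetch{{NodeID: "a", Rehash: true}}))
	fmt.Println(checkRecomputation([]NodeFetch{{NodeID: "a", Rehash: true}, {NodeID: "b", Rehash: true}}))
}
```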

codingllama · 2017-01-23T18:27:54Z

server/proof_fetcher.go

+		nodes = append(nodes, node)
+	}
+
+	for i, node := range nodes {


It's a bit strange. Having a package check its own logic is one thing, assuming the storage implementation might break in some random manner is another. If we can't trust queries by ID we're in a rough place.

If you feel this adds a real benefit, please push back. I wanted to try to make my case once more, but I'll stop now.

codingllama · 2017-01-23T18:30:34Z

server/proof_fetcher_test.go

+var n2n3n4 = &trillian.Node{NodeHash: th.HashChildren(h4, th.HashChildren(h3, h2))}
+var n4n5 = &trillian.Node{NodeHash: th.HashChildren(h5, h4)}
+
+var rehashTests = []rehashTest{


(still a nit) IMO, the smaller the scope the better. If those matter to a single test, having it inside the test both tells me that and moves them closer to where they're used.

codingllama · 2017-01-23T18:33:43Z

server/proof_fetcher_test.go

+var rehashTests = []rehashTest{
+	{
+		desc:    "no rehash",
+		index:   int64(126),


If we could remove them it would be better. They definitely confused me. (Still do tbh.)

codingllama · 2017-01-23T18:41:56Z

server/proof_fetcher_test.go

+	for ts := 2; ts <= 32; ts++ {
+		mt := treeAtSize(ts)
+		r := testonly.NewMultiFakeNodeReaderFromLeaves([]testonly.LeafBatch{
+			{TreeRevision: 3, Leaves: expandLeaves(0, ts-1), ExpectedRoot: expectedRootAtSize(mt)},


nit: Extract "3" to a treeRevision variable and use it below, so it doesn't look like a magic number at other lines.

codingllama · 2017-01-23T18:45:06Z

server/proof_fetcher_test.go

+
+		for s := 2; s <= ts; s++ {
+			for l := 0; l < s; l++ {
+				fetches, err := merkle.CalcInclusionProofNodeAddresses(int64(s), int64(l), int64(ts), 64)


nit: Extract maxBitLen := 64

codingllama · 2017-01-23T18:48:53Z

server/proof_fetcher_test.go

+		})
+
+		for s := 2; s <= ts; s++ {
+			for l := 0; l < s; l++ {


nit: I realize Go has this thing with short variable names, but I would much prefer "s" to be "snapshot", "l" to be "leaf", "r" to be "nodeReader" and so on. It's a bit hard to parse the CalcInclusionProof call below without jumping around a bit.
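
Something along these lines is what I have in mind, with the magic numbers extracted as well. `calcInclusion` is a hypothetical stand-in for merkle.CalcInclusionProofNodeAddresses that only repeats its parameter check, and `checkAllSnapshots` is an illustrative name.

```go
package main

import "fmt"

// maxBitLen names the value passed as the last argument in the tests.
const maxBitLen = 64

// calcInclusion is a hypothetical stand-in that only validates parameters.
func calcInclusion(snapshot, leaf, treeSize int64, bitLen int) error {
	if snapshot > treeSize || leaf >= snapshot || leaf < 0 || snapshot < 1 || bitLen <= 0 {
		return fmt.Errorf("invalid params s: %d index: %d ts: %d, bitlen:%d", snapshot, leaf, treeSize, bitLen)
	}
	return nil
}

// checkAllSnapshots walks every (snapshot, leaf) pair for one tree size and
// returns how many combinations were exercised.
func checkAllSnapshots(treeSize int) (int, error) {
	cases := 0
	for snapshot := 2; snapshot <= treeSize; snapshot++ {
		for leaf := 0; leaf < snapshot; leaf++ {
			if err := calcInclusion(int64(snapshot), int64(leaf), int64(treeSize), maxBitLen); err != nil {
				return cases, err
			}
			cases++
		}
	}
	return cases, nil
}

func main() {
	cases, err := checkAllSnapshots(32)
	fmt.Println(cases, err)
}
```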

codingllama · 2017-01-23T18:50:32Z

server/proof_fetcher_test.go

+
+	for s := 2; s <= 32; s++ {
+		for l := 0; l < s; l++ {
+			fetches, err := merkle.CalcInclusionProofNodeAddresses(int64(s), int64(l), 32, 64)


nit: Same readability comments as above (longer names, extracting magic numbers to named vars).

codingllama · 2017-01-23T18:51:41Z

server/proof_fetcher_test.go

+
+		for s1 := 2; s1 < ts; s1++ {
+			for s2 := s1 + 1; s2 < ts; s2++ {
+				fetches, err := merkle.CalcConsistencyProofNodeAddresses(int64(s1), int64(s2), int64(ts), 64)


nit: As above (readability)

(This one is actually easier to parse than the others.)

codingllama · 2017-01-23T18:53:51Z

server/proof_fetcher_test.go

+	if err != nil {
+		panic(err)
+	}
+


nit: Remove vertical whitespace?

(Same for the following functions.)

Done. Still think this looks ugly.

codingllama · 2017-01-23T19:16:27Z

Please address the remaining comments, but feel free to merge afterwards. They're mostly nits anyway.

Martin2112 · 2017-01-27T10:55:30Z

Rebased to pick up recent changes before starting rework.

codingllama

Approved (barring merge conflicts)

googlebot added the cla: yes label Jan 19, 2017

Martin2112 force-pushed the add_rehashing branch from f81a0cd to 19a877a Compare January 19, 2017 10:11

Martin2112 added this to the Log M6 milestone Jan 19, 2017

Martin2112 force-pushed the add_rehashing branch from c7e149d to badad88 Compare January 19, 2017 16:36

Martin2112 requested a review from codingllama January 20, 2017 09:08

codingllama reviewed Jan 20, 2017

codingllama approved these changes Jan 23, 2017

codingllama self-assigned this Jan 24, 2017

Martin2112 force-pushed the add_rehashing branch from e88a01a to 6450708 Compare January 27, 2017 10:55

codingllama approved these changes Jan 27, 2017

Martin2112 added 12 commits January 30, 2017 12:15

Add all the rehashing related changes from the proof branch

3c73067

Update fake node reader api

e44a94c

As storage API has changed.

Fix tests

3d96cf8

Remove expectations of consistency proof version fetches that no longer happen.

gofmt

c236486

Fix imports and improve doc comments

db80793

Review fixes

050e4f2

Update tests for merkle API switch to int64

2cb950d

Review fixes.

10ed3c5

gofmt

54e55fd

Pull in integration test update branch fixes

dc4a5f2

Fix issues masked by integration tests not reporting errors when this code was being developed.

gofmt again

610c38a

Fix for use of context in tx api upstream

feee108

Martin2112 force-pushed the add_rehashing branch from 1608b91 to feee108 Compare January 30, 2017 12:17

Remove obsolete TODO

6a46464

Martin2112 merged commit 21fa859 into google:master Jan 30, 2017

Martin2112 deleted the add_rehashing branch January 30, 2017 14:30

This was referenced Jan 30, 2017

Better unit tests of inclusion / consistency proofs #132

Closed

Support proofs at arbitrary tree sizes #285

Closed

pav-kv mentioned this pull request Apr 8, 2022

Allow not storing ephemeral node hashes #2568

Merged

2 tasks
