Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add adr-001 for node key refactoring #608

Merged
merged 29 commits into from
Feb 21, 2023
Merged

feat: add adr-001 for node key refactoring #608

merged 29 commits into from
Feb 21, 2023

Conversation

cool-develope
Copy link
Collaborator

ref: #592, #593, #597

@cool-develope cool-develope requested a review from a team as a code owner November 1, 2022 11:08
@tac0turtle
Copy link
Member

a good question by celestia, how does this effect the commitment structure? Are proofs still the same?

@cool-develope
Copy link
Collaborator Author

a good question by celestia, how does this effect the commitment structure? Are proofs still the same?

it doesn't affect, as you can see in #597 it doesn't update any proof part and still passing tests

@tzdybal
Copy link

tzdybal commented Nov 1, 2022

a good question by celestia, how does this effect the commitment structure? Are proofs still the same?

it doesn't affect, as you can see in #597 it doesn't update any proof part and still passing tests

#597 still contains version field, which is still hashed (see:

iavl/node.go

Line 364 in 36a150e

err = encoding.EncodeVarint(w, node.version)
). Removing version will affect all the hashes, and therefore migration will invalidate all existing proofs.

@cool-develope
Copy link
Collaborator Author

cool-develope commented Nov 1, 2022

Removing version will affect all the hashes, and therefore migration will invalidate all existing proofs.

@tzdybal , good catch. @tac0turtle how about your thougt? Will we keep the version or refactor the ics23 part?

@tac0turtle
Copy link
Member

@tzdybal , good catch. @tac0turtle how about your thougt? Will we keep the version or refactor the ics23 part?

what do you mean by refactor ics23?

So there are two paths, introduce this in a breaking release of the sdk which a coordinated upgraded is needed or leave it alone. for testing purposes and data collection we probably want to leave it but in the future we could break it. BUT we should check with IBC about this.

@cool-develope
Copy link
Collaborator Author

what do you mean by refactor ics23?

I mean update the validation part in ics23.

So there are two paths, introduce this in a breaking release of the sdk which a coordinated upgraded is needed or leave it alone. for testing purposes and data collection we probably want to leave it but in the future we could break it. BUT we should check with IBC about this.

Sounds good, then let's keep the version at this time. I will create the issue and we can revisit in the future

docs/architecture/adr-001-node-key-refactoring.md Outdated Show resolved Hide resolved
docs/architecture/adr-001-node-key-refactoring.md Outdated Show resolved Hide resolved
docs/architecture/adr-001-node-key-refactoring.md Outdated Show resolved Hide resolved
Comment on lines 34 to 35
leftNodeKey int64 // new field, need to store in the storage
rightNodeKey int64 // new field, need to store in the storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these fields, if we already have leftNode and rightNode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftNode and rightNode is only meaningful on memory side, we need these fields to get children from the storage side.

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear as to how an integer improves data locality. Can you explain this a bit more? It would seem that it puts nodes that are created at similar times near each other in the B-tree, but this doesn't actually improve things like range scans other than by reducing the size of the node key which is significant.

@cool-develope
Copy link
Collaborator Author

cool-develope commented Nov 1, 2022

I'm unclear as to how an integer improves data locality. Can you explain this a bit more? It would seem that it puts nodes that are created at similar times near each other in the B-tree, but this doesn't actually improve things like range scans other than by reducing the size of the node key which is significant.

Reducing key size itself improves data locality. I am not sure if leveldb uses compressed trie or general bTree, but we can reduce the node count and improve the tree density.

@yihuang
Copy link
Collaborator

yihuang commented Dec 12, 2022

A different nonce assignment strategy can be useful in for example migration, we can do a sequential iteration on existing db, should be much faster than doing random access to find the nodes version by version

The migration would be complicated, it requires a more detail design. I don't think we can keep the current import/export structure in the new design.

https://github.com/cosmos/iavl/blob/master/export.go#L21
The current export node is already nonce agnostic, it just rely on a post-order traversal of the tree, I think we can keep that, so the nonce assignment strategy can be a node local decision.

The `Update` operation will require extra DB access because we need to take children to calculate the hash of updated nodes.
It doesn't require more access in other cases including `Set`, `Remove`, and `Proof`.

It is impossible to remove the individual version. The new design requires more restrict pruning strategies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate on this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is enough, it will remove orphans and it will be impossible to remove the intermediate version from storage.
And pruning part explains in more detail the new methods.

Copy link
Collaborator

@yihuang yihuang Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if removing orphans, it's certainly possible to delete the individual versions in between.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets edit to include this.

@yihuang
Copy link
Collaborator

yihuang commented Jan 4, 2023

Another round of review:

Keep order of insertion sorted

when there are multiple stores in the same db, the insertion is not sorted by keys anymore.

Migration

You also need to record the mapping of hash -> new node key to update the references in node body.

Pruning

We'll need both in-order and pre-order somehow, in-order to keep the key order, pre-order to skip subtree early on.

Remove the individual version

It's still possible to remove individual version with new pruning alrogithm, we just need to be aware of predecessor version and avoid deleting nodes older than that.

Export Snapshot

Can we clarify that the new node key format don't affect the snapshot export format? so the nonce assignment strategy is not part of consensus, but a node local decision.


### Negative

The `Update` operation will require extra DB access because we need to take children to calculate the hash of updated nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to say that it requires extra DB reads? Why is this not needed in Set and Remove?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because update only needs the re-calc of hash, but Set and Remove requires calcHeightAndSize (re-calculation of height and size) and this needs to splay the children nodes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we are getting left and right in Set and Remove but not splay in Update since we have leftHash and rightHash in the original version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the trade off here is storage saving on per node basis, right? maybe we mention this as a tradeoff

@tac0turtle tac0turtle requested a review from yihuang February 21, 2023 14:09
@tac0turtle tac0turtle merged commit c61d1de into master Feb 21, 2023
@tac0turtle tac0turtle deleted the 592/adr branch February 21, 2023 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants