refactor: merklize into copied pages #652
Conversation
This stack of pull requests is managed by Graphite.
Force-pushed from 43b0bad to e2190ba.
Force-pushed from e2190ba to 4c6a8b1.
```rust
    TriePosition::new(),
    vec![
        (key_path![0, 1, 0, 1, 1, 0], val(1)),
        (key_path![0, 1, 0, 1, 1, 1], val(2)),
    ],
);

match walker.conclude() {
```
This test was actually silently broken in master, for a couple of reasons:
- it broke the contract of the page walker, because it called `advance_and_replace` with a position that was a sub-position of a previous `advance_and_replace` call.
- it tested that there were 3 modified pages; however, in the previous version of the test only two pages were actually being modified (the root + 1 child page). The third page came from a broken invariant causing `build_stack` to reload a page which was already pushed to the `updated_pages` vector.

The changes in this PR exposed the problems with this test, so I fixed them here.
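The contract violation described above can be sketched in isolation. This is a hedged, self-contained model: positions are plain bit-paths and the function names (`is_sub_position`, `positions_respect_contract`) are illustrative, not NOMT's actual `TriePosition` API.

```rust
// Illustrative model of the page-walker contract: no position passed to
// `advance_and_replace` may be a sub-position (strict extension) of a
// position passed in an earlier call.

/// Returns true if `b` is a sub-position of `a`, i.e. `a` is a strict prefix of `b`.
fn is_sub_position(a: &[u8], b: &[u8]) -> bool {
    b.len() > a.len() && b[..a.len()] == *a
}

/// Checks that a sequence of walker positions respects the contract.
fn positions_respect_contract(positions: &[Vec<u8>]) -> bool {
    for i in 1..positions.len() {
        for j in 0..i {
            if is_sub_position(&positions[j], &positions[i]) {
                return false;
            }
        }
    }
    true
}

fn main() {
    // A hypothetical broken sequence: the second position extends the first.
    let broken = vec![vec![0, 1, 0, 1], vec![0, 1, 0, 1, 1, 0]];
    assert!(!positions_respect_contract(&broken));

    // Sibling subtrees visited in key order are fine.
    let ok = vec![vec![0, 1, 0], vec![0, 1, 1]];
    assert!(positions_respect_contract(&ok));
    println!("contract checks passed");
}
```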
```rust
.map_or(false, |d| d.page_id == child_page_id)
{
    // UNWRAP: just checked
    self.updated_pages.pop().unwrap()
```
I am fairly certain this is never reachable after the removal of leaf children. The reason this case might have existed before:
- we begin to replace a leaf at the bottom of a page with 2 leaves at the layer below.
- its leaf children are at the beginning of the next page, so we remove them (the page gets pushed onto `updated_pages`).
- we then traverse `down` into that page and need to get it from `updated_pages`.

However, we no longer have this case. We never delete a page during compaction and then re-enter that page. That's because the parent node of the page must be an internal node, and `advance` is only done on terminal nodes (leaf/terminator).
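The pop-from-`updated_pages` pattern under discussion can be sketched with stand-in types. This is a minimal model, not the real code: `DirtyPage` and `take_child_page` are hypothetical names, and the real `updated_pages` holds full page objects rather than bare ids.

```rust
// Illustrative model of the guarded pop in the diff: if the page we are
// descending into is the one most recently pushed onto the dirty-page
// stack, pop it back instead of reloading it from the store.

#[derive(Debug, PartialEq)]
struct DirtyPage {
    page_id: u64,
}

fn take_child_page(updated_pages: &mut Vec<DirtyPage>, child_page_id: u64) -> Option<DirtyPage> {
    if updated_pages
        .last()
        .map_or(false, |d| d.page_id == child_page_id)
    {
        // UNWRAP: just checked that the stack is non-empty.
        Some(updated_pages.pop().unwrap())
    } else {
        // In the real code this branch would instead load the page from the store.
        None
    }
}

fn main() {
    let mut stack = vec![DirtyPage { page_id: 7 }];
    assert_eq!(take_child_page(&mut stack, 7), Some(DirtyPage { page_id: 7 }));
    // The page was popped above, so a second lookup misses.
    assert_eq!(take_child_page(&mut stack, 7), None);
    println!("ok");
}
```

The reviewer's point is that after this PR, the `Some` branch can no longer be entered, because a deleted page is never re-entered during the same walk.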
I also agree with that, and with the reason why it was previously required. This can be shown by the fact that `down()` is only called in `replace_terminal(..)`. As long as all new elements of a subtree are provided in one batch, there is no way for the scenario you just described to occur.
Should we put an `unreachable!` guard here then?
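The suggested guard could look roughly like the sketch below. Types and names (`DirtyPage`, `descend`) are illustrative stand-ins, not the actual NOMT structures; the point is only that an invariant believed to hold should fail loudly rather than be silently handled.

```rust
// Sketch of the suggestion: replace the silent pop with an `unreachable!`
// guard, asserting the invariant that a page deleted during compaction is
// never re-entered within the same walk.

struct DirtyPage {
    page_id: u64,
}

fn descend(updated_pages: &[DirtyPage], child_page_id: u64) -> &'static str {
    if updated_pages
        .last()
        .map_or(false, |d| d.page_id == child_page_id)
    {
        // Invariant: the child page can never sit on top of the dirty stack.
        unreachable!("descended into a page still on the updated_pages stack");
    }
    // The normal path: fetch the page from the store.
    "loaded from store"
}

fn main() {
    let stack = vec![DirtyPage { page_id: 3 }];
    assert_eq!(descend(&stack, 7), "loaded from store");
}
```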
Force-pushed from 4b9fb4c to 1e5d7f3.
Force-pushed from 4c6a8b1 to b10f9d8.
No longer needing the read-write pass greatly improves code readability, maintaining a clear separation between which worker updates which pages!
Force-pushed from b10f9d8 to a4dd447.
There are two reasons to do this:
`commit`, where we have more threads, would be best.

I suspect that this will actually be faster than `master` for commit-concurrency > 1. However, I will also benchmark for commit-concurrency = 1 (note: I remove the copies during writeout only further upstack, so the benchmarks will be there).

This seems to obviate the read/write pass stuff, so if perf holds up we should likely just remove those in follow-up PRs.