Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: merklize into copied pages #652

Merged
merged 1 commit into from
Jan 2, 2025
Merged

Conversation

rphmeier
Copy link
Contributor

@rphmeier rphmeier commented Dec 23, 2024

There are two reasons to do this:

  1. Perf. We copy the pages anyway during sync and it's currently a single-threaded bottleneck. Doing it during commit where we have more threads would be best.
  2. Overlays. Overlays can't mutate pages in place, so they need to be copied.

I suspect that this will actually be faster than master for commit-concurrency > 1. However, I will also benchmark for commit-concurrency = 1 (note: I remove the copies during writeout only further upstack, so the benchmarks will be there)

This seems to obviate the read/write pass stuff, so if perf holds up we should likely just remove those in follow-up PRs.

Copy link
Contributor Author

rphmeier commented Dec 23, 2024

@rphmeier rphmeier force-pushed the rh-copy-pages-merkle branch from 43b0bad to e2190ba Compare December 30, 2024 18:12
@rphmeier rphmeier changed the title [WIP] merklize into copied pages merklize into copied pages Dec 30, 2024
@rphmeier rphmeier force-pushed the rh-copy-pages-merkle branch from e2190ba to 4c6a8b1 Compare December 30, 2024 21:12
TriePosition::new(),
vec![
(key_path![0, 1, 0, 1, 1, 0], val(1)),
(key_path![0, 1, 0, 1, 1, 1], val(2)),
],
);

match walker.conclude() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test was actually silently broken in master for a couple reasons.

  1. it broke the contract of the page walker, because it called advance_and_replace with a position that was a sub-position of a previous advance_and_replace call.
  2. it tested that there were 3 modified pages, however, in the previous version of the test there were only two pages that were being modified (root + 1 child page). the third page came from a broken invariant causing build_stack to reload a page which was already pushed to the updated_pages vector.

the changes in this PR exposed the problems with this test so I fixed them here.

.map_or(false, |d| d.page_id == child_page_id)
{
// UNWRAP: just checked
self.updated_pages.pop().unwrap()
Copy link
Contributor Author

@rphmeier rphmeier Dec 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly certain this is never reachable after the removal of leaf children. The reason this case might've existed before was:

  1. we begin to replace a leaf at the bottom of a page with 2 leaves at the layer below.
  2. its leaf children are on the beginning of the next page, so we remove them (page gets pushed onto updated_pages)
  3. we then traverse down into that page and need to get it from updated_pages.

However, now we don't have this case. We never delete a page during compaction and then re-enter that page. That's because the parent node of the page must be an internal node, and advance only is done on terminal nodes (leaf/terminator)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with that and with the reason why it was previously required. This could be proven by the fact that down() is only called in replace_terminal(..). If it satisfies the fact that all new elements of a subtree are provided in one batch, then there is no way to do what you just explained

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put an unreachable! guard here then?

Copy link
Contributor

@gabriele-0201 gabriele-0201 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needing the read-write pass anymore increases a lot code readability, maintaining clear separations between which worker will update which pages!

.map_or(false, |d| d.page_id == child_page_id)
{
// UNWRAP: just checked
self.updated_pages.pop().unwrap()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with that and with the reason why it was previously required. This could be proven by the fact that down() is only called in replace_terminal(..). If it satisfies the fact that all new elements of a subtree are provided in one batch, then there is no way to do what you just explained

Copy link
Contributor

pepyakin commented Jan 2, 2025

Merge activity

  • Jan 2, 7:23 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Jan 2, 7:25 AM EST: Graphite rebased this pull request as part of a merge.
  • Jan 2, 7:26 AM EST: A user merged this pull request with Graphite.

@pepyakin pepyakin changed the base branch from rh-deep-clone-fatpage to graphite-base/652 January 2, 2025 12:23
@pepyakin pepyakin changed the base branch from graphite-base/652 to master January 2, 2025 12:23
@pepyakin pepyakin force-pushed the rh-copy-pages-merkle branch from b10f9d8 to a4dd447 Compare January 2, 2025 12:24
@pepyakin pepyakin merged commit a7dbd67 into master Jan 2, 2025
8 checks passed
@pepyakin pepyakin deleted the rh-copy-pages-merkle branch January 2, 2025 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants