refactor: merklize into copied pages #652

rphmeier · 2024-12-23T19:29:40Z

There are two reasons to do this:

1. Perf. We copy the pages anyway during sync, and that copy is currently a single-threaded bottleneck. Doing it during commit, where we have more threads, would be best.
2. Overlays. Overlays can't mutate pages in place, so the pages need to be copied.

I suspect that this will actually be faster than master for commit-concurrency > 1. However, I will also benchmark commit-concurrency = 1. (Note: the copies during writeout are only removed further upstack, so that is where the benchmarks will be.)

This seems to obviate the read/write pass stuff, so if perf holds up we should likely just remove those in follow-up PRs.
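The copy-instead-of-mutate idea behind both reasons can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Page` and `update_copied` are hypothetical names, and the real `FatPage` type and page cache are more involved.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a trie page; the real `FatPage` differs.
#[derive(Clone, Debug, PartialEq)]
pub struct Page {
    pub nodes: Vec<u64>,
}

/// Returns an updated copy of `shared`, leaving the original untouched so
/// that anything still holding the old `Arc` (e.g. an overlay) keeps seeing
/// the old contents.
pub fn update_copied(shared: &Arc<Page>, index: usize, node: u64) -> Arc<Page> {
    // Deep clone of the page contents. Doing this at update time means the
    // copy can happen on whichever worker thread owns the page.
    let mut copy: Page = (**shared).clone();
    copy.nodes[index] = node;
    Arc::new(copy)
}
```

Because each worker clones only the pages it updates, the copying cost is spread across the commit workers rather than paid on a single thread later.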

rphmeier · 2024-12-23T19:29:58Z

opt: do not copy pages before bitbox writeout #655: 2 dependent PRs (#648, #656)
refactor: add WriteArc io command #654
refactor: have page cache handle writing page ID rather than bitbox #653
refactor: merklize into copied pages #652 👈
refactor: add deep Clone to FatPage #651
master

This stack of pull requests is managed by Graphite.

rphmeier · 2024-12-30T21:15:30Z

nomt/src/merkle/page_walker.rs

            TriePosition::new(),
            vec![
                (key_path![0, 1, 0, 1, 1, 0], val(1)),
                (key_path![0, 1, 0, 1, 1, 1], val(2)),
            ],
        );

+        match walker.conclude() {


this test was actually silently broken in master, for a couple of reasons:

1. it broke the contract of the page walker, because it called advance_and_replace with a position that was a sub-position of a previous advance_and_replace call.
2. it asserted that there were 3 modified pages, but in the previous version of the test only two pages were being modified (root + 1 child page). the third page came from a broken invariant that caused build_stack to reload a page which had already been pushed to the updated_pages vector.

the changes in this PR exposed the problems with this test so I fixed them here.
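To illustrate the contract violation described above, here is a rough sketch of a check that rejects a position lying inside the subtree of the previously replaced one. `Position`, `is_sub_position_of`, and `ContractChecker` are hypothetical names invented for this example; the actual page walker uses `TriePosition` and may enforce the contract differently (or only document it).

```rust
/// Hypothetical trie position: the path of bits from the root.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Position(pub Vec<bool>);

impl Position {
    /// True if `self` lies strictly inside the subtree rooted at `other`.
    pub fn is_sub_position_of(&self, other: &Position) -> bool {
        self.0.len() > other.0.len() && self.0.starts_with(&other.0)
    }
}

/// Remembers the last accepted position and rejects a follow-up call whose
/// position is a sub-position of it.
#[derive(Default)]
pub struct ContractChecker {
    last: Option<Position>,
}

impl ContractChecker {
    pub fn check_advance(&mut self, next: Position) -> Result<(), String> {
        if let Some(prev) = &self.last {
            if next.is_sub_position_of(prev) {
                return Err(format!("{:?} is a sub-position of {:?}", next, prev));
            }
        }
        // Only accepted positions are remembered.
        self.last = Some(next);
        Ok(())
    }
}
```

With a check like this, the old version of the test would have failed at its second call instead of silently producing an extra modified page.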

rphmeier · 2024-12-30T21:41:23Z

nomt/src/merkle/page_walker.rs

+                    .map_or(false, |d| d.page_id == child_page_id)
+                {
+                    // UNWRAP: just checked
+                    self.updated_pages.pop().unwrap()


I am fairly certain this is never reachable after the removal of leaf children. The reason this case might've existed before was:

1. we begin to replace a leaf at the bottom of a page with 2 leaves at the layer below.
2. its leaf children are at the beginning of the next page, so we remove them (the page gets pushed onto updated_pages).
3. we then traverse down into that page and need to get it from updated_pages.

However, now we don't have this case. We never delete a page during compaction and then re-enter that page. That's because the parent node of the page must be an internal node, and advance is only done on terminal nodes (leaf/terminator).

Should we put an unreachable! guard here then?
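One way the suggested guard could look, as a minimal sketch rather than the actual page walker code. `PageId`, `UpdatedPage`, `last_updated_is`, and `guard_not_reentered` are hypothetical stand-ins; only the `.map_or(false, ...)` comparison mirrors the diff above.

```rust
/// Hypothetical stand-ins; the real page walker types differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageId(pub u64);

#[derive(Debug)]
pub struct UpdatedPage {
    pub page_id: PageId,
}

/// Same shape as the condition in the diff: is the most recently pushed
/// updated page the child page we are about to enter?
pub fn last_updated_is(updated_pages: &[UpdatedPage], child_page_id: PageId) -> bool {
    updated_pages
        .last()
        .map_or(false, |d| d.page_id == child_page_id)
}

/// Instead of popping the page and carrying on, treat the case as a bug.
pub fn guard_not_reentered(updated_pages: &[UpdatedPage], child_page_id: PageId) {
    if last_updated_is(updated_pages, child_page_id) {
        unreachable!(
            "page {:?} was pushed to updated_pages and then re-entered",
            child_page_id
        );
    }
}
```

A panic here would surface any remaining caller that violates the batching assumption, rather than quietly reusing the popped page.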

gabriele-0201

Not needing the read-write pass anymore greatly improves code readability and keeps a clear separation of which worker updates which pages!

gabriele-0201 · 2025-01-02T09:10:55Z

nomt/src/merkle/page_walker.rs

+                    .map_or(false, |d| d.page_id == child_page_id)
+                {
+                    // UNWRAP: just checked
+                    self.updated_pages.pop().unwrap()


I also agree with that, and with the reason why it was previously required. This can be shown by the fact that down() is only called in replace_terminal(..). As long as all new elements of a subtree are provided in one batch, there is no way to hit the case you just explained.

pepyakin · 2025-01-02T12:23:28Z

Merge activity

Jan 2, 7:23 AM EST: A user started a stack merge that includes this pull request via Graphite.
Jan 2, 7:25 AM EST: Graphite rebased this pull request as part of a merge.
Jan 2, 7:26 AM EST: A user merged this pull request with Graphite.

rphmeier mentioned this pull request Dec 23, 2024

refactor: add deep Clone to FatPage #651

Merged

rphmeier force-pushed the rh-copy-pages-merkle branch from 43b0bad to e2190ba Compare December 30, 2024 18:12

rphmeier changed the title ~~[WIP] merklize into copied pages~~ merklize into copied pages Dec 30, 2024

rphmeier force-pushed the rh-copy-pages-merkle branch from e2190ba to 4c6a8b1 Compare December 30, 2024 21:12

rphmeier commented Dec 30, 2024

rphmeier changed the title ~~merklize into copied pages~~ refactor: merklize into copied pages Dec 30, 2024

rphmeier force-pushed the rh-deep-clone-fatpage branch from 4b9fb4c to 1e5d7f3 Compare December 31, 2024 18:39

rphmeier force-pushed the rh-copy-pages-merkle branch from 4c6a8b1 to b10f9d8 Compare December 31, 2024 18:39

This was referenced Dec 31, 2024

feat: overlay type definitions and operations #648

Merged

feat: update* functions in NOMT API #649

Merged

refactor: separate update_inner from commit_inner #650

Closed

gabriele-0201 approved these changes Jan 2, 2025

pepyakin approved these changes Jan 2, 2025

pepyakin changed the base branch from rh-deep-clone-fatpage to graphite-base/652 January 2, 2025 12:23

pepyakin changed the base branch from graphite-base/652 to master January 2, 2025 12:23

[WIP] merklize into copied pages

a4dd447

pepyakin force-pushed the rh-copy-pages-merkle branch from b10f9d8 to a4dd447 Compare January 2, 2025 12:24

pepyakin merged commit a7dbd67 into master Jan 2, 2025
8 checks passed

pepyakin deleted the rh-copy-pages-merkle branch January 2, 2025 12:26

rphmeier mentioned this pull request Jan 13, 2025

refactor: split rollback delta creation and commit #685

Merged
