-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[perf] Use Vec
instead of BTreeMap
in tree_from_memory
#1338
Conversation
This comment has been minimized.
This comment has been minimized.
This reverts commit 3ee49b3.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@zlangley I couldn't think of a simple way to avoid collecting items |
This comment has been minimized.
This comment has been minimized.
for ((address_space, pointer), value) in memory.items() { | ||
if pointer as usize / CHUNK >= (1 << memory_dimensions.address_height) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this mean something bad happened at run-time and the proof will fail? Can't we remove this and just leave the assert below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will happen when memory size is not divisible by page size, which, if both of these are powers of two, is equivalent to memory size being less than page size. We can assert that this is not the case and modify the tests correspondingly, in principle
// representing the entire memory tree. | ||
let mut memory_partition = BTreeMap::new(); | ||
let mut memory_partition: Vec<(u64, [F; CHUNK])> = Vec::new(); | ||
for ((address_space, pointer), value) in memory.items() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: It's a bit unfortunate that MemoryImage
is (address_space, pointer) -> F
rather than (address_space, label) -> [F; 8]
. I feel like this means we have to do a lot of extra work here and at the end of a segment to convert it into both forms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, added some future ideas
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
4345385
to
0b41e45
Compare
0b41e45
to
090d703
Compare
This comment has been minimized.
This comment has been minimized.
This reverts commit 98ae517.
This comment has been minimized.
This comment has been minimized.
This reverts commit e247f5b.
Commit: 1aa0c56 |
} | ||
debug_assert!(memory_partition.is_sorted_by_key(|(addr, _)| addr)); | ||
let memory_items = memory | ||
.items() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change items()
to use par_iter
now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually maybe it's too trivial, let's try separately of this PR
Metric: tracegen
Before: 10,901.18 ms
After: 10,131.55 ms
Percentage gain: -7.1%
Workflow: https://github.com/axiom-crypto/openvm-reth-benchmark/actions/runs/13160394734
This resolves INT-3236.
Quote from the ticket.
Our current
AddressMap.items()
returns something sorted by address (more specifically by(address_space, pointer)
lexicographically), therefore no need to push it into aBTreeMap
, we can push it intoVec
and then modify the recursion a little bit.We can also save on creating the "all-zero" nodes by building them once and then taking the corresponding node every time we need such node for some height. This should in theory always be strictly better, but this is worse now [...].
End of quote.
I also tried to calculate the capacity once and then reserve the exact capacity needed once instead of
Vec::new()
and.push()
every time, but this was slightly slower or at least the same.