-
Notifications
You must be signed in to change notification settings - Fork 547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize memory consumption for verifier subprocess #16201
optimize memory consumption for verifier subprocess #16201
Conversation
@@ -960,12 +960,30 @@ module For_tests = struct | |||
let consensus_constants = precomputed_values.consensus_constants | |||
end | |||
|
|||
module T = Transaction_snark.Make (struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it is located in the For_unit_tests
module, I suspect it can be initialized even by the production code if it's somehow referenced by some function somewhere.
I'm not sure if making the module is an expensive operation, but I'd suggest:
- Making it singleton for usage in unit tests (this would also reduce the code duplication)
- Measure the time to initialize it, and if it's non-negligble, surround with some laziness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously initialization would happen from within a separate process, only when it's launched. Now it will happen if the module that contains this is somehow referenced by some code that is invoked in initialization, and it's possible that this is executed every time.
Once again, I'm not sure if making a functor is expensive itself or it is cheap, just generating a bunch of lazy values.
src/lib/verifier/prod.ml
Outdated
Lazy.force B.Proof.verification_key ) | ||
|
||
let get_blockchain_verification_key () = Lazy.force vk | ||
let get_blockchain_verification_key () = Deferred.return verification_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just remove this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes, forgot to do it
opam.export
Outdated
@@ -21,6 +21,7 @@ roots: [ | |||
"js_of_ocaml-toplevel.4.0.0" | |||
"lmdb.1.0" | |||
"menhir.20210419" | |||
"memtrace_viewer.0.16.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this dependency used somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only locally for memory trace. i will remove it
src/lib/verifier/for_test.ml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still unsure if this is purely for tests purposes. Maybe we should also reuse part of this method in prover ? then there will be a single place where we create BlockhainSnark and TransactionSnark modules. Prover bootstrap code is anyhow executed in mina main thread.
@@ -56,6 +56,9 @@ module Worker_state = struct | |||
val toggle_internal_tracing : bool -> unit | |||
|
|||
val set_itn_logger_data : daemon_port:int -> unit | |||
|
|||
val get_blockchain_verification_key : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder also what to do in Check | No_check modes. Where we return dummy keys. Maybe this method should return Pickles.Verification_key.t Deferred.t
but then in Mina_grahpql modules we are exposing Verifiaction key from verifier so we would need to either break the interface and also there wrap repsone in optional or handle returning diummy as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update graphql module to use key that is returned from prover. It's okay if it's dummy in case of non-full modes.
It's non-ideal that we create a dummy prover that returns dummy keys. I'd prefer it to be refactored to not initialize prover unless it's full mode. But let's save this idea for one of next PRs, not this one.
dd5ef45
to
081b26d
Compare
081b26d
to
5a5884d
Compare
!ci-build-me |
!ci-nightly-me |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still unsure if this is purely for tests purposes. Maybe we should also reuse part of this method in prover ? then there will be a single place where we create BlockhainSnark and TransactionSnark modules. Prover bootstrap code is anyhow executed in mina main thread.
!ci-build-me |
!ci-nightly-me |
src/lib/verifier/prod.ml
Outdated
; commit_id : string | ||
; verification_key : Pickles.Verification_key.Stable.Latest.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename to blockchain_verification_key
here and elsewhere.
src/lib/verifier/prod.ml
Outdated
match Or_error.try_with (fun () -> T.verify ts) with | ||
match | ||
Or_error.try_with (fun () -> | ||
Transaction_snark.verify ts ~key:verification_key ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. When we verify a transaction snark proof, we need to use the transaction snark verification key, not the blockchain snark verification key. You'll need to pipe that through.
!ci-build-me |
!ci-build-me |
!ci-nightly-me |
…tion_on_bootstrap' into dkijania/optimize_memory_consumption_on_bootstrap
@@ -0,0 +1,165 @@ | |||
import collections | |||
import os | |||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's introduce this script as a separate PR, it's not directly related to this PR
src/lib/verifier/for_test.mli
Outdated
@@ -0,0 +1,12 @@ | |||
val get_verification_keys : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove get_verification_keys
from for_test.mli
, it's never used
src/lib/verifier/prod.ml
Outdated
@@ -191,7 +189,7 @@ module Worker_state = struct | |||
result | |||
|
|||
let get_blockchain_verification_key () = | |||
Lazy.force B.Proof.verification_key | |||
Deferred.return blockchain_verification_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove get_blockchain_verification_key
from verifier and its interface, it's never used
src/lib/verifier/dummy.ml
Outdated
let get_blockchain_verification_key { verification_key; _ } = | ||
Deferred.Or_error.try_with ~here:[%here] (fun () -> | ||
Lazy.force verification_key ) | ||
let get_blockchain_verification_key { blockchain_verification_key; _ } = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this
@@ -56,6 +56,9 @@ module Worker_state = struct | |||
val toggle_internal_tracing : bool -> unit | |||
|
|||
val set_itn_logger_data : daemon_port:int -> unit | |||
|
|||
val get_blockchain_verification_key : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's update graphql module to use key that is returned from prover. It's okay if it's dummy in case of non-full modes.
It's non-ideal that we create a dummy prover that returns dummy keys. I'd prefer it to be refactored to not initialize prover unless it's full mode. But let's save this idea for one of next PRs, not this one.
Co-authored-by: George Agapov <[email protected]>
…tion_on_bootstrap' into dkijania/optimize_memory_consumption_on_bootstrap
!ci-build-me |
!ci-build-me |
!ci-build-me |
!ci-nightly-me |
!ci-build-me |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with the last suggestion to simplify code
src/lib/verifier/for_test.mli
Outdated
-> proof_level:Genesis_constants.Proof_level.t | ||
-> ( [ `Blockchain of Pickles.Verification_key.t ] | ||
* [ `Transaction of Pickles.Verification_key.t ] ) | ||
Async.Deferred.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for nit-picking last moment, but only now I noticed Async.Deferred.t
at the end which is completely unnecessary here (we just return at the end).
Also, let's add a helper function for the following code:
let ( `Blockchain blockchain_verification_key
, `Transaction transaction_verification_key ) =
Verifier.For_test.get_verification_keys_eagerly ~constraint_constants
~proof_level
in
Verifier.create ~logger ~proof_level ~conf_dir:None
~pids:(Child_processes.Termination.create_pid_table ())
~commit_id:"not specified for unit tests" ()
~blockchain_verification_key ~transaction_verification_key )
The code is repeated many times, and helper would make the code more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! will do
src/lib/verifier/for_test.ml
Outdated
let open Async.Deferred.Let_syntax in | ||
let%bind blockchain_vk = Lazy.force B.Proof.verification_key in | ||
let%bind transaction_vk = Lazy.force T.verification_key in | ||
return (`Blockchain blockchain_vk, `Transaction transaction_vk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (`Blockchain blockchain_vk, `Transaction transaction_vk) | |
(`Blockchain blockchain_vk, `Transaction transaction_vk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, compiler is not happy when i removed return.
Matthew's concern was covered in the follow-up commits
!ci-build-me |
!ci-nightly-me |
Implementation of https://www.notion.so/o1labs/RAM-savvy-Mina-node-114e79b1f91080968c47e815124a6279 regarding verifier/prover key duplication. Removed duplicated key generation in verifier process and reusing prover verification key (through GRPC) in verifier. Steps are pretty simple:
Testing:
Apart from full suite of integration tests i did tests locally (also in order to generate memory usage comparison before and after changed). For this purpose I used single node (seed without bp keys). Bootstrapped it and observed memory usage for mina and both sub-processes (verifier, prover). Before my changes prover and verifier were using proving keys in a lazy way. Currently prover works more eagerly, that's why mina takes up to 2 GB when bootstrapping, then GC frees memory leaving on only:
To compare previously memory usage distribution was similar to:
Before first usage of prover/verifier cryptographic material
Also what is worth mentioning that due to passing verification key from prover to verifier we gained ~ 15 seconds in bootstrap speed. In order to reach ~1 second we need to refactor :
And bake those computation into rust binary. As Matthew mentioned , we can compute it at compile time and completely remove 10s form verifier boatload leaving only 1 sec.
Another test was to use local mina network in order to simulate small network method usage. I used network of:
Below some raw data of memory usage:
metrics_new.csv
metrics_compatible.csv
And some graphs:
Above graph show that verifiers processes are taking up to 0.9 GB after they are forced to be evaluated. Mina processes are taking around ~250 to 300 Mb of residential memory size. Only two provers (for mina whales) are taking up to 2.0 GB as they are asked to init calculation for produced blocks.
Laser focus on verifiers:
After optimization:
Here we can observe quite big spike of memory for seed (up to 1.7 GB) as it was first node to calculate proof data. Provers are acting quite the same as before change for active block producers. Inactive bp (seeds,fishes,passive nodes are taking up to 0.9 as they eagerly loaded proof data). This behavior could be tweak as well and as a result we could skip launching provers for non-block producers (seeds, archive node feeders) save ~ 1 GB of memory
The major change is that more main and verifier processes are taking around 250 MB.
More detailed look at verifiers processes:
And finally verifiers memory comparison: