-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't generate genesis proof at startup #8764
Conversation
how does this interact with the |
That flag will still crash the daemon if it's passed as |
I vote "yes". |
Done :) |
The instructions for running a node will need to reflect this change, when it's released. |
I've deprecated the flag rather than removing it, so it's now a no-op, in the interest of not breaking existing setups on upgrade. We should still update the docs, but I don't think it's a necessity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Asked a question but otherwise looks great.
in | ||
if | ||
Consensus.Data.Consensus_state.is_genesis_state | ||
consensus_state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we also want to check if the network is in genesis epoch and ignore otherwise? A node when restarted with clean config directory will have genesis state as its best tip. Might be better to not produce a block at all in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A node when restarted with clean config directory will have genesis state as its best tip. Might be better to not produce a block at all in this case.
I agree, but I think this falls outside of the scope of this PR. We should probably avoid checking vrfs/running the start
function at all until we have an epoch ledger from a block that's within the current epoch recent, which would fix a lot of the reported bp weirdness during startup.
Currently, this code will only trigger if they've won a slot that's imminent (this or the next) but haven't been running for long enough to see their best tip progress. If it's this slot, they would produce a useless block on top of the genesis block anyway (waiting to produce it may even allow them time to get a plausible best tip); if it's the next, they'll waste some compute generating a genesis proof that currently every node is wasting at startup regardless.
@mrmr1993 is there any risk that this could crash folks' nodes right before they make their block? What are the failure scenarios of generating the genesis proof? |
@bkase the risk is very low:
The failure scenarios are, to my mind,
This also reduces memory usage in the daemon process by not loading block proving keys, so we'll see less memory pressure and thus fewer memory corruption crashes, which appear to trigger at GC time. This also reduces cold-start time from 100s to 8s (based on CI logs), so crash recovery may be significantly faster. |
Just to clarify some notes here: only debian users are re-generating the proof (and downloading keys) on startup, currently the baked docker images cache these value. More importantly though, I'm concerned that users will not have the keys ready in under ~6 minutes when they need them. Depending on network conditions the s3 download can be really flakey/problematic, I would like to at least keep shipping the keys in docker via some mechanism but it doesnt have to be this --generate-genesis-proof flag or the baking setup. Is the standalone key-downloader tool in a usable state @mrmr1993 ? |
@lk86 It's usable, but not very pretty or user friendly. Should we be thinking about a |
it doesn't have to be pretty or user friendly to replace the functionality in the baked dockerfile and in CI, so that's good enough for me, we can integrate it nicely with the debian package later given that debian users are already dealing with the s3 download logic. |
This PR
n
slots in the future, the genesis proof will be produced at slotn-1
, if it is still neededChecklist: