Adds `--no-skip-initial-accounts-db-clean` *hidden* CLI flag #33664
Conversation
@t-nelson Do you want to weigh in here on the CLI parts? Naming/etc. If there are desired changes, it's nice to iterate on those quickly, so as not to waste a bunch of time waiting on CI. I'm planning on backporting this to v1.17 and v1.16. |
Codecov Report
@@ Coverage Diff @@
## master #33664 +/- ##
=========================================
- Coverage 81.8% 81.8% -0.1%
=========================================
Files 806 806
Lines 217477 217484 +7
=========================================
- Hits 178026 177995 -31
- Misses 39451 39489 +38 |
i kinda like that nearly all of our current "suspect" flags match |
Please no double negatives 🥺 |
dunno that i'd call "skip" a negative. it's omitting work, ie optimization, ie faster, ie better, ie positive, if anything! i mainly want a "oh one of these operators" signal in the args list. great short circuit when doing support |
Ok, sounds good. Done in 6e20193. |
Title changed: `--accounts-db-force-initial-clean` *hidden* CLI flag → `--no-skip-initial-accounts-db-clean` *hidden* CLI flag
lgtm
@t-nelson Do you want to re-review before I merge? |
(cherry picked from commit 452fd5d) # Conflicts: # runtime/src/bank/serde_snapshot.rs # runtime/src/snapshot_bank_utils.rs
(cherry picked from commit 452fd5d)
thanks! |
…backport of #33664) (#33676) * Adds `--no-skip-initial-accounts-db-clean` *hidden* CLI flag (#33664) (cherry picked from commit 452fd5d) # Conflicts: # runtime/src/bank/serde_snapshot.rs # runtime/src/snapshot_bank_utils.rs * fix backport conflicts/issues --------- Co-authored-by: Brooks <[email protected]>
…backport of #33664) (#33677) Adds `--no-skip-initial-accounts-db-clean` *hidden* CLI flag (#33664) (cherry picked from commit 452fd5d) Co-authored-by: Brooks <[email protected]>
Problem
Some nodes with 128 GB of RAM are OOMing when upgrading to v1.16.
The issue happens at startup, while the first `clean` runs. Since the first `clean` takes a while on these nodes, `flush` cannot run, and thus the accounts write cache balloons until OOM. The accounts write cache fills up because we are processing transactions.
In older versions of the validator, we used to always `clean` before processing transactions. If we had that behavior again, then the OOM could be avoided.
Discord debug channel for more info: https://discord.com/channels/428295358100013066/1156647974563233963
Summary of Changes
Add a hidden CLI flag that forces `clean` to run before we begin processing transactions.
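As a rough illustration of the intended behavior (not the code in this PR), the sketch below shows how such a flag could gate a blocking `clean` ahead of transaction processing. Only the flag name comes from this PR. `StartupConfig`, `parse_startup_config`, `startup_steps`, and the step labels are hypothetical names made up for the example, and the real validator parses its arguments differently.

```rust
// Hypothetical sketch: every name here except the flag string is illustrative
// and is NOT the validator's real API.

/// Startup options relevant to this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StartupConfig {
    /// True by default: the initial clean is not run up front.
    skip_initial_accounts_db_clean: bool,
}

/// Scans raw arguments for the hidden flag. Assumption: presence of the
/// flag turns skipping off; absence keeps the default (skip).
fn parse_startup_config<I, S>(args: I) -> StartupConfig
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let force_clean = args
        .into_iter()
        .any(|arg| arg.as_ref() == "--no-skip-initial-accounts-db-clean");
    StartupConfig {
        skip_initial_accounts_db_clean: !force_clean,
    }
}

/// Returns the ordered startup steps this sketch would perform.
fn startup_steps(config: StartupConfig) -> Vec<&'static str> {
    let mut steps = Vec::new();
    if !config.skip_initial_accounts_db_clean {
        // Blocking clean first, so the write cache cannot grow while it runs.
        steps.push("clean");
    }
    steps.push("process transactions");
    steps
}

fn main() {
    let config = parse_startup_config(std::env::args().skip(1));
    for step in startup_steps(config) {
        println!("{step}");
    }
}
```

In this sketch, passing the flag puts `clean` ahead of transaction processing, and omitting it leaves the default order unchanged.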