-
Notifications
You must be signed in to change notification settings - Fork 801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce binary size by compressing genesis.ssz
#4564
Comments
Credit to @dapplion for suggesting this in a DM. |
I'd like to work on this. I can start by benchmarking the time it takes to decompress the mainnet/prater genesis state and post the results here |
I have a repo here: https://github.com/eserilev/snappy-genesis-benchmark that compresses/decompresses On my machine the time it took to decompress
file sizes for the compressed and decompressed
Snappy compression seems to reduce file size by ~50%, I measured elapsed time using
EDIT: using the release flag when running compression/decompression resulted in far faster times (in the millisecond range) |
Very interesting @eserilev, thanks! I'm tempted to go ahead with this. The 1.5s mainnet delay seems reasonable for mainnet. The ~10s delay for a Prater node is a bit heavy, but perhaps not a big deal considering it's a testnet. I'll raise this with some others before making a call. Thanks again! |
Thinking about this some more, I think there's a few options:
We could probably achieve (3) by just detecting the presence of a I'm tempted to go with (3) since I'm not really sure that shrinking the binary by ~3.6MB (~3%) is worth adding a 1-2s startup delay to the VC for mainnet. Reducing VC startup delays is good because it reduces the downtime penalty for upgrades; I like users to feel uninhibited to update regularly. On the other hand, I can see the value in a 10-20MB (~10-20%) reduction by compressing testnet binaries. The startup delay is much less of a concern there. I'm presently in favour of (3), but I'll raise this internally to get some feedback. |
I get very different results on my machine, which makes me wonder if @eserilev's disk is severely limiting his benchmark:
This is on an M1 Macbook Pro (2021). |
After some more research, I've come to the following conclusions:
If my first two points turn out to be correct (this is something that would be determined during implementation), then I am fine to just compress all states. Especially, if Michael's timings turn out to be closer to reality for most users. On another note, I've noticed that |
Thanks for taking another look at this Michael, glad to hear its running faster on other machines. I'm on a relatively beefy 2021 M1 max, so I wonder what could be limiting my compression/decompression times this drastically. Thanks for the additional write up Paul, I think I have a good starting point to begin working here. |
There's a "lower power mode" (you can Spotlight search that phrase) which can reduce compute speeds. I'd be surprised if it were to make that much of a difference though.. |
@eserilev Did you run the benchmark with release optimisations? Like |
Ah! that was the issue. With the release flag these are my results:
|
Heya guys, I really like the idea of compressing the genesis states. The genesis state for holesky will be >190MB uncompressed. |
We used to pull genesis states from Github, however we had users having trouble accessing Github (IIRC it was primarily users in China). That's why we started including states in the binary. I haven't done the numbers on Holesky, but if it will be >190MB uncompressed then we might need to consider going back to downloading genesis states at startup. To address the issues with Github access, we could:
With that approach we could instruct users to supply an alternate |
would we be hosting the compressed genesis files ? it could reduce download times at start up compared to downloading uncompressed genesis |
This can be a nice initiative to extend checkpointz, should not be too difficult since that infra can already serve states, they just need to expose another one |
Yep, that sounds like a good idea to me! |
FYI we're expecting to release v4.4.0 on/around the 31st of August. The primary goal of v4.4.0 is to add support for I think that this issue (state compression and downloading) is going to be critical for that release. @eserilev I'm happy for you to take this issue if you'd like to (you've done lots of great work for LH), but I'd like to give you the option to pass if you're not comfortable with the time pressure. Could you please let me know if you'd still like to tackle this issue? No pressure either way |
@paulhauner no problem, ill get a PR up for review shortly |
Genesis state now exists and it is 198MB. |
I'm pushing this to the next release since we have #4653 which adds Holesky. |
## Issue Addressed NA ## Proposed Changes Add the Holesky network config as per https://github.com/eth-clients/holesky/tree/36e4ff2d5138dcb2eb614f0f60fdb060b2adc1e2/custom_config_data. Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see #4564 for context). To download this file we have: - A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs) - If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket. - If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag. - Whenever a genesis state is downloaded it is checked against a checksum baked into the binary. - A genesis state will never be downloaded if it's already included in the binary. - There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file. ## Log Output Example of log output when a state is downloaded: ```bash Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log" Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+ Aug 23 05:40:13.425 INFO Configured for network name: holesky Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0 Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/ Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon ``` Example of log output when there are no URLs specified: ``` Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log" Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+ Aug 23 06:29:51.646 INFO Configured for network name: goerli Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322 The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url. ``` ## Additional Info I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there. I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states. When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
## Issue Addressed NA ## Proposed Changes Add the Holesky network config as per https://github.com/eth-clients/holesky/tree/36e4ff2d5138dcb2eb614f0f60fdb060b2adc1e2/custom_config_data. Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see sigp#4564 for context). To download this file we have: - A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs) - If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket. - If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag. - Whenever a genesis state is downloaded it is checked against a checksum baked into the binary. - A genesis state will never be downloaded if it's already included in the binary. - There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file. ## Log Output Example of log output when a state is downloaded: ```bash Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log" Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+ Aug 23 05:40:13.425 INFO Configured for network name: holesky Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0 Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/ Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon ``` Example of log output when there are no URLs specified: ``` Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log" Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+ Aug 23 06:29:51.646 INFO Configured for network name: goerli Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322 The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url. ``` ## Additional Info I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there. I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states. When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
NA Add the Holesky network config as per https://github.com/eth-clients/holesky/tree/36e4ff2d5138dcb2eb614f0f60fdb060b2adc1e2/custom_config_data. Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see sigp#4564 for context). To download this file we have: - A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs) - If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket. - If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag. - Whenever a genesis state is downloaded it is checked against a checksum baked into the binary. - A genesis state will never be downloaded if it's already included in the binary. - There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file. Example of log output when a state is downloaded: ```bash Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log" Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+ Aug 23 05:40:13.425 INFO Configured for network name: holesky Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0 Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/ Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon ``` Example of log output when there are no URLs specified: ``` Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log" Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+ Aug 23 06:29:51.646 INFO Configured for network name: goerli Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322 The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url. ``` I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there. I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states. When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
NA Add the Holesky network config as per https://github.com/eth-clients/holesky/tree/36e4ff2d5138dcb2eb614f0f60fdb060b2adc1e2/custom_config_data. Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see sigp#4564 for context). To download this file we have: - A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs) - If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket. - If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag. - Whenever a genesis state is downloaded it is checked against a checksum baked into the binary. - A genesis state will never be downloaded if it's already included in the binary. - There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file. Example of log output when a state is downloaded: ```bash Aug 23 05:40:13.424 INFO Logging to file path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log" Aug 23 05:40:13.425 INFO Lighthouse started version: Lighthouse/v4.3.0-bd9931f+ Aug 23 05:40:13.425 INFO Configured for network name: holesky Aug 23 05:40:13.426 INFO Data directory initialised datadir: /Users/paul/.lighthouse/holesky Aug 23 05:40:13.427 INFO Deposit contract address: 0x4242424242424242424242424242424242424242, deploy_block: 0 Aug 23 05:40:13.427 INFO Downloading genesis state info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/ Aug 23 05:40:29.895 INFO Starting from known genesis state service: beacon ``` Example of log output when there are no URLs specified: ``` Aug 23 06:29:51.645 INFO Logging to file path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log" Aug 23 06:29:51.646 INFO Lighthouse started version: Lighthouse/v4.3.0-666a39c+ Aug 23 06:29:51.646 INFO Configured for network name: goerli Aug 23 06:29:51.647 INFO Data directory initialised datadir: /Users/paul/.lighthouse/goerli Aug 23 06:29:51.647 INFO Deposit contract address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322 The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url. ``` I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there. I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states. When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
Description
We presently include the uncompressed genesis state for supported networks in our binary. We have several of these files:
The total of these files is 57.4M, which goes straight to our
hipsbinary size (presently ~110M). I suspect we could significantly reduce the size of the binaries by storing compressedgenesis.ssz
bytes in the binary and then decompressing on-demand (i.e. at startup).I propose that we use snappy compression, since it's used by the P2P layer and therefore available in the binary.
Before committing to this change, I would be keen to know the time it takes to decompress the state at startup. Perhaps getting numbers for mainnet and Prater would be good. We want to be careful not to slow-down BN/VC startup.
Details
The method for including the
genesis.ssz
can be a bit tricky to understand because it's written in macros. I think this should be fairly straight-forward once you get your head across it. I've included some links below to give a lay of the land.The bytes are added to the binary here:
lighthouse/common/eth2_config/src/lib.rs
Lines 129 to 137 in dfcb336
The
genesis.ssz
file used by theinclude_bytes!
macro is generated here (this is where we'd want to do the snappy compression):lighthouse/common/eth2_network_config/build.rs
Lines 30 to 47 in dfcb336
The application accesses the included bytes here (this is where we'd want to do the snappy decompression) (there might also be other places it is accessed):
lighthouse/common/eth2_network_config/src/lib.rs
Lines 98 to 108 in dfcb336
The text was updated successfully, but these errors were encountered: