Reduce binary size by compressing genesis.ssz #4564

Open
paulhauner opened this issue Aug 2, 2023 · 20 comments
Labels
optimization Something to make Lighthouse run more efficiently.

Comments

@paulhauner
Member

paulhauner commented Aug 2, 2023

Description

We presently include the uncompressed genesis state for supported networks in our binary. We have several of these files:

[3.3M]  common/eth2_network_config/built_in_network_configs/chiado/genesis.ssz
[3.1M]  common/eth2_network_config/built_in_network_configs/gnosis/genesis.ssz
[5.2M]  common/eth2_network_config/built_in_network_configs/mainnet/genesis.ssz
[ 28M]  common/eth2_network_config/built_in_network_configs/prater/genesis.ssz
[ 15M]  common/eth2_network_config/built_in_network_configs/ropsten/genesis.ssz
[2.8M]  common/eth2_network_config/built_in_network_configs/sepolia/genesis.ssz

The total of these files is 57.4M, which goes straight to our binary size (presently ~110M). I suspect we could significantly reduce the size of the binaries by storing compressed genesis.ssz bytes in the binary and then decompressing on demand (i.e. at startup).

I propose that we use snappy compression, since it's used by the P2P layer and therefore available in the binary.

Before committing to this change, I would be keen to know the time it takes to decompress the state at startup. Perhaps getting numbers for mainnet and Prater would be good. We want to be careful not to slow down BN/VC startup.

Details

The method for including the genesis.ssz can be a bit tricky to understand because it's written in macros. I think this should be fairly straightforward once you get your head across it. I've included some links below to give a lay of the land.

The bytes are added to the binary here:

include_bytes!(concat!(
    $base_dir,
    "/",
    $this_crate::predefined_networks_dir!(),
    "/",
    $config_dir,
    "/",
    $filename
))

The genesis.ssz file used by the include_bytes! macro is generated here (this is where we'd want to do the snappy compression):

// Extract genesis state from genesis.ssz.zip
let archive_path = network.genesis_state_archive();
let archive_file = File::open(&archive_path)
    .map_err(|e| format!("Failed to open archive file {:?}: {:?}", archive_path, e))?;
let mut archive =
    ZipArchive::new(archive_file).map_err(|e| format!("Error with zip file: {}", e))?;
let mut file = archive.by_name(GENESIS_FILE_NAME).map_err(|e| {
    format!(
        "Error retrieving file {} inside zip: {}",
        GENESIS_FILE_NAME, e
    )
})?;
let mut outfile = File::create(&genesis_ssz_path)
    .map_err(|e| format!("Error while creating file {:?}: {}", genesis_ssz_path, e))?;
io::copy(&mut file, &mut outfile)
    .map_err(|e| format!("Error writing file {:?}: {}", genesis_ssz_path, e))?;

The application accesses the included bytes here (this is where we'd want to do the snappy decompression; there might also be other places where they are accessed):

/// Attempts to deserialize `self.beacon_state`, returning an error if it's missing or invalid.
pub fn beacon_state<E: EthSpec>(&self) -> Result<BeaconState<E>, String> {
    let spec = self.chain_spec::<E>()?;
    let genesis_state_bytes = self
        .genesis_state_bytes
        .as_ref()
        .ok_or("Genesis state is unknown")?;
    BeaconState::from_ssz_bytes(genesis_state_bytes, &spec)
        .map_err(|e| format!("Genesis state SSZ bytes are invalid: {:?}", e))
}

@paulhauner paulhauner added the optimization Something to make Lighthouse run more efficiently. label Aug 2, 2023
@paulhauner
Member Author

Credit to @dapplion for suggesting this in a DM.

@eserilev
Collaborator

eserilev commented Aug 2, 2023

I'd like to work on this. I can start by benchmarking the time it takes to decompress the mainnet/Prater genesis state and post the results here.

@eserilev
Collaborator

eserilev commented Aug 4, 2023

I have a repo here: https://github.com/eserilev/snappy-genesis-benchmark that compresses/decompresses genesis.ssz files for mainnet and prater. I left some notes in the README. To summarize:

On my machine the time it took to decompress genesis.ssz:

mainnet: 1.5s
prater: 9.8s

File sizes for the compressed and decompressed genesis.ssz:

decompressed mainnet: 5.4M
compressed mainnet: 1.8M

decompressed prater: 29.8M
compressed prater: 18.1M

Snappy compression seems to reduce file size by ~50%, while increasing start-up time by potentially tens of seconds.
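For reference, the reductions implied by the sizes above can be worked out directly. This is just arithmetic on the reported figures; the function and constant names are illustrative only.

```rust
/// Percentage by which `after` is smaller than `before`.
fn reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Sizes in MB as reported above: (uncompressed, compressed).
const MAINNET: (f64, f64) = (5.4, 1.8);
const PRATER: (f64, f64) = (29.8, 18.1);

/// Reduction across both files taken together.
fn combined_reduction_percent() -> f64 {
    reduction_percent(MAINNET.0 + PRATER.0, MAINNET.1 + PRATER.1)
}
```

With these figures, mainnet shrinks by roughly 67% and Prater by roughly 39%, or about 43% combined, so "~50%" is a rough average rather than a per-network figure.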

I measured elapsed time using std::time::Instant::now(), which I think should be sufficient. We could do more elaborate benchmarking, but I think that's probably overkill.

I think adding tens of seconds to BN/VC start-up time is a fair trade-off for reducing binary size by ~25M. What do you think?

EDIT: using the release flag when running compression/decompression resulted in far faster times (in the millisecond range)
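A minimal sketch of this kind of measurement, using only `std::time::Instant`. The `timed` helper and the dummy workload are assumptions for illustration; in the benchmark the closure would wrap the actual snappy compress/decompress call instead.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Stand-in workload (NOT snappy): sums the bytes of a buffer.
fn dummy_workload(bytes: &[u8]) -> u64 {
    bytes.iter().map(|b| u64::from(*b)).sum()
}
```

A single wall-clock sample like this is coarse, and as the EDIT above shows, whether the binary was built with `--release` can dominate the result.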

@paulhauner
Member Author

Very interesting @eserilev, thanks!

I'm tempted to go ahead with this. The 1.5s delay seems reasonable for mainnet. The ~10s delay for a Prater node is a bit heavy, but perhaps not a big deal considering it's a testnet.

I'll raise this with some others before making a call. Thanks again!

@paulhauner
Member Author

paulhauner commented Aug 7, 2023

Thinking about this some more, I think there are a few options:

  1. Don't compress any states (the status quo).
  2. Compress all states.
  3. Only compress some states.

We could probably achieve (3) by just detecting the presence of a genesis.ssz.snappy file on the filesystem.

I'm tempted to go with (3) since I'm not really sure that shrinking the binary by ~3.6MB (~3%) is worth adding a 1-2s startup delay to the VC for mainnet. Reducing VC startup delays is good because it reduces the downtime penalty for upgrades; I like users to feel uninhibited to update regularly.

On the other hand, I can see the value in a 10-20MB (~10-20%) reduction by compressing testnet binaries. The startup delay is much less of a concern there.

I'm presently in favour of (3), but I'll raise this internally to get some feedback.
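Option (3) could be sketched roughly as follows. This is only an illustration: the `GenesisSource` enum and `detect_genesis_source` function are hypothetical names, and the real logic would live wherever the build locates the genesis file for each network.

```rust
use std::path::{Path, PathBuf};

/// Which form of the genesis state was found for a network (hypothetical type).
#[derive(Debug, PartialEq)]
enum GenesisSource {
    /// A pre-compressed `genesis.ssz.snappy` exists; embed it and decompress at startup.
    Compressed(PathBuf),
    /// Only the plain `genesis.ssz` exists; embed it as-is.
    Uncompressed(PathBuf),
    /// Neither file is present.
    Missing,
}

/// Prefers the compressed file when present, falling back to the uncompressed one.
fn detect_genesis_source(config_dir: &Path) -> GenesisSource {
    let compressed = config_dir.join("genesis.ssz.snappy");
    let plain = config_dir.join("genesis.ssz");
    if compressed.is_file() {
        GenesisSource::Compressed(compressed)
    } else if plain.is_file() {
        GenesisSource::Uncompressed(plain)
    } else {
        GenesisSource::Missing
    }
}
```

Under this scheme, choosing which networks are compressed is purely a matter of which file is checked in for each network directory.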

@michaelsproul
Member

I get very different results on my machine, which makes me wonder if @eserilev's disk is severely limiting his benchmark:

Time elapsed in compress_genesis_mainnet() is: 7.351209ms
Time elapsed in compress_genesis_prater() is: 35.996708ms
Time elapsed in decompress_genesis_mainnet() is: 4.52925ms
Time elapsed in decompress_genesis_prater() is: 29.038791ms

This is on an M1 Macbook Pro (2021).

@paulhauner
Member Author

After some more research, I've come to the following conclusions:

  • I think the BN only needs to decompress the genesis state at first boot (i.e., when the DB is empty). In other words, we don't pay the decompression cost on a reboot.
  • I don't think the VC ever needs to decompress the genesis state.
  • If possible, I think we should make this Vec<u8> a &'static [u8]. Duplication of the state in memory is wasteful and could be up to 10% of VC memory usage.

If my first two points turn out to be correct (this is something that would be determined during implementation), then I am fine to just compress all states, especially if Michael's timings turn out to be closer to reality for most users.

On another note, I've noticed that Eth2NetworkConfig::beacon_state isn't the single, canonical place where we access the genesis_state_bytes. Rather, those bytes tend to be accessed directly and passed around the application. I'd be tempted to create a new-type wrapper around those (now compressed) bytes which provides functions for compression/decompression. That's up to the implementer, though ☺️
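A rough sketch of such a new-type wrapper, assuming the embedded bytes become a `&'static [u8]`. The names `CompressedGenesisBytes` and `decompress_with` are hypothetical. To keep the sketch dependency-free, the decoder is passed in as a closure; in practice it would presumably be a snappy decoder.

```rust
/// Wraps compressed genesis state bytes embedded in the binary (hypothetical type).
#[derive(Clone, Copy, Debug)]
struct CompressedGenesisBytes(&'static [u8]);

impl CompressedGenesisBytes {
    /// Size of the compressed payload as stored in the binary.
    fn compressed_len(&self) -> usize {
        self.0.len()
    }

    /// Decompresses on demand using the supplied decoder. Because the raw bytes
    /// are only reachable through this method, callers cannot mistake the
    /// compressed payload for SSZ.
    fn decompress_with<F>(&self, decoder: F) -> Result<Vec<u8>, String>
    where
        F: FnOnce(&[u8]) -> Result<Vec<u8>, String>,
    {
        decoder(self.0).map_err(|e| format!("Failed to decompress genesis state: {}", e))
    }
}
```

The wrapper is `Copy` because it only holds a reference, so passing it around the application does not duplicate the state in memory.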

@eserilev
Collaborator

eserilev commented Aug 7, 2023

Thanks for taking another look at this Michael, glad to hear it's running faster on other machines. I'm on a relatively beefy 2021 M1 Max, so I wonder what could be limiting my compression/decompression times this drastically.

Thanks for the additional write up Paul, I think I have a good starting point to begin working here.

@paulhauner
Member Author

> I wonder what could be limiting my compression/decompression times this drastically

There's a "lower power mode" (you can Spotlight search that phrase) which can reduce compute speeds. I'd be surprised if it were to make that much of a difference though.

@michaelsproul
Member

@eserilev Did you run the benchmark with release optimisations? Like cargo run --release?

@eserilev
Collaborator

eserilev commented Aug 7, 2023

> @eserilev Did you run the benchmark with release optimisations? Like cargo run --release?

Ah! that was the issue. With the release flag these are my results:

Time elapsed in compress_genesis_mainnet() is: 5.509875ms
Time elapsed in compress_genesis_prater() is: 30.911709ms
Time elapsed in decompress_genesis_mainnet() is: 4.2995ms
Time elapsed in decompress_genesis_prater() is: 20.129416ms

@pk910
Contributor

pk910 commented Aug 17, 2023

Heya guys,

I really like the idea of compressing the genesis states.
Did you already think about how to proceed with the holesky genesis?

The genesis state for holesky will be >190MB uncompressed.
Even with the compression that doesn't sound like it can be packed into the executable.
So given that, it might be reasonable to not pack testnet states into the executable at all, but load them from an external webserver/github/whatever?

@paulhauner
Member Author

paulhauner commented Aug 17, 2023

> it might be reasonable to not pack testnet states into the executable at all, but load them from an external webserver/github/whatever?

We used to pull genesis states from GitHub, however we had users having trouble accessing GitHub (IIRC it was primarily users in China). That's why we started including states in the binary.

I haven't done the numbers on Holesky, but if it will be >190MB uncompressed then we might need to consider going back to downloading genesis states at startup. To address the issues with GitHub access, we could:

  • Provide a GitHub (or whatever) URL by default (or perhaps a list of URLs).
  • Add a --genesis-state-url flag which can take an alternative URL.
  • Add the genesis state root to our binaries and verify that root whenever we download state.

With that approach we could instruct users to supply an alternate --genesis-state-url if the default approach is unreliable.
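The flow described above might look something like this. It is a sketch only: `fetch` and `verify` are injected closures standing in for the HTTP download and the genesis-state-root check, and `download_verified_state` is a hypothetical name.

```rust
/// Tries each URL in order and returns the first downloaded state that passes
/// verification. `fetch` and `verify` are placeholders for the real HTTP
/// client and the comparison against the root baked into the binary.
fn download_verified_state<F, V>(urls: &[&str], fetch: F, verify: V) -> Result<Vec<u8>, String>
where
    F: Fn(&str) -> Result<Vec<u8>, String>,
    V: Fn(&[u8]) -> bool,
{
    let mut errors = Vec::new();
    for &url in urls {
        match fetch(url) {
            // Accept the first state that both downloads and verifies.
            Ok(bytes) if verify(&bytes) => return Ok(bytes),
            Ok(_) => errors.push(format!("{}: state failed verification", url)),
            Err(e) => errors.push(format!("{}: {}", url, e)),
        }
    }
    Err(format!("No valid genesis state found: {}", errors.join("; ")))
}
```

Because every download is verified against a value shipped in the binary, the URLs themselves do not need to be trusted, which is what makes a user-supplied alternative safe.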

@paulhauner paulhauner added the v4.4.1 ETA August 2023 label Aug 17, 2023
@eserilev
Collaborator

Would we be hosting the compressed genesis files? It could reduce download times at start-up compared to downloading the uncompressed genesis state.

@dapplion
Collaborator

This could be a nice initiative to extend checkpointz. It should not be too difficult, since that infra can already serve states; it would just need to expose another one.

@paulhauner
Member Author

> would we be hosting the compressed genesis files?

Yep, that sounds like a good idea to me!

@paulhauner
Member Author

FYI we're expecting to release v4.4.0 on/around the 31st of August. The primary goal of v4.4.0 is to add support for --network holesky. The Holesky genesis state doesn't exist yet, but I expect to see it at any time now.

I think that this issue (state compression and downloading) is going to be critical for that release. @eserilev I'm happy for you to take this issue if you'd like to (you've done lots of great work for LH), but I'd like to give you the option to pass if you're not comfortable with the time pressure. Could you please let me know if you'd still like to tackle this issue? No pressure either way ☺️

@eserilev
Collaborator

@paulhauner no problem, I'll get a PR up for review shortly.

@barnabasbusa

Genesis state now exists and it is 198MB.

@paulhauner
Member Author

I'm pushing this to the next release since we have #4653 which adds Holesky.

@paulhauner paulhauner added v4.5.0 ETA Q4 2023 and removed v4.4.1 ETA August 2023 labels Aug 24, 2023
bors bot pushed a commit that referenced this issue Aug 28, 2023
## Issue Addressed

NA

## Proposed Changes

Add the Holesky network config as per https://github.com/eth-clients/holesky/tree/36e4ff2d5138dcb2eb614f0f60fdb060b2adc1e2/custom_config_data.

Since the genesis state is ~190MB, I've opted to *not* include it in the binary and instead download it at runtime (see #4564 for context). To download this file we have:

- A hard-coded URL for a SigP-hosted S3 bucket with the Holesky genesis state. Assuming this download works correctly, users will be none the wiser that the state wasn't included in the binary (apart from some additional logs)
- If the user provides a `--checkpoint-sync-url` flag, then LH will download the genesis state from that server rather than our S3 bucket.
- If the user provides a `--genesis-state-url` flag, then LH will download the genesis state from that server regardless of the S3 bucket or `--checkpoint-sync-url` flag.
- Whenever a genesis state is downloaded it is checked against a checksum baked into the binary.
- A genesis state will never be downloaded if it's already included in the binary.
- There is a `--genesis-state-url-timeout` flag to tweak the timeout for downloading the genesis state file.
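
The precedence described in the list above can be summarised in a few lines. This is an illustrative sketch rather than the code in this change; `select_genesis_state_url` and its parameters are hypothetical names.

```rust
/// Picks the URL to download the genesis state from, or `None` when the state
/// is already embedded in the binary or no URL is known.
fn select_genesis_state_url<'a>(
    state_in_binary: bool,
    genesis_state_url: Option<&'a str>,   // value of --genesis-state-url
    checkpoint_sync_url: Option<&'a str>, // value of --checkpoint-sync-url
    default_url: Option<&'a str>,         // hard-coded bucket for the network, if any
) -> Option<&'a str> {
    if state_in_binary {
        // Never download a state that ships with the binary.
        return None;
    }
    // The explicit flag wins, then the checkpoint sync server, then the default.
    genesis_state_url.or(checkpoint_sync_url).or(default_url)
}
```

A `None` result for a network whose state is not in the binary corresponds to the "no known download URLs" error shown in the log output.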

## Log Output

Example of log output when a state is downloaded:

```
Aug 23 05:40:13.424 INFO Logging to file                         path: "/Users/paul/.lighthouse/holesky/beacon/logs/beacon.log"
Aug 23 05:40:13.425 INFO Lighthouse started                      version: Lighthouse/v4.3.0-bd9931f+
Aug 23 05:40:13.425 INFO Configured for network                  name: holesky
Aug 23 05:40:13.426 INFO Data directory initialised              datadir: /Users/paul/.lighthouse/holesky
Aug 23 05:40:13.427 INFO Deposit contract                        address: 0x4242424242424242424242424242424242424242, deploy_block: 0
Aug 23 05:40:13.427 INFO Downloading genesis state               info: this may take some time on testnets with large validator counts, timeout: 60s, server: https://sigp-public-genesis-states.s3.ap-southeast-2.amazonaws.com/
Aug 23 05:40:29.895 INFO Starting from known genesis state       service: beacon
```

Example of log output when there are no URLs specified:

```
Aug 23 06:29:51.645 INFO Logging to file                         path: "/Users/paul/.lighthouse/goerli/beacon/logs/beacon.log"
Aug 23 06:29:51.646 INFO Lighthouse started                      version: Lighthouse/v4.3.0-666a39c+
Aug 23 06:29:51.646 INFO Configured for network                  name: goerli
Aug 23 06:29:51.647 INFO Data directory initialised              datadir: /Users/paul/.lighthouse/goerli
Aug 23 06:29:51.647 INFO Deposit contract                        address: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b, deploy_block: 4367322
The genesis state is not present in the binary and there are no known download URLs. Please use --checkpoint-sync-url or --genesis-state-url.
```

## Additional Info

I tested the `--genesis-state-url` flag with all 9 Goerli checkpoint sync servers on https://eth-clients.github.io/checkpoint-sync-endpoints/ and they all worked 🎉 

My IDE eagerly formatted some `Cargo.toml`. I've disabled it but I don't see the value in spending time reverting the changes that are already there.

I also added the `GenesisStateBytes` enum to avoid an unnecessary clone on the genesis state bytes baked into the binary. This is not a huge deal on Mainnet, but will become more relevant when testing with big genesis states.
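
As an illustration of the idea (not necessarily how the enum in this change is defined), such a type could distinguish borrowed bytes baked into the binary from owned bytes obtained at runtime. The variant and method names below are assumptions.

```rust
/// Genesis state bytes that are either borrowed from the binary or owned
/// after a download. Variant names are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum GenesisStateBytes {
    /// Bytes embedded at compile time (e.g. via `include_bytes!`); cloning
    /// this variant copies only a pointer and a length.
    Embedded(&'static [u8]),
    /// Bytes obtained at runtime, e.g. downloaded from a URL.
    Downloaded(Vec<u8>),
}

impl GenesisStateBytes {
    /// Borrows the bytes without copying, whichever variant holds them.
    fn as_slice(&self) -> &[u8] {
        match self {
            GenesisStateBytes::Embedded(bytes) => *bytes,
            GenesisStateBytes::Downloaded(bytes) => bytes.as_slice(),
        }
    }
}
```

Consumers that only need to read the state can take `as_slice()` and never trigger a copy of a large embedded state.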

When we do a fresh checkpoint sync we're downloading the genesis state to check the `genesis_validators_root` against the finalised state we receive. This is not *entirely* pointless, since we verify the checksum when we download the genesis state so we are actually guaranteeing that the finalised state is on the same network. There might be a smarter/less-download-y way to go about this, but I've run out of cycles to figure that out. Perhaps we can grab it in the next release?
jxs pushed a commit to jxs/lighthouse that referenced this issue Aug 28, 2023
@paulhauner paulhauner removed the v4.5.0 ETA Q4 2023 label Sep 20, 2023
@paulhauner paulhauner added the v4.6.0 ETA Q1 2024 label Sep 20, 2023
@michaelsproul michaelsproul removed the v4.6.0 ETA Q1 2024 label Dec 15, 2023
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this issue Jan 6, 2024
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this issue Jan 6, 2024