-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(consensus): add L2 contracts VM storage reader (BFT-434) #2416
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few high-level comments. I understand that the PR is still in draft, but the comments are mostly about design.
consensus_authority_contract: Contract, | ||
validator_registry_address: Option<Address>, | ||
validator_registry_contract: Contract, | ||
attester_registry_address: Option<Address>, | ||
attester_registry_contract: Contract, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need 3 dedicated contracts for consensus? Our contracts are already very sophisticated, so my fear is that it would complicate them even more.
I do not have concrete proposals, but I strongly suggest discussing the contracts design with @vladbochok and @StanislavBreadless
Not actionable in this PR I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's my fault. I did the design for the contracts and thought that smaller contracts would be preferred. They can be merged into a single larger contract though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to contracts here: matter-labs/era-contracts@0c2498f
let consensus_authority_contract = load_contract("contracts/l2-contracts/artifacts-zk/contracts/ConsensusAuthority.sol/ConsensusAuthority.json"); | ||
let validator_registry_contract = load_contract("contracts/l2-contracts/artifacts-zk/contracts/ValidatorRegistry.sol/ValidatorRegistry.json"); | ||
let attester_registry_contract = load_contract("contracts/l2-contracts/artifacts-zk/contracts/AttesterRegistry.sol/AttesterRegistry.json"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zksync_contracts
crate currently serves (somewhat) as a schema for our FS dependencies, so probably it would be better to load the contracts there.
You can create a module for consensus_l2_contracts
and load them via pub fn load_consensus_l2_contracts() -> Option<ConsensusContracts>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed here: 5a925c6
let num_committee_validators = self.read_num_committee_validators(ctx, block_id).await; | ||
for i in 0..num_committee_validators { | ||
let committee_validator = self.read_committee_validator(ctx, block_id, i).await; | ||
let validator = self | ||
.read_validator(ctx, block_id, committee_validator.0) | ||
.await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seem to instantiate a new DB connection and VM instance for each read request, and it's a somewhat expensive operation. (Actually likely several ones, because tx_sender
would also create connections internally).
If we want to actually run the VM, I'd expect us to do it efficiently, roughy:
let vm = self.instantiate_vm().await;
let num_committee_validators = self.read_num_committee_validators(&vm, block_id).await;
// ^ run on an already instantiated VM.
// ... same for the rest of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was just a PoC.
It was fixed in a8779b5, so that only one connection will be acquired per read session. It does not however (yet) re-use that connection within the execution sandbox. The connection is only used to load block args.
) -> (usize, Vec<u8>, Vec<u8>, bool) { | ||
let func = self | ||
.validator_registry_contract | ||
.function("validators") | ||
.unwrap() | ||
.clone(); | ||
let tx = self.gen_l2_call_tx( | ||
self.validator_registry_address.unwrap(), | ||
func.encode_input(&[Token::Address(node_owner)]).unwrap(), | ||
); | ||
let res = self.eth_call(ctx, block_id, tx).await; | ||
let tokens = func.decode_output(&res).unwrap(); | ||
( | ||
tokens[0].clone().into_uint().unwrap().as_usize(), | ||
tokens[1].clone().into_bytes().unwrap(), | ||
tokens[2].clone().into_bytes().unwrap(), | ||
tokens[3].clone().into_bool().unwrap(), | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably it would be better to implement zksync_types::web3::contract::Tokenize
on all the required structures instead of mixing parsing with the business logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes here: 477ea5e
let block_args = zksync_node_api_server::execution_sandbox::BlockArgs::new( | ||
&mut conn, | ||
block_id, | ||
&start_info, | ||
) | ||
.await | ||
.unwrap(); | ||
let call_overrides = CallOverrides { | ||
enforced_base_fee: None, | ||
}; | ||
|
||
let res = self | ||
.tx_sender | ||
.eth_call(block_args, call_overrides, tx) | ||
.await | ||
.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to consider running VM to query contracts (as opposed to pre-calculating storage slots and reading storage directly), I think we need to do quite some preparation work:
- Extract
execution_vm
into a separate, general-purpose crate. - Change the sandbox so that it can perform multiple requests on a single instance.
- Decouple it from
TxSender
.
Right now, tx_sender
and dependency on zksync_node_api_server
feel like a pretty huge hack, and I'm not sure if it's a good idea to depend on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can certainly see pros of using VM with eth_call
, but I believe that implementing it properly would take a lot of engineering efforts. As you can see, the first try is implemented in a hacky way, and making a proper implementation (e.g. detaching sandbox from API) is a complex task even for seasoned platform developers. I would really like to not use hacky ways, as they may backfire, especially given that consensus will be a critical component for the sequencer.
On the other hand, reading storage directly will never look as "good", but it's relatively easy to implement and cover with unit tests (e.g. change value via contract method, then read from storage directly) to make sure that the storage layout doesn't change.
If we can afford to spend time on extracting the sandbox from API, I'm all for it. If not, I would rather prefer direct storage access than hacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the current solution is a workaround/hack as per the code dependency graph and the lack of execution sandbox instance re-use.
However, falling back to direct storage reads is worse IMHO.
The reads are not targeting U256 variables in which their storage slots’ keys can be easily pre-calculated, be fetched and decoded. It involves dynamic-sized arrays, structs, and dynamic-sized arrays within the structs. This implies packing/padding, which means to re-implement the VM codec.
Meanwhile, I couldn’t find any relevant documentation, and there’s very limited code examples (mostly in zksync_types::utils
), for much simpler use-cases.
I went ahead to experiment with it, and couldn’t even retrieve a simple U256 variable slot, following canonical solc
storage layout. I might have overlooked something, but couldn’t reverse-engineer the slots back to the pre-hashed keys. In the overall, I don’t think it’s a good idea to go forward with this given the complex data types involved.
Also, I don’t think such approach is testable, because the layout is supposedly an implementation detail, unless there’s a clear open available specs, backwards compatibility, and lack of potential optimizations.
Read-transactions are the standard way to read from the VM storage database, while incapsulating the layout and the codec. The consensus component functions as a standard web3 client in this regard, and having direct access to the database does not imply it should use it to bypass the standard interface.
I suggest to continue with the current approach, and to implement your suggestions for the execution sandbox decoupling, in a follow-up PR. (see BFT-493)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. That was also the protocol teams's opinion, that we should avoid direct storage reads if at all possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind as long as refactoring will really be planned & executed.
url = https://github.com/matter-labs/era-contracts | ||
branch = consensus_contracts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JFYI: I don't think the PR is mergeable until we can merge the contracts to main (e.g. we cannot live with a separate branch in submodules).
I think we should start a discussion with the protocol team on how to merge these changes to main
without having to finish/audit them, and meanwhile minimize risks of them somehow affecting production environments.
I have a few ideas, let's discuss in Slack. cc @RomanBrodetski @brunoffranca
self.tx_sender | ||
.eth_call(block_args, call_overrides, tx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing the discussion from this comment.
tx_sender.eth_call
will instantiate a connection, then VM, fetch a bunch of stuff from Postgres, and only then perform an eth_call
. So e.g. in read_attester_committee
method above you will create attester_committee_size + 1
connections sequentially, with quite some overhead.
I'm still concerned that the performance here may be suboptimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, connections are acquired per-call by the execution sandbox. This is indeed suboptimal, and should be part of the needed refactoring. (BFT-493)
/// A struct for reading data from consensus L2 contracts. | ||
pub struct VMReader { | ||
pool: ConnectionPool, | ||
tx_sender: TxSender, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general observation, not sure if it's a concern. TxSender
uses replica DB to read state, so there are a few problems here:
- Replication lag may slow the consensus down.
- Replica DBs are used by API, and the load there is much higher, so the latency may be less predictable than with main DB.
I guess this can be "quickly solved" by pointing replica DB url to main DB on the components where core runs, but it feels even more of a hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once execution sandbox will be decoupled, we could use it with main DB connection pool, and no changes to TxSender
would be required.
@@ -1,18 +1,24 @@ | |||
//! Storage test helpers. | |||
|
|||
use anyhow::Context as _; | |||
use zksync_concurrency::{ctx, error::Wrap as _, time}; | |||
use zksync_concurrency::{ctx, ctx::Ctx, error::Wrap as _, time}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: don't import ctx::Ctx, use the qualified name.
&mut self, | ||
ctx: &Ctx, | ||
owner: Address, | ||
nodes: &[&[Token]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the conversion be done inside the function, passing &[&[Token]] is very loosely typed.
.function("setAttesterCommittee") | ||
.unwrap() | ||
.short_signature() | ||
.to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all these methods wrapping raw contract API deserve a dedicated module.
let tx = self.gen_tx_add(®istry_contract.contract, deploy_tx.address, node); | ||
txs.push(tx); | ||
} | ||
txs.push( | ||
self.gen_tx_set_validator_committee(deploy_tx.address, ®istry_contract.contract), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any rason why the contract
and the address
parameters are in opposite order?
contract: &Contract, | ||
) -> Transaction { | ||
let calldata = contract | ||
.function("setValidatorCommittee") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see this method in matter-labs/era-contracts#555 ; is it commitValidators ?
U256::from(rng.gen::<usize>()), | ||
(0..256).map(|_| rng.gen()).collect::<Vec<u8>>(), | ||
) | ||
.into_tokens(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could use abigen! to generate bindings for all the Solidity structs and then you get the tokenization with static typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't migrate to ethers
just yet (and in fact it's already deprecated in favor of alloy
), so using it is not recommended for now. We want the migration to be done consistently, otherwise, we'll end up with a mix of our vendored web3
remains and some of the newer stuff.
let func = self | ||
.registry_contract | ||
.function("validatorCommitteeSize") | ||
.unwrap() | ||
.clone(); | ||
|
||
let tx = self.gen_l2_call_tx(self.registry_address, func.short_signature().to_vec()); | ||
|
||
let res = self.eth_call(block_args, tx).await; | ||
|
||
let tokens = func.decode_output(&res).context("decode_output()")?; | ||
U256::from_tokens(tokens) | ||
.context("U256::from_tokens()") | ||
.map(|t| t.as_usize()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be feasible to do a bit more encapsulation? For example to create a DeployedContract
or similar type which has an Address
and a Contract
, then wrap that into a RegistryContract
which has methods for these that take whatever is needed to do the eth_call
but otherwise know which function to call and how to parse the result? Maybe we can use the abigen!
macro to generate the bindings if the ABI JSON is available at compile time?
The VMReader
sounds more general than just having all these methods for this particular contract, even if it's the only one at the moment.
I did something similar here where a gateway contract implemented multiple facets, each with their own binding, each represented by a contract caller that took the equivalent of the TxSender
as an input. I suppose that gateway is similar to the VMReader
, but a lot of the boilerplate and the private methods here could be delegated to contract specific structs, leaving only the highest level stuff in the reader.
let tx = self.gen_l2_call_tx( | ||
self.registry_address, | ||
func.encode_input(&zksync_types::U256::from(idx).into_tokens()) | ||
.unwrap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These unwraps could be at least expect
with the method name to give us some context if they fail, in case we can't use bindings.
let func = self | ||
.registry_contract | ||
.function("attesterCommitteeSize") | ||
.unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A contract wrapper could provide a method to get the function by name and make sure if it's not found the error explains what it was looking for.
pub pop: Vec<u8>, | ||
} | ||
|
||
impl Detokenize for CommitteeValidator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an attribute in ethers
that should work here even without generating the bindings. Here's an example.
closing this pr, I'll continue this work in #2684 |
What ❔
Adding
eth_call
-based storage reader for consensus L2 contracts.Why ❔
Enable consensus executor to read from L2 storage per specific block.
Notes
consensus::storage::VMReader
added to incapsulate read-transactions dispatching against L2 storage.consensus::storage::testonly::VMWriter
added in order to simulate contracts deployments and data initialization in unit tests.consensus::storage::tests::test_vm_reader()
test added.TODOs
contracts
git submodule custom branch reference once feat(consensus): add L2 registry contract (BFT-434) era-contracts#555 is merged.Checklist
zk fmt
andzk lint
.