Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(yp): Palla/yp review contract deployment #4249

Merged

Conversation

iAmMichaelConnor
Copy link
Contributor

Please read contributing guidelines and remove this line.

@iAmMichaelConnor iAmMichaelConnor changed the title Palla/yp review contract deployment docs(yp): Palla/yp review contract deployment Jan 28, 2024
@AztecBot
Copy link
Collaborator

Benchmark results

Metrics with a significant change:

  • note_trial_decrypting_time_in_ms (32): 27.7 (-79%)
Detailed results

All benchmarks are run on txs on the Benchmarking contract on the repository. Each tx consists of a batch call to create_note and increment_balance, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write.

This benchmark source data is available in JSON format on S3 here.

Values are compared against data from master at commit 60b59dac and shown if the difference exceeds 1%.

L2 block published to L1

Each column represents the number of txs on an L2 block published to L1.

Metric 8 txs 32 txs 128 txs
l1_rollup_calldata_size_in_bytes 45,444 179,588 716,132
l1_rollup_calldata_gas 222,924 867,860 3,449,888
l1_rollup_execution_gas 936,257 3,970,523 24,793,613
l2_block_processing_time_in_ms 1,336 5,116 20,953 (-1%)
note_successful_decrypting_time_in_ms 322 (+1%) 1,002 3,638
note_trial_decrypting_time_in_ms 32.1 (+7%) ⚠️ 27.7 (-79%) 137 (-3%)
l2_block_building_time_in_ms 14,980 59,803 238,174
l2_block_rollup_simulation_time_in_ms 10,638 42,624 (+1%) 169,912
l2_block_public_tx_process_time_in_ms 4,308 17,104 67,991

L2 chain processing

Each column represents the number of blocks on the L2 chain where each block has 16 txs.

Metric 5 blocks 10 blocks
node_history_sync_time_in_ms 15,449 (+3%) 29,586 (+3%)
note_history_successful_decrypting_time_in_ms 2,497 (+1%) 4,935 (+8%)
note_history_trial_decrypting_time_in_ms 78.6 (+1%) 153 (+9%)
node_database_size_in_bytes 3,616,695 3,730,112
pxe_database_size_in_bytes 29,923 59,478

Circuits stats

Stats on running time and I/O sizes collected for every circuit run across all benchmarks.

Circuit circuit_simulation_time_in_ms circuit_input_size_in_bytes circuit_output_size_in_bytes
private-kernel-init 212 44,709 25,561
private-kernel-ordering 139 (+1%) 30,953 14,809
base-rollup 1,081 128,374 881
root-rollup 82.9 4,088 889
private-kernel-inner 288 71,236 25,561
public-kernel-private-input 203 31,891 25,561
public-kernel-non-first-iteration 201 31,933 25,561
merge-rollup 9.80 (+1%) 2,608 881

Tree insertion stats

The duration to insert a fixed batch of leaves into each tree type.

Metric 1 leaves 2 leaves 8 leaves 16 leaves 32 leaves 64 leaves 128 leaves 512 leaves 1024 leaves 2048 leaves 8192 leaves
batch_insert_into_append_only_tree_ms 11.2 (+2%) 92.8 (+9%) 15.5 (+6%) 18.1 (+2%) 23.8 (-1%) 49.9 (+1%) 94.0 (+7%) 243 487 (+2%) 909 3,546
batch_insert_into_indexed_tree_ms N/A N/A N/A 64.7 (+1%) N/A 59.7 (+1%) 117 (+6%) 360 743 (+2%) 1,402 5,541 (-4%)

Miscellaneous

Transaction sizes based on how many contracts are deployed in the tx.

Metric 0 deployed contracts 1 deployed contracts
tx_size_in_bytes 15,487 31,904

Transaction processing duration by data writes.

Metric 0 new commitments 1 new commitments
tx_pxe_processing_time_ms 346 (+1%) 969
Metric 1 public data writes
tx_sequencer_processing_time_ms 531

public_keys_hash = pedersen([nullifier_pubkey, tagging_pubkey, incoming_view_pubkey, outgoing_view_pubkey], GENERATOR__PUBLIC_KEYS)
```

<!-- TODO(@iamMichaelConnor): Can we just hash the `x` coordinate for each pubkey and be done with it, for the sake of the hashing? Or should we also include the `y` coordinate? Or at least the sign bit? If we include the sign bit, does it make sense to "compress" all the sign bits in a single field, or is it better to expand it? -->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using just the x coordinate would make me nervous, given a (now-patched) Aztec connect bug which tried to use just an x-coordinate. It might be safe, but it would need thought, so for now we should prioritise security over efficiency.
I suppose the signed bits could be compressed into a single field. We'd need to compare the efficiency of:

  • Hashing 8 fields (4 points), vs
  • Hashing 5 fields (4 x-coords, and 1 sign-field (Seinfeld?!); bit-decomposing the sign-field; possibly range-checking the sign-field to be <= 4; computing the value of y; and computing the correctly signed y.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just restrict x coordinates to all have the same "sign", similar to how it's done in Ethereum-land during ecrecover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite possibly. We'll need to find a way to convey such optimisation proposals to the crypto research team, in such a way that they have enough background to comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion in Slack here

<!-- prettier-ignore -->
| Field | Type | Description |
|----------|----------|----------|
| `function_selector` | `u32` | Selector of the function. Calculated as the hash of the method name and parameters. The specification of this is left to the compiler and not enforced by the protocol. |
| `function_selector` | `u32` | Selector of the function. Calculated as the hash of the method name and parameters. The specification of this is not enforced by the protocol. |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might a salt still be useful?
If someone uses an off-the-shelf private function, observers of the broadcast function leaves would be able to learn (through a dictionary/rainbow attack) the private functions which have been deployed.
Or would the function leaves not be broadcast if someone wants to keep their private functions private?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just made me realize there's no need to broadcast private function selector and vkhash in the ContractClassRegisterer. As long as we broadcast the merkle root, we're good.

The point here is that any user who wants to send a private tx that goes through a private function in a contract will need the contract artifact (which includes the bytecode) anyway, so broadcasting the function selector and vkhash doesn't add anything. The only important thing to broadcast during class registration is public bytecode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point!

yellow-paper/docs/contract-deployment/classes.md Outdated Show resolved Hide resolved
yellow-paper/docs/contract-deployment/classes.md Outdated Show resolved Hide resolved

## Public Deployment
Private functions in the contract should require a merkle membership proof for the Initialization Nullifier, to prevent them from being called before the constructor is invoked. Nevertheless, a contract may choose to allow some functions to be called before initialization, such as in the case of [Diversified and Stealth account contracts](../addresses-and-keys/diversified-and-stealth.md). Public functions, on the other hand, should not need this check, since the Public Kernel Circuit guarantees that the contract is Publicly Deployed, and a contract cannot be Publicly Deployed without being Initialized. Note that every Contract Class needs to have at least one constructor if an instance of it is intended to be Publicly Deployed.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that every Contract Class needs to have at least one constructor if an instance of it is intended to be Publicly Deployed.

Is this true? Is it not possible for the ContractInstanceDeployer to emit an Initialisation Nullifier where the underlying initialization_hash is 0?

Copy link
Collaborator

@spalladino spalladino Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because the initialization hash is emitted from the contract itself, not from the Deployer.

Still, we can remove the restriction that "a contract needs to be initialized in order to be publicly deployed" as suggested in "Allowing Uninitialized Publicly Deployed Instances", which simplifies things (for us).

What should `contract_args_hash` be if there is no constructor? `0`?
Q: why "contract_args_hash" and not "constructor_args_hash"?
-->
Note the absence of a deployer field. This is because the deployer address is optionally mixed into the `salt` as part of the deployment logic, implemented by the `ContractInstanceDeployer` as described below.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this as an optimisation to reduce hashing?
Readers might get confused that there's no "deployer address", but then most conventions described below describe a deployer address being mixed with the salt. It might be clearer to keep both separate concepts (deployer and salt)?
Or maybe I'm just not appreciating why this has been done

Copy link
Collaborator

@spalladino spalladino Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. It was meant as an optimization, since deployer was not enforced by the protocol at all, but rather by the deployer contract, and it could be mixed-in with the salt. But it ends up being more confusing, so better just keep it as a separate field. I'll add it back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the deployer again to the struct, which gets hashed along with the init_args and salt when computing the address:

salted_initialization_hash = pedersen([salt, initialization_hash, deployer as Field, portal_contract_address as Field], GENERATOR__SALTED_INITIALIZATION_HASH)
partial_address = pedersen([contract_class_id, salted_initialization_hash], GENERATOR__CONTRACT_PARTIAL_ADDRESS_V1)
address = pedersen([public_keys_hash, partial_address], GENERATOR__CONTRACT_ADDRESS_V1)

The deployer address of a contract instance is used to restrict who can initialize the contract (ie call its constructor) and who can publicly deploy it. Note that neither of these checks are enforced by the protocol: the initialization is checked by the constructor itself, and the deployment by the ContractInstanceDeployer. Furthermore, a contract class may choose to not enforce this restriction by removing the check from the constructor.

The deployer address can be set to zero to signal that anyone can initialize or publicly deploy an instance.

yellow-paper/docs/contract-deployment/instances.md Outdated Show resolved Hide resolved

However, this opens the door for a contract to be simultaneously Publicly Deployed and Uninitialized, which is a state that does not seem to map to a valid use case. Furthermore, it requires public functions to check the Initialization Nullifier on every call, which in the current approach is not needed as the presence of the Deployment Nullifier checked by the Public Kernel is enough of a guarantee that the contract was initialized.

<!-- To @iAmMichaelConnor: I am tempted to actually implement this approach in spite of its drawbacks. It makes responsibilities so much more clearly defined: Deployer deploys, Initializer initializes, and there is no link between the two. It also allows removing constructors when they are not needed. And the added cost of checking the initialization nullifier from public functions doesn't seem too terrible. -->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be a bit confused. Is the purpose of "deploy" solely to enforce the bytecode to be broadcast?

If so, I suppose the sequencer would only need to check for a deployment nullifier if it doesn't exist in the tree. If the deployment nullifier does exist, the sequencer doesn't need to prove that fact, because they clearly have the bytecode and they can prove this bytecode is correct via the initialisation nullifier.
If the merkle membership/non-membership check for the deployment nullifier is done in the kernel circuit, then this observation doesn't help much: the kernel circuit is a fixed circuit with no branching, so the hashing will always happen. But if there's a way for the hashing to be done within the AVM, then the sequencer can skip the merkle membership proof unless they want to claim gas fees for a non-deployed contract.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the purpose of "deploy" solely to enforce the bytecode to be broadcast?

Yep!

If the deployment nullifier does exist, the sequencer doesn't need to prove that fact, because they clearly have the bytecode and they can prove this bytecode is correct via the initialisation nullifier.

Not sure I agree. Even if the sequencer can prove that they know the bytecode, it's important they prove that everyone knows the bytecode of the public function being executed. Otherwise, we open the door to public functions that can only be run by a subset of nodes that are privy to that bytecode, which doesn't feel right.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the merkle membership/non-membership check for the deployment nullifier is done in the kernel circuit, then this observation doesn't help much: the kernel circuit is a fixed circuit with no branching, so the hashing will always happen.

Good point! Given this, I'm even more tempted to go down this route and completely split initialization and deployment.

But if there's a way for the hashing to be done within the AVM, then the sequencer can skip the merkle membership proof unless they want to claim gas fees for a non-deployed contract.

Not sure. Feels doable, but it also feels like we're pushing into the AVM a responsibility that is very clearly the kernel's.

@iAmMichaelConnor
Copy link
Contributor Author

@spalladino Thanks for making these changes. I went through and wrote some comments, and asked some questions :)

@rahul-kothari
Copy link
Contributor

Some unsolicited feedback as I was learning about this:
Unclear why version and registerer_address and many of other fields are actually needed. For example - how is the version used/why is it needed? Is it for upgrades? If so how does upgrades work for classes.

Mat have missed it, but the section should probably cover account contracts and how they would work.

I haven't fully digested our forum discussion on this but it felt like that had some things that aren't mentioned in the YP 🤷

Btw Mike told me that "You can use a contract before deploying if it doesn't have a constructor" - this isn't immediately obvious after reading the YP. Whether it deserves to be in the YP is a different matter.

@spalladino
Copy link
Collaborator

Unclear why version and registerer_address and many of other fields are actually needed. For example - how is the version used/why is it needed? Is it for upgrades? If so how does upgrades work for classes.

Addressed in a new commit!

Mat have missed it, but the section should probably cover account contracts and how they would work.

Hmm disagree, since account contracts are deployed same as any other contract. Still, I'm not sure what's the best place for account contracts to be discussed in the yellow paper (assuming they have to be included).

I haven't fully digested our forum discussion on this but it felt like that had some things that aren't mentioned in the YP 🤷

Can you elaborate on which ones?

Btw Mike told me that "You can use a contract before deploying if it doesn't have a constructor" - this isn't immediately obvious after reading the YP. Whether it deserves to be in the YP is a different matter.

That's covered in the "Statuses" section of the instances doc, but added a few more sentiences making sure it's clear.

@spalladino spalladino merged commit 69b0933 into mc/yp-review-contract-deployment Jan 30, 2024
83 checks passed
@spalladino spalladino deleted the palla/yp-review-contract-deployment branch January 30, 2024 17:47
@spalladino spalladino restored the palla/yp-review-contract-deployment branch January 30, 2024 17:47
spalladino added a commit that referenced this pull request Jan 30, 2024
Please read [contributing guidelines](CONTRIBUTING.md) and remove this
line.

---------

Co-authored-by: Santiago Palladino <[email protected]>
@ludamad ludamad deleted the palla/yp-review-contract-deployment branch August 22, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants