Compile time blowup when calling push_back repeatedly #4395
Comments
Updates the `ContractDeployer` to use the new deployment flow. Instead of creating txs flagged as `isDeployment`, the deployer now creates a batch call that registers the contract class (if needed), publicly deploys the contract via the canonical `InstanceDeployer`, and then initializes it by calling the constructor. This batch call is then sent via an account contract entrypoint and executed in a single tx. Check out the `e2e_deploy` test _publicly deploys and initializes a contract_ to see it in action.

However, since this requires sending txs through an account entrypoint, it breaks for deploying accounts, so we keep the old deployment flow around for those scenarios under the `LegacyDeployMethod` name. We'll need to tweak account contract constructors so they can be initialized and optionally deployed in a single direct call, and also pay fees for it. Alternatively, we could add a canonical `MultiCall` contract to take care of this, though fee payment is unclear.

This PR also changes the `ClassRegisterer` API so that bytecode is provided via a capsule instead of an arg, since having args of size over ~2k elements was causing a loop in the Hasher that massively increased build times (see noir-lang/noir#4395).

**Breaking changes:** Since deployments are now done from an account, deploying a contract via `Contract.deploy` now requires a `Wallet` instance instead of a `PXE`, and deploying a contract via the CLI now requires a `--private-key`.

---------

Co-authored-by: PhilWindle <[email protected]>
We optimized the implementation of these slice operations recently. You could, for example, rewrite the example to make use of `as_slice` and `push_front`:

```
fn main(expected: pub Field, first: Field, input: [Field; 8192]) {
    let mut hasher_slice = input.as_slice();
    hasher_slice = hasher_slice.push_front(first);
    assert(hasher_slice[0] == expected);
}
```

This takes 0.644 seconds for me. For the Hasher, you could also replace

```
pub fn add_multiple<N>(&mut self, fields: [Field; N]) {
    for i in 0..N {
        self.fields = self.fields.push_back(fields[i]);
    }
}
```

with the existing `append`:

```
pub fn add_multiple<N>(&mut self, fields: [Field; N]) {
    self.fields = self.fields.append(fields.as_slice());
}
```

This won't give you a speedup yet, but it should once we eventually optimize `append`. |
Execute was ~0.9s for me as well using this update |
@jfecher is anyone working on this bug? I appreciate that a workaround exists for @spalladino , but other developers will also encounter this issue. |
@zac-williamson I spoke with @vezenovm about adding a builtin for |
@jfecher I've just tried the approach you suggested (apologies for the delay!) in this branch in aztec-packages, but I'm running into a |
@spalladino what version of the compiler are you using? There was a bug that was fixed a couple days ago related to |
@jfecher no luck, same error as before (though the index has moved a little). I've pushed the latest to the same branch as before: AztecProtocol/aztec-packages#5410.
|
Another issue. On master, there's been a large regression. The compiler now panics for the original code example with:
I think the issue stems from the slice created by the … In the SSA above, the …

Edit: This is due to a bug in the array/slice split.

Edit 2: I think the array/slice split just exposed this bug rather than introduced it. It looks like the coercion code hasn't been changed in quite a while. Before the split, the array in this issue would just be inferred to be a slice instead of being coerced into one. |
Probably can be resolved with an improved type check. I am looking at resolving the initial bug but I can take a look at this one as well. |
@vezenovm I fixed it and am putting up a PR for it now after I make a new issue. It looks like the array to slice coercion has always been broken, which makes me question how it ever worked. |
I don't think we ever allowed empty arrays before. So if you did |
We allowed empty arrays, but arrays and slices were polymorphic before, so in that case it'd just be inferred to be a slice instead of coerced into one. |
@spalladino may be worth testing again after #4639 is merged |
He didn't use the |
@vezenovm I was assuming there still may be a |
That PR doesn't use the macro which hashes the args. I have a fix. |
# Description

## Problem\*

Resolves #4395 (comment)

## Summary\*

We were not accurately accounting for situations where the slice capacity tracker was expecting a capacity from slice intrinsic results. I have added a case for handling the slice capacity of `AsSlice`.

I also tested this against the `aztec-packages` work and we can successfully compile now.

## Additional Context

## Documentation\*

Check one:
- [ ] No documentation needed.
- [ ] Documentation included in this PR.
- [ ] **[For Experimental Features]** Documentation to be submitted in a separate PR.

# PR Checklist\*

- [ ] I have tested the changes locally.
- [ ] I have formatted the changes with [Prettier](https://prettier.io/) and/or `cargo fmt` on default settings.
@jfecher @vezenovm can confirm compile time has gone down from 7 minutes to just 30 seconds.

However, given a recent increase in max bytecode size, we now have to increase the args-hash chunk constants. This means going from hashing 4k field elements to hashing 16k, and that causes compile times to spike again, going from 30s to 5 minutes. Is this expected, or is there something further we can do to improve those times? |
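For reference, the element counts line up with the chunking constants used in the reproduction later in this thread; treating the bumped values as 128 is an assumption about the aztec-packages change, shown here only to make the arithmetic explicit:

```
// Assumed before/after chunking constants and resulting hashed element counts:
// 64  * 64  = 4096  field elements (~4k)
// 128 * 128 = 16384 field elements (~16k)
global ARGS_HASH_CHUNK_COUNT = 128;
global ARGS_HASH_CHUNK_LENGTH = 128;
```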
Commenting out the pedersen_hash inside the loop does not speed things up, so it's the copying that's taking up time. |
@vezenovm possibly. When investigating #4621 further for the hashmap use case, though, I found that it didn't actually apply there - the last_uses check already applied to every array set and made them all mutable even with dynamic indices. I think I'll push a PR from a branch I made which turns the last_use analysis into an actual SSA pass, to make it clearer for debugging which instructions it applies to. I'm also not sure if adding reference counting to ACIR is plausible in general, or if we'd need a static analysis like the last_use we have now. If there were a minimal repro for the aztec-packages PR, we could test to see how many array sets are made mutable. |
This looks to just be an issue when flattening with such a large loop bound. I made the following vanilla Noir program:

```
use dep::std::hash::pedersen_hash;

global ARGS_HASH_CHUNK_COUNT = 64;
global ARGS_HASH_CHUNK_LENGTH = 64;

fn main(expected: pub Field, first: Field, input: [Field; 20]) {
    let mut hasher_slice = input.as_slice();
    hasher_slice = hasher_slice.push_front(first);
    assert(hasher_slice[0] == expected);
    let hash = hash_args_slice(hasher_slice);
    println(hash);
}

fn hash_args_slice(args: [Field]) -> Field {
    if args.len() == 0 {
        0
    } else {
        let mut chunks_hashes = [0; ARGS_HASH_CHUNK_COUNT];
        for i in 0..ARGS_HASH_CHUNK_COUNT {
            let mut chunk_hash = 0;
            let start_chunk_index = i * ARGS_HASH_CHUNK_LENGTH;
            if start_chunk_index < args.len() {
                let mut chunk_args = [0; ARGS_HASH_CHUNK_LENGTH];
                for j in 0..ARGS_HASH_CHUNK_LENGTH {
                    let item_index = i * ARGS_HASH_CHUNK_LENGTH + j;
                    if item_index < args.len() {
                        chunk_args[j] = args[item_index];
                    }
                }
                chunk_hash = pedersen_hash(chunk_args);
            }
            chunks_hashes[i] = chunk_hash;
        }
        pedersen_hash(chunks_hashes)
    }
}
```

Running this with the loop bound constants above was fine, but after switching them to 128 the compile time jumped to ~2 minutes.

I also made an example without any if statements in the loop:

```
fn hash_args_slice_no_if(args: [Field]) -> Field {
    if args.len() == 0 {
        0
    } else {
        let mut chunks_hashes = [0; ARGS_HASH_CHUNK_COUNT];
        for i in 0..ARGS_HASH_CHUNK_COUNT {
            let mut chunk_hash = 0;
            let mut chunk_args = [0; ARGS_HASH_CHUNK_LENGTH];
            for j in 0..ARGS_HASH_CHUNK_LENGTH {
                let item_index = i * ARGS_HASH_CHUNK_LENGTH + j;
                chunk_args[j] = args[item_index];
            }
            chunk_hash = pedersen_hash(chunk_args);
            chunks_hashes[i] = chunk_hash;
        }
        pedersen_hash(chunks_hashes)
    }
}
```

I then made … |
It would be good to add some benchmarking on certain tests to see how much time we are spending on each compilation pass. |
It worries me if the reason for this slowdown is just a large nested loop, since that isn't really fixable. We can always try to optimize these passes further, but even if we got the entire compiler running in half the time, that would just mean we compile as slowly for a loop bound 2x higher. |
I did some basic timing using
And then loop bounds of 128:
|
I'm assuming you changed both loop bounds to 128. Can you change only 1 so that we're comparing |
Changing just
Changing just
|
Looks like both are growing faster than linearly. mem2reg and flattening take up the most time, with mem2reg growing a bit faster. Generally the passes take longer as the SSA grows towards the end after loop unrolling and flattening. Is loop unrolling included in another pass or just excluded from these timings? |
Ah, good catch. I was just printing inside of |
I did it quickly and loop unrolling is only ~200 ms in both cases above. |
Hold on, so the main culprit is the if statements? Given I know the size of the args array at compile time, do you think I can somehow work around this? If I changed the input from a slice to a |
Yeah they should be. As you do know the size of the args array at compile time, another option is we could add that builtin |
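For illustration, here is a rough sketch of the fixed-size-array variant being discussed, reusing the `pedersen_hash` import and chunk constants from the program earlier in the thread. This is not code from the thread: the function name and signature are assumptions, and the comment about the branches reflects the remark above that they should be optimized out once the argument size is known at compile time.

```
// Hypothetical variant of hash_args that takes a fixed-size array instead of a slice.
fn hash_args_fixed<N>(args: [Field; N]) -> Field {
    let mut chunks_hashes = [0; ARGS_HASH_CHUNK_COUNT];
    for i in 0..ARGS_HASH_CHUNK_COUNT {
        let mut chunk_hash = 0;
        let start_chunk_index = i * ARGS_HASH_CHUNK_LENGTH;
        if start_chunk_index < args.len() {
            let mut chunk_args = [0; ARGS_HASH_CHUNK_LENGTH];
            for j in 0..ARGS_HASH_CHUNK_LENGTH {
                let item_index = start_chunk_index + j;
                // args.len() is a compile-time constant for a fixed-size array, so
                // (per the comment above) these bounds checks should be resolved
                // during unrolling rather than flattened as dynamic conditions.
                if item_index < args.len() {
                    chunk_args[j] = args[item_index];
                }
            }
            chunk_hash = pedersen_hash(chunk_args);
        }
        chunks_hashes[i] = chunk_hash;
    }
    pedersen_hash(chunks_hashes)
}
```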
Just to be clear, the issue is not the user-friendliness of the syntax, but the increase in the compilation time. |
Aim
Given a large enough array (about 8k elements), copying it into another array by iteratively calling `push_back` increases compile time to several minutes (over 5 minutes). This happens when declaring a very large array as an argument for an Aztec contract, since it gets passed into the Hasher, which copies elements into its own array using `push_back`.

https://github.com/AztecProtocol/aztec-packages/blob/f14474571210a46e7159cb9d2f0bc9374a837d3d/noir-projects/aztec-nr/aztec/src/hasher.nr#L22-L26
Expected Behavior
Compile time should not blow up.
Bug
The following vanilla Noir snippet reproduces the issue:
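The exact snippet is not reproduced here; a minimal sketch of the pattern it describes, copying a large array into a slice one element at a time with `push_back`, might look like the following (the names and exact sizes are illustrative):

```
fn main(expected: pub Field, input: [Field; 8192]) {
    // Copy the array into a slice one element at a time; with ~8k elements
    // this push_back loop is what blows up compile time.
    let mut fields = [input[0]].as_slice();
    for i in 1..8192 {
        fields = fields.push_back(input[i]);
    }
    assert(fields[0] == expected);
}
```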
To Reproduce
No response
Project Impact
None
Impact Context
This makes it very annoying to add large arrays as arguments to Aztec contracts.
Workaround
None
Workaround Description
We can work around it by injecting the array through an oracle call (a sketch of that approach follows below). Still, if there is an equivalent operation to `push_back` that would yield better results, it'd be most welcome!
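A minimal sketch of the oracle-based workaround described above; the oracle name, signature, and the way the host resolves the call are assumptions for illustration, not the actual aztec-packages code:

```
// Hypothetical oracle that lets the host inject the large array directly,
// instead of passing it as an argument that gets copied with push_back.
#[oracle(get_large_array)]
unconstrained fn get_large_array_oracle() -> [Field; 8192] {}

unconstrained fn get_large_array() -> [Field; 8192] {
    get_large_array_oracle()
}

fn main(expected: pub Field) {
    // The returned values are witnesses; constrain whatever properties are needed.
    let input = get_large_array();
    assert(input[0] == expected);
}
```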
Additional Context
No response
Installation Method
Binary (`noirup` default)
Nargo Version
0.23.0+5be9f9d7e2f39ca228df10e5a530474af0331704
NoirJS Version
No response
Would you like to submit a PR for this Issue?
None
Support Needs
No response