Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add verifying key and proving key serialization #103

Merged

Conversation

jonathanpwang
Copy link

This PR adds features for reading and writing the verifying key and proving key of any circuit from/to file.

For verifying key, this uses zcash#661 which fixes the issue with vkey write not working after selector compression.

For proving key, this uses serde derive macros and bincode.

Both vkey and pkey serializations have been tested on very large circuits (up to K = 23), including circuits created from plonk-verifier. There have been no problems even up to pkey of sizes > 100GB.

Copy link

@han0110 han0110 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall look good to me! Just with some more proper error propagation I think it's good to merge.

I feel like next we can try to serialize the ConstraintSystem with selector compressed in VerifyingKey, then we don't need to store selector values in vk and the verifier doesn't even need to know the concrete circuit but can still verify the proof.

halo2_proofs/src/helpers.rs Outdated Show resolved Hide resolved
halo2_proofs/src/poly.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk/evaluation.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
* add `plonk::Error::Serde` to pass through serialization errors
@jonathanpwang
Copy link
Author

Implemented @han0110 's suggestions.

Copy link
Member

@CPerezz CPerezz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned about introducing all of the serde stack here. Why do we go for serde rather than a simpler to/from_bytes and maybe also a side read/write in case you prefer it??

Serde comes with A BUNCH of dependencies and it forces anything that is using this as a dependency to also use serde.

I would say that since the entire crate is done without using it, I'd follow the same rationale and structure for these structs. read/write and to/from_bytes might suit better.

I say that because it's annoying to use a lib as dependency (as halo2) and need to use serde just for 2 structures in it.

Any opinions @han0110 @kilic ?

fixed_cosets: self.fixed_cosets.clone(),
permutation: self.permutation.clone(),
};
bincode::serialize_into(writer, &partial_pkey).map_err(Error::Serde)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding bincode directly here is a bit tough TBH. But I understand that you can't use serde traits because that could allow yaml or anything esentially.

As said in my review comment, I'm a bit hessitant over the serde.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously because of evaluation there was too much to implement write for directly, but since Han pointed out that we can recompute evaluation directly, it seems the only primitive necessary is to implement write for Polynomial<F, _> = Vec<F> and Vec<Polynomial<F, _>>, which seems doable.

So if you want to limit dependencies, I think it is possible to just implement some write and read directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer it and avoid serde. Unless it is extremely painful.
If any of Han or Onur agree with me, I'd suggest to change it.

@CPerezz
Copy link
Member

CPerezz commented Nov 23, 2022

Hey @jonathanpwang any news with that??

@jonathanpwang
Copy link
Author

Hey @jonathanpwang any news with that??

Ah sorry I've been a bit busy. It may take a little longer (a week or two) before I get around to this, but just confirming I will try to do the serialization without serde soon.

@CPerezz CPerezz added the enhancement New feature or request label Nov 24, 2022
@jonathanpwang
Copy link
Author

Okay I think I implemented the direct implementation of read/write by just serializing vectors in the obvious way. Also added an example for basic testing.

Copy link
Member

@CPerezz CPerezz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the work on refactoring a bit this @jonathanpwang !

Just some comments and also 2 points I want to consider.

  1. Could you also provide a to/from_bytes fn for ProverKey & VerifierKey??
  2. The example you added feels extremely large for what we need to show which is how to use serde. I'm not even unsure if an example is needed TBH. But if we want it, I think we should use the smallest possible circuit and not have any comments for it. As at the end, we want to focus the example on the serialisation, not the circuit.

halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk/permutation.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk/permutation.rs Outdated Show resolved Hide resolved
halo2_proofs/src/plonk/permutation.rs Outdated Show resolved Hide resolved
* add `pack`/`unpack` helper functions between `&[bool]` and `u8`
* made serialization example use smaller example of standard plonk for
  less code bloat
@jonathanpwang
Copy link
Author

I added to/from_bytes functions, though I was slightly lazy and just used the Read and Write traits for &[u8] and Vec<u8> respectively. For to_bytes I did optimize to avoid memory reallocation by having the vector's total capacity determined beforehand.

Added pack/unpack functions as you suggested.

Moved helper functions to helper.rs.

Made the example smaller using a standard plonk example. I think it's still good to have the example just to have basic skeleton code for people to copy. Also it's helpful for testing.

Copy link
Member

@CPerezz CPerezz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!!!! Running the workflows now!

Thanks for the work and the corrections @jonathanpwang 😄

@CPerezz CPerezz merged commit 246ce4e into privacy-scaling-explorations:main Dec 2, 2022
@jonathanpwang
Copy link
Author

I realized why reading the pkey was still kind of slow for >1GB proving keys: currently the CurveAffine implementation of to_bytes stores the point in compressed format, so when you read it you need to compute a cube and then square root to get the affine representation.

Additionally, the ff::PrimeField::to_repr implementation stores the normal base representation of the field element, and when you read it it does a Montgomery conversion back to Montgomery form.

I would prefer to just print out both CurveAffine and PrimeField in raw Montgomery form of the coordinates so that serialization/deserialization is only limited in speed by IO. This would double the size of the curve points, but the bulk of the proving key should be from the polynomials anyways. This would involve adding some traits + implementations such as to_raw_bytes and from_raw_bytes_unchecked to halo2curves.

WDYT @CPerezz @han0110 ?

@han0110
Copy link

han0110 commented Dec 6, 2022

I see, now I understand why some zkp library encodes field element in montgomery form and curve element in uncompressed form.

I think it will be cool to have such feature implemented as trait in halo2curves, since it won't be breaking changes (and it'd be nice to have both from_raw_bytes_unchecked and from_raw_bytes implemented).

As for halo2 pk/vk serialization, using from_raw_bytes_unchecked as default serialization seems a little bit scary for me, perhaps we can use from_raw_bytes as default, and implement another read_unchecked for pk/vk? Or if checking the field element in range is much cheaper than doing montgomery mul, and checking curve element is on curve is much cheaper than doing sqrt, perhaps we don't need from_raw_bytes_unchecked?

@jonathanpwang
Copy link
Author

OK I will try to implement functions like to/from_raw_bytes_*. I agree we can keep the checked versions for everything (curve points not compressed, field elements in montgomery form), since it does not affect the actual serialization format. Then users can decide how unsafe they want to be 😈 on their own.

@CPerezz
Copy link
Member

CPerezz commented Dec 8, 2022

I completelly agree with the idea @jonathanpwang @han0110 .
I've used this in arkworks a lot with CanonicalSerialize for example. Avoiding the checks while reading curve elements and giving up disk space in respect of faster reads.

Filled privacy-scaling-explorations/halo2curves#9 to track this!

@daira
Copy link

daira commented May 1, 2023

Note that zcash/zcash#661 has been updated to use a leading version byte; little-endian byte order; and extra length fields for the selectors and permutation commitments. It's therefore no longer compatible with the format in this PR. However, the leading version byte is 0x01 which will make the two formats distinguishable, whenever the number of fixed column commitments is less than $2^{24}$ (see zcash#661 (comment) ).

@CPerezz
Copy link
Member

CPerezz commented May 2, 2023

Thanks @daira for the ping! <3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants