-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: SHA256 #1022
feat: SHA256 #1022
Conversation
INT-1748 SHA2-256 Vm Hasher Air
Integration of sha256 air into VM with interactions to read/write from memory that supports variable length byte inputs. This will not use integration API, but can follow the structure of the INT-1726 SHA2-256 Air
Motivation
Sha2 is a very common hash function that is required in many forms of signature verification and signing protocols. ImplementationUltimately, we will want a hasher chip similar to However we begin with a BackgroundSHA256 hashes variable length arrays of bits, but it does it by processing 512=16*32 bits (64 bytes) blocks at a time. For domain separation of variable length arrays, it always does a special padding after the array to both domain separate and make the length a multiple of 512 bits. (There are some arguments that the padding is unnecessary for the internal node hashes of a Merkle tree as an additional optimization, but we will stick with the version where all hashes must be padded to conform to existing rust implementations. For reference, the hash without padding is referred to as copy_from_slice(); whereas the padded one is CryptographicHasher.) AIRWe want an AIR (or set of AIRs) that can do the SHA256 hash with padding — this is commonly split into at least two AIRs, one that does it without padding (the For now we do not care about separating the sha256 halo2 referenceWe want to implement an AIR resembling this halo2 implementation of sha256 which handles the compress and padding together. This reference implementation requires 72 rows for each 512-bit block (SHA256 has 64 rounds per block).
The main distinction between halo2 and AIRs is that in halo2, you can have constraints where the local row references a row which is a fixed plonky3We still want one row for each of the 64 rounds, but round
Verdict: We can try Option 1 unless there are unforeseen issues. We may need an implementation using even fewer columns for the final STARK aggregation, but that will be scoped separately. ReferenceThe pseudocode in the wikipedia article is quite good as a reference: https://en.wikipedia.org/wiki/SHA-2#Pseudocode For reference, the analog for keccak256 hash (SHA3) is:
sp1's SHA256 AIR implementation is here: https://github.com/succinctlabs/sp1/tree/main/core/src/syscall/precompiles/sha256 |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
37915d6
to
61e6fd1
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
18f598e
to
d8151df
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
4e97f14
to
69cb843
Compare
This comment has been minimized.
This comment has been minimized.
self.eval_reads(builder); | ||
self.eval_last_row(builder); | ||
|
||
self.sha256_subair.eval(builder, SHA256VM_CONTROL_WIDTH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking maybe we would want to make a SubInteractionBuilder
trait like we have a SubAirBuilder
trait. It would this part cleaner but I'm not sure if SubInteractionBuilder
actually makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This trait doesn't allow the use of interactions inside the subair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah all we need is to implement InteractionBuilder
on SubAirBuilder
. SubAirBuilder
isn't a trait, it's just a struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished reviewing just sha256-air (not the VM part). The AIR constraints look correct and sound - I just have some comments & suggestions there to improve readability and make it easier to audit.
Two major requests:
- The sha256-air crate has no standalone tests. All tests are in the VM part. It's best to have some tests in this crate to both show usage of the API and also so that we can unit test just this AIR without reliance on the VM part, which is important to catch potential errors in future refactors.
- The trace generation needs to be refactored / slightly modified to ensure that we can easily parallelize trace generation across blocks. It is important that we don't need to single thread the trace generation of the entire matrix. It doesn't seem too big a change; you can handle padding blocks separately (but also potentially parallelized) and if necessary at the end modify the boundary row between non-padding and padding.
Let's also remember to add this to the book. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Benchmarks
Commit: 9babcc7 |
Resolves INT-1726
Resolves INT-1748
Resolves INT-2580
This PR implements the
Sha256
extension andSha256-Air
primitive.Using interactions to constrain the message schedule turned out to be too inefficient (doing 34 interactions per row). Current implementation uses 470 columns x 17 rows = 7990 cells per block and does 17 interactions every row. I believe the chosen approach was indeed the best one among the 4 discussed options in the linear ticket.
The first 16 rows are round rows and the last one is a digest row. Reading the message happens in the first 4 rows of every block. Final hash computing happens on the 17th (digest) row of every block. Register reads and output write happens on the very last row of every message. The subair utilizes interactions with itself to constrain some values that are 'passed' from a digest row to the next digest row. The padding is completely handled by the Vm chip.
To compare with other implementations: halo2 uses about 7200 cells per block but they are able to access any row from local and not just the next. Also, they are allowing constraints up to degree 4. Considering the additional limitations our implementation does pretty well compared to the halo2's implementation.
Possible optimizations: It is possible to reduce the number of interactions by 8 per row by adding 16 columns (work var carries). I am not sure if this is worth doing after
GKR
changes?Copilot generated:
This pull request introduces significant updates to the
Cargo.toml
files and theEncoder
struct, as well as the addition of a new SHA256 implementation. The most important changes include adding new SHA256-related dependencies and modules, modifying theEncoder
struct to include a new field, and implementing the SHA256 AIR (Algebraic Intermediate Representation) components.Dependency and Module Additions:
Cargo.toml
: Added new SHA256-related dependencies and modules, includingopenvm-sha256-air
,openvm-sha256-circuit
,openvm-sha256-transpiler
, andopenvm-sha256-guest
. [1] [2] [3] [4]Encoder Struct Modifications:
crates/circuits/primitives/src/encoder/mod.rs
: Added a new boolean fieldreserve_invalid
to theEncoder
struct and updated the constructor and methods to handle this new field. [1] [2] [3] [4]SHA256 AIR Implementation:
crates/circuits/sha256-air/
: Added a new module for SHA256 AIR implementation, includingCargo.toml
,lib.rs
,columns.rs
,mod.rs
, andutils.rs
files. This includes defining the structure and functions necessary for the SHA256 AIR. [1] [2] [3] [4] [5]SDK Configuration Updates:
crates/sdk/src/config/global.rs
: Updated the SDK configuration to include SHA256 components in theSdkVmConfig
,SdkVmConfigExecutor
, andSdkVmConfigPeriphery
enums, and modified theimpl SdkVmConfig
to support SHA256 transpiler extension. [1] [2] [3] [4] [5]