feat: SHA256 #1022

Merged
merged 31 commits into from
Dec 31, 2024
Conversation

arayikhalatyan
Contributor

@arayikhalatyan arayikhalatyan commented Dec 12, 2024

Resolves INT-1726
Resolves INT-1748
Resolves INT-2580

This PR implements the Sha256 extension and Sha256-Air primitive.
Using interactions to constrain the message schedule turned out to be too inefficient (34 interactions per row). The current implementation uses 470 columns x 17 rows = 7990 cells per block and performs 17 interactions on every row. I believe this approach is the best of the 4 options discussed in the Linear ticket.
The first 16 rows of each block are round rows and the last one is a digest row. The message is read in the first 4 rows of every block. The final hash is computed on the 17th (digest) row of every block. Register reads and the output write happen on the very last row of every message. The subair uses interactions with itself to constrain values that are 'passed' from one digest row to the next. Padding is handled entirely by the Vm chip.
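To make the block layout above concrete, here is a minimal sketch of the row classification and the cell count. The constant values come from this description; the names (`RowKind`, `row_kind`, `cells_per_block`) are hypothetical and are not taken from the sha256-air crate.

```rust
// Illustrative only: the numbers are quoted from the PR description,
// the identifiers are made up for this sketch.
pub const WIDTH: usize = 470;
pub const ROUND_ROWS: usize = 16;
pub const ROWS_PER_BLOCK: usize = ROUND_ROWS + 1; // 16 round rows + 1 digest row
pub const MESSAGE_READ_ROWS: usize = 4; // the message is read in the first 4 rows

#[derive(Debug, PartialEq, Eq)]
pub enum RowKind {
    /// One of the 16 round rows; `reads_message` is true for the first 4.
    Round { reads_message: bool },
    /// The 17th row, where the final hash of the block is computed.
    Digest,
}

/// Classifies a row by its index inside a 17-row block.
pub fn row_kind(row_in_block: usize) -> RowKind {
    assert!(row_in_block < ROWS_PER_BLOCK, "row index out of range");
    if row_in_block < ROUND_ROWS {
        RowKind::Round {
            reads_message: row_in_block < MESSAGE_READ_ROWS,
        }
    } else {
        RowKind::Digest
    }
}

/// Trace cells used by one 512-bit block: 470 columns x 17 rows.
pub const fn cells_per_block() -> usize {
    WIDTH * ROWS_PER_BLOCK
}
```

With these constants, `cells_per_block()` reproduces the 7990 figure quoted above.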

For comparison, the halo2 implementation uses about 7200 cells per block, but it can access any row relative to the local row, not just the next one, and it allows constraints up to degree 4. Given these additional limitations, our implementation compares well with halo2's.

Possible optimization: the number of interactions can be reduced by 8 per row by adding 16 columns (work var carries). I am not sure whether this is worth doing after the GKR changes.


Copilot generated:
This pull request introduces significant updates to the Cargo.toml files and the Encoder struct, as well as the addition of a new SHA256 implementation. The most important changes include adding new SHA256-related dependencies and modules, modifying the Encoder struct to include a new field, and implementing the SHA256 AIR (Algebraic Intermediate Representation) components.

Dependency and Module Additions:

  • Cargo.toml: Added new SHA256-related dependencies and modules, including openvm-sha256-air, openvm-sha256-circuit, openvm-sha256-transpiler, and openvm-sha256-guest.

Encoder Struct Modifications:

SHA256 AIR Implementation:

  • crates/circuits/sha256-air/: Added a new module for SHA256 AIR implementation, including Cargo.toml, lib.rs, columns.rs, mod.rs, and utils.rs files. This includes defining the structure and functions necessary for the SHA256 AIR.

SDK Configuration Updates:

  • crates/sdk/src/config/global.rs: Updated the SDK configuration to include SHA256 components in the SdkVmConfig, SdkVmConfigExecutor, and SdkVmConfigPeriphery enums, and modified the impl SdkVmConfig to support SHA256 transpiler extension.


linear bot commented Dec 12, 2024

INT-1748 SHA2-256 Vm Hasher Air

Integration of sha256 air into VM with interactions to read/write from memory that supports variable length byte inputs.

This will not use integration API, but can follow the structure of the KeccakVmChip and use the INT-1726 as a SubAir.

INT-1726 SHA2-256 Air

Motivation

Currently in-circuit we only have the BabyBear poseidon2 hash implementation.

For use in rollup state for an indexed merkle tree, we want to use sha2-256 (henceforth referred to as sha256) since it is not field dependent and also more friendly for on-chain / non-ZK usage.

Moreover, a second motivation is that as described here there is a good chance that we may want to use sha256 as the native hash for STARKs and use it in the aggregation circuits as well.

Sha2 is a very common hash function that is required in many forms of signature verification and signing protocols.

Implementation

Ultimately, we will want a hasher chip similar to KeccakVmChip.

However, we begin with a standalone sha256-air implementation, which should be a separate crate similar to poseidon2-air.

Background

SHA256 hashes variable-length arrays of bits, but it does so by processing one 512-bit block (16 x 32 bits = 64 bytes) at a time. For domain separation of variable-length arrays, it always appends a special padding after the array, both to domain separate and to make the length a multiple of 512 bits. (There are some arguments that the padding is unnecessary for the internal node hashes of a Merkle tree as an additional optimization, but we will stick with the version where all hashes must be padded to conform to existing rust implementations. For reference, the hash without padding is referred to as copy_from_slice(); whereas the padded one is CryptographicHasher.)
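As a reference for the padding rule described above, here is a minimal sketch of the standard SHA-256 padding (append the byte 0x80, zero-fill until the length is 56 mod 64, then append the message length in bits as a 64-bit big-endian integer). The function name `sha256_pad` is made up for illustration and is not an API from this repository.

```rust
/// Returns `message` followed by the standard SHA-256 padding, so that the
/// result is a whole number of 64-byte (512-bit) blocks.
/// Hypothetical helper for illustration, not part of any crate in this PR.
pub fn sha256_pad(message: &[u8]) -> Vec<u8> {
    // The length field counts bits and is taken modulo 2^64.
    let bit_len = (message.len() as u64).wrapping_mul(8);

    let mut padded = message.to_vec();
    // A single 1 bit followed by seven 0 bits.
    padded.push(0x80);
    // Zero-fill until exactly 8 bytes remain in the current block.
    while padded.len() % 64 != 56 {
        padded.push(0x00);
    }
    // 64-bit big-endian bit length closes the last block.
    padded.extend_from_slice(&bit_len.to_be_bytes());
    padded
}
```

Note that a message of 56 to 64 bytes needs a second block, because the 0x80 byte and the 8-byte length no longer fit in the first one.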

AIR

We want an AIR (or set of AIRs) that can do the SHA256 hash with padding — this is commonly split into at least two AIRs, one that does it without padding (the compress part) and one that handles padding.

For now we do not care about separating the sha256 compress from padding and we just want something that can handle the variable length hash with padding. (We may want to refactor later to have a standalone compress for Merkle tree implementations as mentioned above.)

halo2 reference

We want to implement an AIR resembling this halo2 implementation of sha256 which handles the compress and padding together. This reference implementation requires 72 rows for each 512-bit block (SHA256 has 64 rounds per block).

The main distinction between halo2 and AIRs is that in halo2, you can have constraints where the local row references a row which is a fixed i rows before or after. In AIRs, currently the local row can only reference the next row 1 row after.

plonky3

We still want one row for each of the 64 rounds, but round i will need access to w[i-15], w[i-2], w[i-16], w[i-7]. Because AIRs can only reference the next row, there are (at least) two ways to address this (these are just suggestions; if you see a better way, please propose it!):

  • Option 1: in round i, store a sliding window of 16 32-bit words w[i-16..i-1] (stored as 32 16-bit field elements). Only bit decompose w[i-15] and w[i-2] for the purpose of rightrotate (with additional columns).
    • Use the subair pattern for bit decomposition and rightrotate — these can be put in chips crate (perhaps in a new module bits)
  • Option 2: use interactions between the Air and itself to send round i-15 to round i, etc.
    • Overly complicated, and since we need to send i-15, i-2, i-7 the cost overhead of interactions is close to 32 columns anyways.

Verdict: We can try Option 1 unless there are unforeseen issues.
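For reference, the sliding-window idea in Option 1 can be sketched outside the circuit as follows. This follows the message-schedule recurrence from the Wikipedia pseudocode, w[i] = w[i-16] + s0(w[i-15]) + w[i-7] + s1(w[i-2]), and uses plain `u32` words rather than the 16-bit field-element limbs suggested above. The function names are hypothetical.

```rust
/// s0 from the SHA-256 message schedule.
fn small_sigma0(x: u32) -> u32 {
    x.rotate_right(7) ^ x.rotate_right(18) ^ (x >> 3)
}

/// s1 from the SHA-256 message schedule.
fn small_sigma1(x: u32) -> u32 {
    x.rotate_right(17) ^ x.rotate_right(19) ^ (x >> 10)
}

/// Computes w[i] from a window holding w[i-16..=i-1], i.e. window[k] = w[i-16+k].
pub fn next_schedule_word(window: &[u32; 16]) -> u32 {
    window[0] // w[i-16]
        .wrapping_add(small_sigma0(window[1])) // w[i-15]
        .wrapping_add(window[9]) // w[i-7]
        .wrapping_add(small_sigma1(window[14])) // w[i-2]
}

/// Expands one 64-byte block into the 64 schedule words while only ever
/// looking at the 16-word window, the way a next-row-only AIR would.
pub fn message_schedule(block: &[u8; 64]) -> [u32; 64] {
    let mut window = [0u32; 16];
    for (k, chunk) in block.chunks_exact(4).enumerate() {
        window[k] = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }

    let mut w = [0u32; 64];
    w[..16].copy_from_slice(&window);
    for i in 16..64 {
        let next = next_schedule_word(&window);
        w[i] = next;
        // Slide the window forward by one word.
        window.rotate_left(1);
        window[15] = next;
    }
    w
}
```

Only `window[1]` and `window[14]` go through the rotations, which matches the note above that only w[i-15] and w[i-2] need bit decomposition.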

We may need an implementation using even fewer columns for the final STARK aggregation, but that will be scoped separately.

Reference

The pseudocode in the wikipedia article is quite good as a reference: https://en.wikipedia.org/wiki/SHA-2#Pseudocode

For reference, the analog for keccak256 hash (SHA3) is:

sp1's SHA256 AIR implementation is here: https://github.com/succinctlabs/sp1/tree/main/core/src/syscall/precompiles/sha256

INT-2580 Add sha2-256 intrinsic

transpiler and axvm::intrinsics::hashes

self.eval_reads(builder);
self.eval_last_row(builder);

self.sha256_subair.eval(builder, SHA256VM_CONTROL_WIDTH);
Contributor Author

I was thinking we might want to make a SubInteractionBuilder trait, like the SubAirBuilder trait we already have. It would make this part cleaner, but I'm not sure whether SubInteractionBuilder actually makes sense.

Contributor

Contributor Author

This trait doesn't allow the use of interactions inside the subair

Contributor

Ah all we need is to implement InteractionBuilder on SubAirBuilder. SubAirBuilder isn't a trait, it's just a struct.

Contributor

@jonathanpwang jonathanpwang left a comment

Finished reviewing just sha256-air (not the VM part). The AIR constraints look correct and sound - I just have some comments & suggestions there to improve readability and make it easier to audit.

Two major requests:

  1. The sha256-air crate has no standalone tests. All tests are in the VM part. It's best to have some tests in this crate to both show usage of the API and also so that we can unit test just this AIR without reliance on the VM part, which is important to catch potential errors in future refactors.
  2. The trace generation needs to be refactored / slightly modified to ensure that we can easily parallelize trace generation across blocks. It is important that we don't need to single thread the trace generation of the entire matrix. It doesn't seem too big a change; you can handle padding blocks separately (but also potentially parallelized) and if necessary at the end modify the boundary row between non-padding and padding.
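One possible shape for the block-parallel trace generation requested in point 2 is sketched below. It is only an illustration under several assumptions: the trace is a flat row-major `Vec<u32>`, each block owns 17 consecutive rows, the height is padded to a power of two, and `fill_block_rows` is a placeholder for the real per-block logic. None of these names come from the crate, and real code would more likely use a thread pool than one thread per block.

```rust
use std::thread;

pub const ROWS_PER_BLOCK: usize = 17;

/// Placeholder for the real per-block trace logic (hypothetical).
/// It only records the block index and the row index in the first two columns.
fn fill_block_rows(block_idx: usize, rows: &mut [u32], width: usize) {
    for (row_idx, row) in rows.chunks_exact_mut(width).enumerate() {
        row[0] = block_idx as u32;
        row[1] = row_idx as u32;
    }
}

/// Fills a row-major trace where each block is generated independently.
/// Assumption: the height is padded to a power of two and the padding rows
/// are left as zeros here; they could be filled separately, also in parallel.
pub fn generate_trace_parallel(num_blocks: usize, width: usize) -> Vec<u32> {
    assert!(width >= 2, "this sketch writes two columns per row");
    let height = (num_blocks * ROWS_PER_BLOCK).next_power_of_two();
    let mut trace = vec![0u32; height * width];

    let chunk_len = ROWS_PER_BLOCK * width;
    let used_len = num_blocks * chunk_len;

    // Each block gets a disjoint mutable slice, so no locking is needed.
    // One thread per block keeps the sketch short; it is not tuned.
    thread::scope(|s| {
        for (block_idx, rows) in trace[..used_len].chunks_exact_mut(chunk_len).enumerate() {
            s.spawn(move || fill_block_rows(block_idx, rows, width));
        }
    });

    trace
}
```

Because the slices are disjoint, the boundary row between non-padding and padding rows can still be patched in a final single-threaded step if necessary.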

Collaborator

yi-sun commented Dec 18, 2024

Let's also remember to add this to the book.

Contributor

@jonathanpwang jonathanpwang left a comment

lgtm

Benchmarks

| group | app_log_blowup | app_total_cells_used | app_total_cycles | app_total_proof_time_ms | leaf_log_blowup | leaf_total_cells_used | leaf_total_cycles | leaf_total_proof_time_ms | max_segment_length | instance | alloc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ecrecover_program | 2 | 15,230,037 | 290,016 | (-6.0 [-0.3%]) 2,391.0 | - | - | - | - | 1048476 | 64cpu-linux-arm64 | mimalloc |
| fibonacci_program | 2 | 51,505,102 | 1,500,137 | (+22.0 [+0.4%]) 5,498.0 | - | - | - | - | 1048476 | 64cpu-linux-arm64 | mimalloc |
| regex_program | 2 | 165,028,173 | 4,190,904 | (+54.0 [+0.3%]) 15,866.0 | - | - | - | - | 1048476 | 64cpu-linux-arm64 | mimalloc |
| verify_fibair | 2 | (+250 [+0.0%]) 8,121,424 | (+4 [+0.0%]) 195,361 | (-22.0 [-1.5%]) 1,430.0 | - | - | - | - | 1048476 | 64cpu-linux-arm64 | mimalloc |

Commit: 9babcc7

Benchmark Workflow

@jonathanpwang jonathanpwang merged commit 48ecf39 into main Dec 31, 2024
19 checks passed
@jonathanpwang jonathanpwang deleted the feat/sha2 branch December 31, 2024 08:49