Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate batch kernel parts from miden-node #1112

Open
wants to merge 29 commits into
base: next
Choose a base branch
from

Conversation

PhilippGackstatter
Copy link
Contributor

@PhilippGackstatter PhilippGackstatter commented Jan 28, 2025

Migrates the TransactionBatch and related types from miden-node to miden-base.

Mostly ready for review and tested.

A bunch of TODOs are left, mostly around documentation and on types that I'm not yet sure we will need in this form (NoteInclusionProof, BlockInclusionProof, ...) so I didn't want to document them yet.

Unrelated: Removes BlockNumber::from_usize which we don't need and is dangerous.

@PhilippGackstatter PhilippGackstatter force-pushed the pgackst-batch-prover-crate branch from 8eeced0 to 921c6d9 Compare January 28, 2025 15:42
Comment on lines 477 to 488
/// Test that an authenticated input note that is also created in the same batch does not error
/// and instead is marked as consumed.
/// - This requires a nullifier collision on the input and output note which is very unlikely in
/// practice.
/// - This makes the created note unspendable as its nullifier is added to the nullifier tree.
/// - The batch kernel cannot return an error in this case as it can't detect this condition due
/// to only having the nullifier for authenticated input notes _but_ not having the nullifier
/// for private output notes.
/// - We test this to ensure the kernel does something reasonable in this case and it is not an
/// attack vector.
#[test]
fn authenticated_note_created_in_same_batch() -> anyhow::Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty tricky case I want to call out. This describes the best way I can think of to handle it and the rationale, but not sure if there's a better way.

Comment on lines +15 to +19
// TODO: Should this really be a Blake3 hash? We have to compute this in the block kernel
// eventually, so we'd probably want RPO instead?
// TODO: Compute batch ID as hash over tx ID _and_ account ID.
#[derive(Debug, Copy, Clone, Eq, Ord, PartialEq, PartialOrd)]
pub struct BatchId(Blake3Digest<32>);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it was just a temporary thing to use a Blake3 hash for the Batch ID, so I have yet to update this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should switch to RPO. One potential concern is the performance of RPO vs. BLAKE3 - but I think until we are building hundreds of batches per second, the hashing cost should be negligible.

Comment on lines +5 to +7
/// A proposed batch of transactions with all necessary data to validate it.
#[derive(Debug, Clone)]
pub struct ProposedBatch {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProposedBatch and ProvenBatch sound nice together, but I wonder, for consistency with other input types like TransactionInputs or BlockInputs, whether this should actually be called BatchInputs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also try to go into the opposite direction. May work for ProposedBlock but not sure about ProposedTransaction.

@PhilippGackstatter PhilippGackstatter marked this pull request as ready for review January 28, 2025 15:49
@PhilippGackstatter
Copy link
Contributor Author

@bobbinth I think this is ready for an initial review, but I still expect to change some things (BatchId computation and verifying the proofs of ProvenTransactions in the batch). I think it might still be beneficial to get an impression of whether this is the direction you were thinking of as well.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! Not a full review, but I left some comments inline. Overall, this is pretty much how I was thinking about it as well.

Comment on lines +5 to +7
/// A proposed batch of transactions with all necessary data to validate it.
#[derive(Debug, Clone)]
pub struct ProposedBatch {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also try to go into the opposite direction. May work for ProposedBlock but not sure about ProposedTransaction.

crates/miden-objects/src/batch/proposed_batch.rs Outdated Show resolved Hide resolved
crates/miden-objects/src/batch/proposed_batch.rs Outdated Show resolved Hide resolved
Comment on lines +15 to +19
// TODO: Should this really be a Blake3 hash? We have to compute this in the block kernel
// eventually, so we'd probably want RPO instead?
// TODO: Compute batch ID as hash over tx ID _and_ account ID.
#[derive(Debug, Copy, Clone, Eq, Ord, PartialEq, PartialOrd)]
pub struct BatchId(Blake3Digest<32>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should switch to RPO. One potential concern is the performance of RPO vs. BLAKE3 - but I think until we are building hundreds of batches per second, the hashing cost should be negligible.

crates/miden-objects/src/batch/batch_id.rs Show resolved Hide resolved
pub struct ProvenBatch {
id: BatchId,
account_updates: BTreeMap<AccountId, BatchAccountUpdate>,
input_notes: Vec<InputNoteCommitment>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use InputNotes<InputNoteCommitment> here instead of Vec<InputNoteCommitment>?

crates/miden-batch-prover/src/local_batch_prover.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you! I reviewed pretty much all non-testing code and left some small comments inline.

Before merging this, would be good to create a "parallel" PR in the miden-node repo which integrates this functionality into the node.

Also, we should probably create a good set of tests for the construction of the ProposedBatch - but this is probably best done in a different PR.

Comment on lines +23 to +24
/// The header is boxed as it has a large stack size.
block_header: Box<BlockHeader>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we still need Box here after #1076? If not, maybe we should quickly do #1076 first?

Comment on lines +29 to +31
/// The note inclusion proofs for unauthenticated notes that were consumed in the batch which
/// can be authenticated.
authenticatable_unauthenticated_notes: BTreeMap<NoteId, NoteInclusionProof>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the unauthenticated_note_proofs alternative better.

Comment on lines +98 to +101
/// Returns an iterator over the block headers in this chain MMR.
pub fn block_headers_iter(&self) -> impl Iterator<Item = &BlockHeader> {
self.blocks.values()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would probably name this just block_headers().

Comment on lines +56 to +57
/// - Another transaction in the batch produces the unauthenticated input note, in which
/// case inclusion in the chain must not be proven.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would maybe write this as: "Another transaction in the batch produces an output note matching an unauthenticated input note, in which case inclusion in the chain does not need to be proven.".

Comment on lines +148 to +154
let batch_account_update = BatchAccountUpdate::new(
tx.account_id(),
tx.account_update().init_state_hash(),
tx.account_update().final_state_hash(),
vec![tx.id()],
tx.account_update().details().clone(),
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move this logic into something like BatchAccountUpdate::from_transaction()?

Comment on lines +199 to +204
// Remove all output notes from the batch output note set that are consumed by transactions.
//
// This still allows transaction `A` to consume an unauthenticated note `x` and output note
// `y` and for transaction `B` to consume an unauthenticated note `y` and output
// note `x` (i.e., have a circular dependency between transactions).
let mut output_notes = BatchOutputNoteTracker::new(transactions.iter().map(AsRef::as_ref))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also makes sure that there are no duplicate output notes, right? If so, maybe worth mentioning this in the comments?

Comment on lines +1 to +2
[package]
name = "miden-batch-prover"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should name this miden-tx-batch-prover to indicate that we are proving batches of transactions.

Comment on lines +33 to +44
anyhow = { version = "1.0", default-features = false, optional = true }
assembly = { workspace = true }
miden-crypto = { workspace = true }
miden-lib = { workspace = true, optional = true, features = ["testing"] }
miden-tx = { workspace = true }
miden-objects = { workspace = true }
miden-verifier = { workspace = true }
rand = { workspace = true, optional = true, features = ["small_rng"] }
thiserror = { workspace = true }
vm-core = { workspace = true }
vm-processor = { workspace = true }
winterfell = { version = "0.11", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we double-check if all of these are needed as dependencies or if some could go into dev-dependencies? (e.g., upon a quick look I didn't see if we are using rand in non-testing code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants