feat(tee): TEE Prover Gateway #2270
Signed-off-by: Harald Hoyer <[email protected]>
@haraldh @popzxc @RomanBrodetski, this PR isn't fully polished yet. It needs better error handling, less copy-pasting, and the performance optimizations mentioned below (unless you decide to refactor it completely). However, the code works and gets the job done. Before refining it further, I need some advice to make sure it's heading in the right direction. The main changes are in The idea behind this PR is best described here, and here: zksync-era/prover/prover_tee_gateway/src/main.rs Lines 52 to 70 in 372d854
I tested it locally using our SGX-enabled machines in
(notice the ~10-second time gap between State of the SQL db after running the above command:
My main concerns/doubts around my code changes:
I don't think this should be implemented as part of the prover workspace. The prover has its own reasons for its design decisions, and a bunch of technical debt to address, so there should be strong reasons for reusing its code. The abstractions used here don't seem suitable, which is indicated, for example, by the fact that you use () as the job ID, TeeProofGenerationDataRequest is an empty structure, and get_next_request always returns Some(..).
The code here looks fairly simple, so we can probably implement a stand-alone binary for it. Using the node framework will allow you to reuse some of the existing layers.
For example, you may do something like this (pseudocode):
use zksync_node_framework::task::Task;

struct TeeProver { /* some fields */ }

#[async_trait::async_trait]
impl Task for TeeProver {
    fn id(&self) -> TaskId {
        "tee_prover".into()
    }

    // `run` must be async because the body awaits; the receiver is taken as `mut`
    // so that it can be passed by mutable reference to the helpers.
    async fn run(self: Box<Self>, mut stop_receiver: StopReceiver) -> anyhow::Result<()> {
        self.register_attestation(&mut stop_receiver).await?;
        // Dereference the borrowed value to read the stop flag.
        while !*stop_receiver.0.borrow() {
            let job = self.get_job(&mut stop_receiver).await?; // Wait until the job is available.
            let output = self.verify(job, &mut stop_receiver).await?; // Generate the proof.
            self.submit_proof(output, &mut stop_receiver).await?; // Submit the proof.
        }
        Ok(())
    }
}
Then you can implement a wiring layer for it, and the resulting binary will be composed as:
fn main() -> anyhow::Result<()> {
    /* setup observability, load configs */
    ZkStackServiceBuilder::new()
        .add_layer(sigint_layer) // handle sigint
        .add_layer(prometheus_exporter_layer) // export metrics
        .add_layer(tee_prover_layer) // run the prover
        .build()?
        .run()
}
This binary will live in core/bin and IMHO it will be significantly easier to read.
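The loop in the pseudocode above can be tried out without the node framework. The following is a minimal sketch using only the standard library: the `TeeApi` trait, `MockApi`, the `verify` placeholder, and the `AtomicBool` stop flag are all hypothetical stand-ins for illustration, not types from zksync-era. It shows the shape suggested in the review: register once, then fetch, verify, and submit until a stop signal arrives, with "no job yet" modeled as `None` rather than as an always-`Some` request.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Hypothetical stand-in for the server's TEE API (not the real zksync-era interface).
trait TeeApi {
    fn register_attestation(&mut self) -> Result<(), String>;
    /// Returns `Ok(None)` when no batch is ready yet.
    fn next_job(&mut self) -> Result<Option<u32>, String>;
    fn submit_proof(&mut self, batch: u32, proof: Vec<u8>) -> Result<(), String>;
}

/// Placeholder for the enclave-side verification step.
fn verify(batch: u32) -> Vec<u8> {
    batch.to_be_bytes().to_vec()
}

/// Registers once, then processes jobs until `stop` is set or the API fails.
/// Returns the number of proofs submitted.
fn run(api: &mut impl TeeApi, stop: &AtomicBool, idle: Duration) -> Result<usize, String> {
    api.register_attestation()?;
    let mut submitted = 0;
    while !stop.load(Ordering::Relaxed) {
        match api.next_job()? {
            Some(batch) => {
                api.submit_proof(batch, verify(batch))?;
                submitted += 1;
            }
            // Nothing to do yet: back off briefly before polling again.
            None => std::thread::sleep(idle),
        }
    }
    Ok(submitted)
}

/// In-memory mock that requests a stop once its queue is drained.
struct MockApi<'a> {
    queue: Vec<u32>,
    submitted: Vec<u32>,
    registered: bool,
    stop: &'a AtomicBool,
}

impl TeeApi for MockApi<'_> {
    fn register_attestation(&mut self) -> Result<(), String> {
        self.registered = true;
        Ok(())
    }

    fn next_job(&mut self) -> Result<Option<u32>, String> {
        let job = self.queue.pop();
        if job.is_none() {
            self.stop.store(true, Ordering::Relaxed);
        }
        Ok(job)
    }

    fn submit_proof(&mut self, batch: u32, _proof: Vec<u8>) -> Result<(), String> {
        if !self.registered {
            return Err("attestation not registered".into());
        }
        self.submitted.push(batch);
        Ok(())
    }
}
```

Calling `run` with a `MockApi` whose queue holds a few batch numbers processes them in pop order and returns once the mock sets the stop flag. In a real task the flag would be replaced by the framework's stop receiver and the trait methods by HTTP calls.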
Thanks for the suggestions @popzxc! I have a quick question, as I'm not yet fluent with the node framework and I want to provide more context on how we want to run the TEE Prover Gateway. Does the node framework make it easy to build a standalone binary, decoupled from the (We need cc @haraldh
@pbeza the node framework is not tied to the server -- the last code snippet I provided is a sample for a stand-alone binary (which has nothing to do with the server).
The code changes needed for this PR are too major to continue work on it. Closing it and going with the simpler #2333 instead.
What ❔
The TEE Prover Gateway is a service component within our system infrastructure that functions as an intermediary between the TEE enclave and the server's HTTP API, introduced in commit eca98cc (#1993). It first registers the TEE attestation using the /tee/register_attestation endpoint, then regularly invokes the server's HTTP API via the /tee/proof_inputs endpoint to obtain proof-related data, and finally submits the proof through the /tee/submit_proofs/<l1_batch_number> endpoint.
Why ❔
This PR contributes to the effort outlined in the docs:
Checklist
zk fmt and zk lint.
zk spellcheck.
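The three endpoints named under "What ❔" can be sketched as a small URL helper. This is only an illustration: the `Endpoint` enum, the `url` method, the base URL, and the choice of `u32` for the batch number are assumptions; only the three paths come from the PR description.

```rust
/// The three server endpoints the gateway talks to, in the order it uses them.
/// The enum itself is hypothetical; only the paths come from the PR description.
enum Endpoint {
    RegisterAttestation,
    ProofInputs,
    /// Carries the L1 batch number that ends up in the path.
    SubmitProofs(u32),
}

impl Endpoint {
    /// Builds the full URL for this endpoint, tolerating a trailing slash in `base`.
    fn url(&self, base: &str) -> String {
        let base = base.trim_end_matches('/');
        match self {
            Endpoint::RegisterAttestation => format!("{base}/tee/register_attestation"),
            Endpoint::ProofInputs => format!("{base}/tee/proof_inputs"),
            Endpoint::SubmitProofs(batch) => format!("{base}/tee/submit_proofs/{batch}"),
        }
    }
}
```

For example, `Endpoint::SubmitProofs(42).url("http://localhost:8080/")` yields a URL ending in `/tee/submit_proofs/42`. The host and port here are placeholders.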