Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[execution-pool] BlockRetrievalRequest struct -> enum #15812

Merged
merged 4 commits into from
Jan 28, 2025

Conversation

hariria
Copy link
Contributor

@hariria hariria commented Jan 25, 2025

Description

Deprecates the BlockRetrievalRequest struct in favor of a BlockRetrievalRequest enum. This is needed as execution pool plans to make BlockRetrievalRequests that include epoch and round information instead of target_block_id.

The plan is to phase this in three releases. The first release will introduce the new BlockRetrievalRequest enum and all corresponding structs / logic that use it. The second release will deprecate the BlockRetrievalRequestV1 struct and all related structs / logic.

Important

Some TODOs around deprecation are already marked in the code. Additional TODO in future PRs:

  • process_block_retrieval_v2 to be implemented in a future PR cc @bchocho @hariria
  • Once the new BlockRetrievalRequest enum is in the mainnet release, request_block should send ConsensusMsg::BlockRetrievalRequest instead of ConsensusMsg::DeprecatedBlockRetrievalRequest

Additional Questions

  • Why does consensus.yaml require TUPLEARRAY instead of TUPLE?

How Has This Been Tested?

@hariria hariria requested a review from bchocho January 25, 2025 03:59
Copy link

trunk-io bot commented Jan 25, 2025

⏱️ 5h 6m total CI duration on this PR
Slowest 15 Jobs Cumulative Duration Recent Runs
execution-performance / single-node-performance 2h 30m 🟩🟩🟥🟩🟩 (+2 more)
execution-performance / test-target-determinator 30m 🟩🟩🟩🟩🟩 (+1 more)
check-dynamic-deps 28m 🟩🟩🟩🟩🟩 (+6 more)
test-target-determinator 26m 🟩🟩🟩🟩 (+1 more)
rust-cargo-deny 18m 🟩🟩🟩🟩 (+6 more)
fetch-last-released-docker-image-tag 9m 🟩🟩🟩🟩 (+1 more)
rust-doc-tests 6m 🟩
rust-doc-tests 6m 🟩
rust-doc-tests 6m 🟩
rust-doc-tests 6m 🟩
general-lints 6m 🟩🟩🟩🟩 (+6 more)
rust-doc-tests 6m 🟩
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)

settingsfeedbackdocs ⋅ learn more about trunk.io

@hariria hariria self-assigned this Jan 25, 2025
@hariria hariria force-pushed the andrew/exec-pool-block-retrieval-request branch 2 times, most recently from d98225a to be5e4c3 Compare January 25, 2025 04:19
@hariria hariria force-pushed the andrew/exec-pool-block-retrieval-request branch from be5e4c3 to 2c36388 Compare January 25, 2025 20:13
@hariria hariria marked this pull request as ready for review January 25, 2025 20:23
@hariria hariria changed the title [execution-pool] BlockRetrievalRequest deprecation [execution-pool] BlockRetrievalRequest struct -> enum Jan 25, 2025
pub struct BlockRetrievalRequestV2 {
block_id: HashValue,
num_blocks: u64,
// TODO: remove the Option, if it's not too painful
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, two things I now remember thinking to clean up:

  1. Now that we're separating out the logic, can we remove the Option? It doesn't make sense to have a V2 request without the target
  2. Is epoch needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Sounds good. For posterity in case we come back to this, the option and (epoch, round) was included from the original PR
  2. Will remove epoch and Option

@hariria hariria force-pushed the andrew/exec-pool-block-retrieval-request branch from 2c36388 to 902382e Compare January 27, 2025 19:16
@hariria hariria requested a review from bchocho January 27, 2025 19:16
BlockRetrievalRequest::V1(v1) => v1,
// TODO @bchocho @hariria implement after all nodes upgrade to release with enum BlockRetrievalRequest (not struct)
BlockRetrievalRequest::V2(_) => {
unimplemented!("Should not have received a BlockRetrievalRequestV2...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to log error with sender.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to

bail!("Should not have received a BlockRetrievalRequestV2 {} from peer with id {}", v2, peer_id);

BlockRetrieval(IncomingBlockRetrievalRequest),
/// NOTE: This is being phased out in two releases to accommodate `IncomingBlockRetrievalRequestV2`
/// TODO @bchocho @hariria can remove after all nodes upgrade to release with enum BlockRetrievalRequest (not struct)
DeprecatedBlockRetrieval(IncomingBlockRetrievalRequest),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should append V0 to IncomingBlockRetrievalRequest and make it IncomingBlockRetrievalRequestV0. That way we can keep the new variant as BlockRetrieval instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to

#[derive(Debug)]
pub enum IncomingRpcRequest {
    /// NOTE: This is being phased out in two releases to accommodate `IncomingBlockRetrievalRequestV2`
    /// TODO @bchocho @hariria can remove after all nodes upgrade to release with enum BlockRetrievalRequest (not struct)
    DeprecatedBlockRetrieval(DeprecatedIncomingBlockRetrievalRequest),
    BatchRetrieval(IncomingBatchRetrievalRequest),
    DAGRequest(IncomingDAGRequest),
    CommitRequest(IncomingCommitRequest),
    RandGenRequest(IncomingRandGenRequest),
    BlockRetrieval(IncomingBlockRetrievalRequest),
}

@@ -83,14 +86,17 @@ pub enum ConsensusMsg {
OrderVoteMsg(Box<OrderVoteMsg>),
/// RoundTimeoutMsg is broadcasted by a validator once it decides to timeout the current round.
RoundTimeoutMsg(Box<RoundTimeoutMsg>),
/// RPC to get a chain of block of the given length starting from the given block id, using epoch and round.
BlockRetrievalRequestV2(Box<BlockRetrievalRequest>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as for RPC request. V0 for the original variant and drop the V2 for the new variant?

Copy link
Contributor Author

@hariria hariria Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to

/// Network type for consensus
#[derive(Clone, Debug, Deserialize, Serialize)]
pub enum ConsensusMsg {
    /// DEPRECATED: Please use [`ConsensusMsg::BlockRetrievalRequest`](ConsensusMsg::BlockRetrievalRequest) going forward
    /// This variant was renamed from `BlockRetrievalRequest` to `DeprecatedBlockRetrievalRequest`
    /// RPC to get a chain of block of the given length starting from the given block id.
    /// Note: Naming here is `BlockRetrievalRequest` and not `DeprecatedBlockRetrievalRequest`
    /// to be consistent with the naming implementation in [`ConsensusMsg::name`](ConsensusMsg::name)
    DeprecatedBlockRetrievalRequest(Box<BlockRetrievalRequestV1>),
    ...
    /// RPC to get a chain of block of the given length starting from the given block id, using epoch and round.
    BlockRetrievalRequest(Box<BlockRetrievalRequest>),
}

@ibalajiarun ibalajiarun added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Jan 27, 2025

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@hariria hariria force-pushed the andrew/exec-pool-block-retrieval-request branch from d3516c9 to ca4499f Compare January 27, 2025 21:59

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@hariria hariria requested a review from ibalajiarun January 27, 2025 22:28
Copy link
Contributor

@ibalajiarun ibalajiarun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, let's implement process_block_retrieval_v2 in the next PR.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

continue;
}
if let Err(e) = monitor!(
"process_block_retrieval",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to process_block_retrieval_v2 so we can see the change in metrics while rolling out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@hariria hariria force-pushed the andrew/exec-pool-block-retrieval-request branch from 8389258 to 6bd46ac Compare January 28, 2025 18:27
@hariria hariria enabled auto-merge (squash) January 28, 2025 18:28

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on 55909aa300b714e37df1113a239332c334cb2896 ==> 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23

Compatibility test results for 55909aa300b714e37df1113a239332c334cb2896 ==> 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23 (PR)
1. Check liveness of validators at old version: 55909aa300b714e37df1113a239332c334cb2896
compatibility::simple-validator-upgrade::liveness-check : committed: 12240.24 txn/s, latency: 2590.93 ms, (p50: 2600 ms, p70: 2800, p90: 3100 ms, p99: 4200 ms), latency samples: 399220
2. Upgrading first Validator to new version: 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 4352.44 txn/s, latency: 7165.75 ms, (p50: 8000 ms, p70: 8500, p90: 8800 ms, p99: 9100 ms), latency samples: 92640
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 4409.31 txn/s, latency: 7700.57 ms, (p50: 8600 ms, p70: 8700, p90: 9100 ms, p99: 9100 ms), latency samples: 155420
3. Upgrading rest of first batch to new version: 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 3766.22 txn/s, latency: 8072.08 ms, (p50: 8600 ms, p70: 9600, p90: 10300 ms, p99: 10500 ms), latency samples: 82680
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 3732.33 txn/s, latency: 9205.33 ms, (p50: 10000 ms, p70: 10200, p90: 10500 ms, p99: 10600 ms), latency samples: 133700
4. upgrading second batch to new version: 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 7701.94 txn/s, latency: 3948.86 ms, (p50: 4600 ms, p70: 4800, p90: 5000 ms, p99: 5300 ms), latency samples: 143820
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 7687.54 txn/s, latency: 4463.75 ms, (p50: 4900 ms, p70: 5000, p90: 5300 ms, p99: 5400 ms), latency samples: 256960
5. check swarm health
Compatibility test for 55909aa300b714e37df1113a239332c334cb2896 ==> 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23 passed
Test Ok

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 6bd46ace468ed89d8838c61b1aa7bfa1cb2f8a23

two traffics test: inner traffic : committed: 14808.21 txn/s, latency: 2675.19 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3900 ms), latency samples: 5630420
two traffics test : committed: 99.99 txn/s, latency: 1406.88 ms, (p50: 1400 ms, p70: 1400, p90: 1600 ms, p99: 2700 ms), latency samples: 1780
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.452, avg: 1.414", "ConsensusProposalToOrdered: max: 0.298, avg: 0.288", "ConsensusOrderedToCommit: max: 0.410, avg: 0.401", "ConsensusProposalToCommit: max: 0.699, avg: 0.689"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.52s no progress at version 66242 (avg 0.19s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.65s no progress at version 5905605 (avg 0.64s) [limit 16].
Test Ok

@hariria hariria merged commit 4e1eb2d into main Jan 28, 2025
67 of 89 checks passed
@hariria hariria deleted the andrew/exec-pool-block-retrieval-request branch January 28, 2025 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants