Fix round manager tests #15369

Merged
merged 5 commits into from
Dec 10, 2024

Conversation

vusirikala
Contributor

@vusirikala vusirikala commented Nov 22, 2024

Description

The round manager tests are outdated. They pass only when the broadcast_vote and order_vote flags are disabled, which does not match the mainnet settings. This PR fixes the round manager tests so that they run with these flags enabled.

How Has This Been Tested?

Key Areas to Review

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Nov 22, 2024

⏱️ 17m total CI duration on this PR
Job                        Cumulative Duration   Recent Runs
rust-move-tests            13m                   🟩
rust-cargo-deny            2m                    🟩
check-dynamic-deps         1m                    🟩
general-lints              29s                   🟩
semgrep/ci                 27s                   🟩
file_change_determinator   11s                   🟩
permission-check           4s                    🟩
permission-check           2s                    🟩

@zekun000
Contributor

zekun000 commented Dec 2, 2024

Seems some tests are failing?

@vusirikala vusirikala requested review from danielxiangzl and removed request for sasha8 December 7, 2024 00:50
@vusirikala vusirikala changed the title [Draft] Fix round manager tests Fix round manager tests Dec 7, 2024
@vusirikala vusirikala enabled auto-merge (squash) December 7, 2024 00:56

@vusirikala
Contributor Author

Seems some tests are failing?

Which tests are failing? All the Forge tests seem to pass.

@@ -317,7 +324,7 @@ impl NodeSetup {
 let (round_manager_tx, _) = aptos_channel::new(QueueStyle::LIFO, 1, None);

 let mut local_config = local_consensus_config.clone();
-local_config.enable_broadcast_vote(false);
+local_config.enable_broadcast_vote(true);
Contributor

Isn't this the default? If so, we should remove this line. That way the test will fail if someone changes the default config.

Contributor Author

Yes. The default is true. Removed this statement.
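
The point about relying on the default can be sketched roughly as follows. This is only an illustration: `LocalConsensusConfig`, its `broadcast_vote` field, and `test_node_config` are hypothetical stand-ins, not the actual types or helpers in the consensus crate.

```rust
/// Hypothetical stand-in for the consensus config used by the tests.
/// The field name is an assumption made for this sketch.
#[derive(Clone, Debug, PartialEq)]
struct LocalConsensusConfig {
    broadcast_vote: bool,
}

impl Default for LocalConsensusConfig {
    fn default() -> Self {
        // Assumed here to mirror the production default discussed above.
        Self { broadcast_vote: true }
    }
}

/// Builds the config for a test node without overriding any flag,
/// so the tests always exercise whatever the default currently is.
fn test_node_config(base: &LocalConsensusConfig) -> LocalConsensusConfig {
    base.clone()
}

/// Example check a test could make: if someone flips the default,
/// this assertion (and any test relying on it) fails instead of
/// silently running against a non-default setting.
fn assert_default_broadcasts_votes() {
    let config = test_node_config(&LocalConsensusConfig::default());
    assert!(
        config.broadcast_vote,
        "tests assume broadcast_vote is enabled by default"
    );
}
```

With an explicit override in the test setup, the tests would keep passing after a change to the default and the mismatch would go unnoticed; without the override, such a change surfaces as a test failure.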

},
_ => panic!("unexpected network message {:?}", next_message),
}
// let next_message = timed_block_on(&runtime, nodes[proposal_node].next_network_message());
Contributor

Why is this commented out? If this test is incomplete, can we add an unimplemented! here instead, so that it breaks when someone tries to run it?

Contributor Author

Yeah. Actually, the test is flaky and already has an #[ignore] attribute on top of it. The test doesn't affect Forge.

Contributor

I meant the case where someone removes the #[ignore], runs the test, and thinks it passes only because parts of the code are commented out.
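
A minimal sketch of the suggestion, assuming a simplified test body: the function and test names below are hypothetical and do not come from the round manager tests. The disabled step is replaced by `unimplemented!`, so the test panics instead of passing vacuously if the `#[ignore]` attribute is ever removed.

```rust
/// Hypothetical body of an incomplete test: the working part runs,
/// and the disabled part fails loudly instead of being commented out.
fn incomplete_test_body() {
    // Part of the test that still works (placeholder logic for this sketch).
    let votes_received = 3;
    assert_eq!(votes_received, 3);

    // The flaky check would go here. Rather than commenting it out,
    // panic explicitly so nobody mistakes a partial run for a pass.
    unimplemented!("network message check is disabled; restore it before removing #[ignore]");
}

#[test]
#[ignore = "flaky; the body is incomplete, see the unimplemented! marker"]
fn ignored_incomplete_test() {
    incomplete_test_body();
}
```

Running `cargo test` skips the ignored test as before, while `cargo test -- --ignored` (or deleting the attribute) makes it fail with the message above.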

Contributor

✅ Forge suite compat success on 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e

Compatibility test results for 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e (PR)
1. Check liveness of validators at old version: 3c6e693a27339e73520f41030dce8fc9cd504967
compatibility::simple-validator-upgrade::liveness-check : committed: 16397.91 txn/s, latency: 2055.88 ms, (p50: 1900 ms, p70: 2100, p90: 2800 ms, p99: 4800 ms), latency samples: 541280
2. Upgrading first Validator to new version: 49a09539a905d5f4d1c562d7f4a7fe040472007e
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7405.44 txn/s, latency: 3862.25 ms, (p50: 4300 ms, p70: 4600, p90: 4700 ms, p99: 4700 ms), latency samples: 137140
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7481.58 txn/s, latency: 4338.31 ms, (p50: 4700 ms, p70: 4700, p90: 4700 ms, p99: 4900 ms), latency samples: 251180
3. Upgrading rest of first batch to new version: 49a09539a905d5f4d1c562d7f4a7fe040472007e
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6821.80 txn/s, latency: 4068.14 ms, (p50: 4600 ms, p70: 4700, p90: 4900 ms, p99: 5000 ms), latency samples: 132320
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7256.57 txn/s, latency: 4482.04 ms, (p50: 4800 ms, p70: 4900, p90: 5000 ms, p99: 5200 ms), latency samples: 240360
4. upgrading second batch to new version: 49a09539a905d5f4d1c562d7f4a7fe040472007e
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10898.40 txn/s, latency: 2570.08 ms, (p50: 2800 ms, p70: 3000, p90: 3100 ms, p99: 3500 ms), latency samples: 189180
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11491.69 txn/s, latency: 2773.24 ms, (p50: 2800 ms, p70: 3100, p90: 3200 ms, p99: 3500 ms), latency samples: 374400
5. check swarm health
Compatibility test for 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e passed
Test Ok

Contributor

✅ Forge suite realistic_env_max_load success on 49a09539a905d5f4d1c562d7f4a7fe040472007e

two traffics test: inner traffic : committed: 14459.86 txn/s, latency: 2745.67 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3300 ms), latency samples: 5498040
two traffics test : committed: 100.08 txn/s, latency: 1458.88 ms, (p50: 1400 ms, p70: 1500, p90: 1600 ms, p99: 2200 ms), latency samples: 1820
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.590, avg: 1.504", "ConsensusProposalToOrdered: max: 0.332, avg: 0.297", "ConsensusOrderedToCommit: max: 0.375, avg: 0.361", "ConsensusProposalToCommit: max: 0.669, avg: 0.658"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.90s no progress at version 34083 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.64s no progress at version 2072014 (avg 0.62s) [limit 16].
Test Ok

Contributor

✅ Forge suite framework_upgrade success on 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e

Compatibility test results for 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e (PR)
Upgrade the nodes to version: 49a09539a905d5f4d1c562d7f4a7fe040472007e
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1313.59 txn/s, submitted: 1317.88 txn/s, failed submission: 4.29 txn/s, expired: 4.29 txn/s, latency: 2289.18 ms, (p50: 2100 ms, p70: 2400, p90: 3300 ms, p99: 5700 ms), latency samples: 116460
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1515.79 txn/s, submitted: 1517.17 txn/s, failed submission: 1.38 txn/s, expired: 1.38 txn/s, latency: 2045.93 ms, (p50: 2100 ms, p70: 2100, p90: 2700 ms, p99: 3600 ms), latency samples: 132060
5. check swarm health
Compatibility test for 3c6e693a27339e73520f41030dce8fc9cd504967 ==> 49a09539a905d5f4d1c562d7f4a7fe040472007e passed
Upgrade the remaining nodes to version: 49a09539a905d5f4d1c562d7f4a7fe040472007e
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1431.59 txn/s, submitted: 1434.89 txn/s, failed submission: 3.30 txn/s, expired: 3.30 txn/s, latency: 2122.21 ms, (p50: 2100 ms, p70: 2100, p90: 3000 ms, p99: 4200 ms), latency samples: 130100
Test Ok

@vusirikala vusirikala merged commit 7edaeaf into main Dec 10, 2024
46 checks passed
@vusirikala vusirikala deleted the satya/fix_round_manager_tests branch December 10, 2024 20:31
danielxiangzl pushed a commit that referenced this pull request Dec 12, 2024
danielxiangzl pushed a commit that referenced this pull request Dec 12, 2024
georgemitenkov pushed a commit that referenced this pull request Jan 6, 2025
3 participants