[move-stdlib] Use vector::move_range inside vector, and evaluate performance / calibrate gas #14862

Merged
merged 8 commits into from
Dec 6, 2024

@igor-aptos igor-aptos commented Oct 3, 2024

Description

Use vector::move_range inside the vector module to optimize insert, remove, append, and trim.
Extend aptos-move/e2e-benchmark/src/main.rs to track gas and gas/s, to allow for quick calibration.
Add workloads to txn-emitter so that it can be used for throughput measurements.
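
To make the first item concrete, here is a minimal Rust sketch of how a move_range-style primitive can express insert, remove, append, and trim. It models the idea on Rust's `Vec`; the function names, signatures, and the use of `drain`/`splice` are illustrative assumptions, not the Move stdlib code from this PR.

```rust
/// Illustrative model of a `move_range`-style primitive on Rust `Vec`s.
/// Removes `length` elements starting at `removal_position` from `from`
/// and inserts them into `to` at `insert_position`, preserving order.
/// Panics if either position is out of bounds.
fn move_range<T>(
    from: &mut Vec<T>,
    removal_position: usize,
    length: usize,
    to: &mut Vec<T>,
    insert_position: usize,
) {
    let moved: Vec<T> = from
        .drain(removal_position..removal_position + length)
        .collect();
    // Splicing over an empty range inserts without removing anything;
    // dropping the returned `Splice` performs the insertion.
    drop(to.splice(insert_position..insert_position, moved));
}

/// `insert`: move one element from a temporary vector into position `i`.
fn insert<T>(v: &mut Vec<T>, i: usize, e: T) {
    let mut tmp = vec![e];
    move_range(&mut tmp, 0, 1, v, i);
}

/// `remove`: move one element out into a temporary vector and return it.
fn remove<T>(v: &mut Vec<T>, i: usize) -> T {
    let mut tmp = Vec::with_capacity(1);
    move_range(v, i, 1, &mut tmp, 0);
    tmp.pop().expect("exactly one element was moved")
}

/// `append`: move all of `other` to the end of `v`, leaving `other` empty.
fn append<T>(v: &mut Vec<T>, other: &mut Vec<T>) {
    let (n, end) = (other.len(), v.len());
    move_range(other, 0, n, v, end);
}

/// `trim`: move the tail starting at `new_len` into a fresh vector.
fn trim<T>(v: &mut Vec<T>, new_len: usize) -> Vec<T> {
    let mut tail = Vec::new();
    let n = v.len() - new_len;
    move_range(v, new_len, n, &mut tail, 0);
    tail
}
```

In this model every operation is one bulk move instead of a loop of per-element swaps, which is the kind of saving the benchmark rows with a small `index` reflect.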

Additionally, add a missing replace method, which replaces the value at a particular index.
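
As a rough illustration of what a replace operation does, the following Rust sketch swaps a new value into a slot and hands back the old one. Returning the previous value is an assumption made here for illustration; the description above only says that the value at the index is replaced.

```rust
use std::mem;

/// Replace the element at `index` with `value` and return the old element.
/// Panics if `index` is out of bounds.
/// (Hypothetical helper; the return value is an assumption of this sketch.)
fn replace<T>(v: &mut Vec<T>, index: usize, value: T) -> T {
    mem::replace(&mut v[index], value)
}
```

Handing the old element back to the caller avoids needing a separate remove followed by an insert, so no other elements have to shift.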

Results of running on an extended set of parameters:
https://gist.github.com/igor-aptos/e8d4e21edcbc75dddcb9382d4e077665

Summary of the performance tests:

  • Performance doesn't depend on the size of the values (unless they are primitive), as the vector modifies only pointers to them.
  • The operation depends very little on how many elements need to be moved: moving 1000 elements (i.e. inserting into a vector at a position 1000 elements from the end) is only (!!) 2x slower than moving 1 element, end-to-end.
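
The first point can be illustrated by analogy in Rust: when a vector holds pointers to its elements, moving a range only relocates the pointers, and the payloads stay where they are. The sketch below uses `Vec<Box<T>>` as a stand-in; it is an analogy under that assumption and does not show the Move VM's actual value representation.

```rust
/// Move `length` boxed elements from `from` (starting at `start`) to the end of `to`.
/// Only the `Box` pointers are moved; the heap payloads are not copied.
fn move_boxed<T>(from: &mut Vec<Box<T>>, start: usize, length: usize, to: &mut Vec<Box<T>>) {
    to.extend(from.drain(start..start + length));
}

/// Heap address of the payload behind each box, for checking that payloads did not move.
fn payload_addrs<T>(v: &[Box<T>]) -> Vec<*const T> {
    v.iter().map(|b| &**b as *const T).collect()
}
```

Comparing `payload_addrs` before and after a call to `move_boxed` shows identical addresses, regardless of how large `T` is.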

For gas calibration, across the variety of workloads, the current implementation shows noticeable variance. After tuning the parameters to match the averages, the variance seems much smaller.
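
One simple way to quantify such variance is the coefficient of variation of the gas/s values across workloads. The Rust sketch below is an assumption about how one might measure it, not the method used in this PR.

```rust
/// Mean of a non-empty slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Coefficient of variation: population standard deviation divided by the mean.
/// A smaller value means gas charged tracks wall time more uniformly across workloads.
fn coefficient_of_variation(xs: &[f64]) -> f64 {
    let m = mean(xs);
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64;
    var.sqrt() / m
}
```

Feeding in the gas/s figures from a benchmark run before and after retuning would give two numbers that can be compared directly.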

Results table:

walltime(us)  expected(us) dif(- is impr)    exe gas/s       exe gas        io gas  entry point
       32918       6274903        -99.5%         10102     332557214             0  VectorSplitOffAppend { vec_len: 3000, element_len: 1, index: 100, repeats: 1000 }
       34050       4207452        -99.2%          9241     314656814             0  VectorRemoveInsert { vec_len: 3000, element_len: 1, index: 100, repeats: 1000 }
        5349          5372         -0.4%          8393      44895214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 999, repeats: 0 }
       40810       3914489        -99.0%          6882     280871214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 100, repeats: 2000 }
       22697       1915373        -98.8%          7176     162883214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 100, repeats: 1000 }
       15190        647029        -97.7%          9143     138883214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 700, repeats: 1000 }
       14927         44053        -66.1%          8527     127283214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 990, repeats: 1000 }
       14597         28187        -48.2%          8700     127003214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 997, repeats: 1000 }
       14469         23651        -38.8%          8772     126923214             0  VectorSplitOffAppend { vec_len: 1000, element_len: 1, index: 999, repeats: 1000 }
       20047       1277945        -98.4%          9227     184982814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 100, repeats: 1000 }
       19543        433826        -95.5%          8851     172982814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 700, repeats: 1000 }
       19510        153190        -87.3%          8661     168982814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 900, repeats: 1000 }
       19451         27317        -28.8%          8595     167182814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 990, repeats: 1000 }
       19684         21268         -7.4%          8489     167102814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 994, repeats: 1000 }
       19859         18214          9.0%          8412     167062814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 996, repeats: 1000 }
       18044         16704          8.0%          9407     169742814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 997, repeats: 1000 }
       16252         15433          5.3%          9169     149020814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 998, repeats: 1000 }
       14664         13718          6.9%          8749     128298814             0  VectorRemoveInsert { vec_len: 1000, element_len: 1, index: 999, repeats: 1000 }
       17477         17498         -0.1%          9237     161449337             0  VectorRangeMove { vec_len: 1000, element_len: 1, index: 100, move_len: 100, repeats: 1000 }
       20020         19939          0.4%          8464     169449337             0  VectorRangeMove { vec_len: 1000, element_len: 1, index: 100, move_len: 500, repeats: 1000 }
       15886         15766          0.8%          8652     137449337             0  VectorRangeMove { vec_len: 1000, element_len: 1, index: 700, move_len: 100, repeats: 1000 }
       15331         14973          2.4%          8143     124849337             0  VectorRangeMove { vec_len: 1000, element_len: 1, index: 970, move_len: 10, repeats: 1000 }
         607           611         -0.7%         10806       6559534             0  VectorSplitOffAppend { vec_len: 100, element_len: 100, index: 9, repeats: 0 }
       10139        207995        -95.1%          9088      92147534             0  VectorSplitOffAppend { vec_len: 100, element_len: 100, index: 10, repeats: 1000 }
        9855         82231        -88.0%          9106      89747534             0  VectorSplitOffAppend { vec_len: 100, element_len: 100, index: 70, repeats: 1000 }
        9853         24017        -59.0%          8999      88667534             0  VectorSplitOffAppend { vec_len: 100, element_len: 100, index: 97, repeats: 1000 }
       14262        135926        -89.5%          9146     130447134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 10, repeats: 1000 }
       14364         49920        -71.2%          8997     129247134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 70, repeats: 1000 }
       14488         22374        -35.2%          8893     128847134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 90, repeats: 1000 }
       14245         16599        -14.2%          9039     128767134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 94, repeats: 1000 }
       14361         13426          7.0%          8963     128727134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 96, repeats: 1000 }
       12559         11889          5.6%         10463     131407134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 97, repeats: 1000 }
       10977         10555          4.0%         10083     110685134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 98, repeats: 1000 }
        9592          8814          8.8%          9378      89963134             0  VectorRemoveInsert { vec_len: 100, element_len: 100, index: 99, repeats: 1000 }
        5497          5567         -1.3%          9201      50577977             0  VectorRangeMove { vec_len: 100, element_len: 100, index: 10, move_len: 10, repeats: 1000 }
        5488          5679         -3.4%          9361      51377977             0  VectorRangeMove { vec_len: 100, element_len: 100, index: 10, move_len: 50, repeats: 1000 }
        5186          5367         -3.4%          9290      48177977             0  VectorRangeMove { vec_len: 100, element_len: 100, index: 70, move_len: 10, repeats: 1000 }
        5174          5295         -2.3%          9087      47017977             0  VectorRangeMove { vec_len: 100, element_len: 100, index: 95, move_len: 2, repeats: 1000 }
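
The derived columns can be cross-checked against the raw ones. The relationships below are inferred from the numbers in the table rather than taken from the benchmark source: the "dif" column matches the relative difference between walltime and expected, and the "exe gas/s" column matches exe gas divided by walltime in microseconds, which would be consistent with exe gas being reported in units scaled by 10^6 (an assumption).

```rust
/// Percentage difference of measured vs. expected wall time; negative means faster.
fn diff_percent(walltime_us: f64, expected_us: f64) -> f64 {
    (walltime_us - expected_us) / expected_us * 100.0
}

/// Execution gas divided by wall time in microseconds.
fn gas_per_walltime(exe_gas: f64, walltime_us: f64) -> f64 {
    exe_gas / walltime_us
}
```

For the first row, `diff_percent(32918.0, 6274903.0)` is about -99.5 and `gas_per_walltime(332557214.0, 32918.0)` is about 10102, in line with the values printed in the table.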

How Has This Been Tested?

Key Areas to Review

Type of Change

  • Performance improvement

Which Components or Systems Does This Change Impact?

  • Aptos Framework

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Oct 3, 2024

⏱️ 9h 26m total CI duration on this PR
Slowest 15 Jobs Cumulative Duration Recent Runs
execution-performance / single-node-performance 4h 48m 🟥🟥🟥🟥🟥 (+3 more)
execution-performance / test-target-determinator 25m 🟩🟩🟩🟩🟩 (+2 more)
test-target-determinator 23m 🟩🟩🟩🟩🟩 (+2 more)
rust-move-unit-coverage 16m 🟩
rust-move-unit-coverage 16m 🟩
check-dynamic-deps 16m 🟩🟩🟩🟩🟩 (+5 more)
rust-move-unit-coverage 15m 🟩
rust-cargo-deny 14m 🟩🟩🟩🟩🟩 (+3 more)
rust-move-unit-coverage 11m 🟩
rust-move-unit-coverage 10m 🟩
rust-move-unit-coverage 10m 🟩
rust-move-tests 10m 🟥
rust-move-tests 10m 🟥
rust-move-tests 10m 🟩
rust-move-unit-coverage 10m 🟩

🚨 1 job on the last run was significantly faster/slower than expected

Job Duration vs 7d avg Delta
execution-performance / single-node-performance 41m 16m +149%

codecov bot commented Oct 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (igor/native_vector_move_range@b36187b). Learn more about missing BASE report.

Additional details and impacted files
@@                       Coverage Diff                        @@
##             igor/native_vector_move_range   #14862   +/-   ##
================================================================
  Coverage                                 ?    60.1%           
================================================================
  Files                                    ?      858           
  Lines                                    ?   211455           
  Branches                                 ?        0           
================================================================
  Hits                                     ?   127237           
  Misses                                   ?    84218           
  Partials                                 ?        0           

@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from 75ebf46 to d724063 Compare October 4, 2024 16:59
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from 0df908c to 0fab436 Compare October 4, 2024 17:00
@igor-aptos igor-aptos added CICD:run-execution-performance-test Run execution performance test CICD:run-execution-performance-full-test Run execution performance test (full version) labels Oct 4, 2024
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from 0fab436 to b1a2e70 Compare October 4, 2024 22:36
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from d724063 to e4540db Compare October 8, 2024 19:00
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from b1a2e70 to 6b493cd Compare October 8, 2024 19:00
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from e4540db to ae8e817 Compare October 9, 2024 20:30
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from 6b493cd to 6046016 Compare October 9, 2024 20:30
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from ae8e817 to 48df5f1 Compare October 10, 2024 00:07
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from 6046016 to c5e50b3 Compare October 10, 2024 00:08
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from 48df5f1 to 1407c00 Compare October 10, 2024 18:09
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from c5e50b3 to 36f8618 Compare October 10, 2024 18:10
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from 1407c00 to 94d1b68 Compare October 15, 2024 19:48
@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from 36f8618 to 9851f51 Compare October 15, 2024 19:48
@igor-aptos igor-aptos force-pushed the igor/native_vector_move_range branch from 94d1b68 to 51f1f1c Compare October 16, 2024 21:07

@igor-aptos igor-aptos force-pushed the igor/use_native_vector_move_range branch from afa6bd2 to 0fe0e4c Compare December 6, 2024 00:50

@igor-aptos igor-aptos enabled auto-merge (squash) December 6, 2024 01:09

github-actions bot commented Dec 6, 2024

✅ Forge suite compat success on 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988

Compatibility test results for 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988 (PR)
1. Check liveness of validators at old version: 17f4b41fb7157192dd4980b292843d84c518ea70
compatibility::simple-validator-upgrade::liveness-check : committed: 17161.14 txn/s, latency: 1980.37 ms, (p50: 2100 ms, p70: 2100, p90: 2200 ms, p99: 2700 ms), latency samples: 556040
2. Upgrading first Validator to new version: 0fe0e4c005c371eecb5f6687015856c8a5c8a988
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7200.35 txn/s, latency: 3910.86 ms, (p50: 4400 ms, p70: 4700, p90: 5200 ms, p99: 5300 ms), latency samples: 128960
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7540.38 txn/s, latency: 4338.53 ms, (p50: 4700 ms, p70: 4800, p90: 5000 ms, p99: 5100 ms), latency samples: 251480
3. Upgrading rest of first batch to new version: 0fe0e4c005c371eecb5f6687015856c8a5c8a988
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 2202.63 txn/s, latency: 13050.69 ms, (p50: 13900 ms, p70: 17400, p90: 18500 ms, p99: 18600 ms), latency samples: 56020
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6925.99 txn/s, latency: 4707.31 ms, (p50: 5000 ms, p70: 5200, p90: 5300 ms, p99: 5900 ms), latency samples: 232760
4. upgrading second batch to new version: 0fe0e4c005c371eecb5f6687015856c8a5c8a988
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 12170.89 txn/s, latency: 2280.57 ms, (p50: 2500 ms, p70: 2600, p90: 2700 ms, p99: 2800 ms), latency samples: 211120
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 12498.54 txn/s, latency: 2547.35 ms, (p50: 2600 ms, p70: 2700, p90: 3000 ms, p99: 3200 ms), latency samples: 405000
5. check swarm health
Compatibility test for 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988 passed
Test Ok

github-actions bot commented Dec 6, 2024

✅ Forge suite realistic_env_max_load success on 0fe0e4c005c371eecb5f6687015856c8a5c8a988

two traffics test: inner traffic : committed: 14669.34 txn/s, latency: 2707.34 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3300 ms), latency samples: 5577600
two traffics test : committed: 100.03 txn/s, latency: 1396.44 ms, (p50: 1400 ms, p70: 1400, p90: 1500 ms, p99: 1700 ms), latency samples: 1760
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.565, avg: 1.504", "ConsensusProposalToOrdered: max: 0.310, avg: 0.291", "ConsensusOrderedToCommit: max: 0.375, avg: 0.365", "ConsensusProposalToCommit: max: 0.664, avg: 0.656"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.72s no progress at version 22381 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.67s no progress at version 2743151 (avg 0.67s) [limit 16].
Test Ok

github-actions bot commented Dec 6, 2024

✅ Forge suite framework_upgrade success on 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988

Compatibility test results for 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988 (PR)
Upgrade the nodes to version: 0fe0e4c005c371eecb5f6687015856c8a5c8a988
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1508.13 txn/s, submitted: 1512.11 txn/s, failed submission: 3.98 txn/s, expired: 3.98 txn/s, latency: 1952.63 ms, (p50: 1800 ms, p70: 2100, p90: 2400 ms, p99: 3700 ms), latency samples: 136280
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1471.20 txn/s, submitted: 1474.54 txn/s, failed submission: 3.34 txn/s, expired: 3.34 txn/s, latency: 2030.28 ms, (p50: 2100 ms, p70: 2100, p90: 2400 ms, p99: 3800 ms), latency samples: 132220
5. check swarm health
Compatibility test for 17f4b41fb7157192dd4980b292843d84c518ea70 ==> 0fe0e4c005c371eecb5f6687015856c8a5c8a988 passed
Upgrade the remaining nodes to version: 0fe0e4c005c371eecb5f6687015856c8a5c8a988
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1397.88 txn/s, submitted: 1400.40 txn/s, failed submission: 2.52 txn/s, expired: 2.52 txn/s, latency: 2206.32 ms, (p50: 2100 ms, p70: 2400, p90: 3300 ms, p99: 4500 ms), latency samples: 121900
Test Ok

@igor-aptos igor-aptos merged commit 5902ff0 into main Dec 6, 2024
87 of 89 checks passed
@igor-aptos igor-aptos deleted the igor/use_native_vector_move_range branch December 6, 2024 02:01
danielxiangzl pushed a commit that referenced this pull request Dec 12, 2024
danielxiangzl pushed a commit that referenced this pull request Dec 12, 2024
georgemitenkov pushed a commit that referenced this pull request Jan 6, 2025
Labels
CICD:run-execution-performance-full-test Run execution performance test (full version) CICD:run-execution-performance-test Run execution performance test CICD:run-framework-upgrade-test
3 participants