[Feature] timeit.printevery #2653

Merged: 25 commits, Dec 16, 2024

vmoens (Contributor) commented Dec 15, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2653

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 12 Unrelated Failures

As of commit 4c80b85 with merge base 6482766:

NEW FAILURE - The following job has failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Dec 15, 2024
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 49831b9d3ce0951c4d69ec0817420d16ea9873e7
Pull Request resolved: #2653

github-actions bot commented Dec 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}9$.

Detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4266s 0.4262s 2.3465 Ops/s 2.2085 Ops/s $\textbf{\color{#35bf28}+6.25\%}$
test_transformed 0.6041s 0.6033s 1.6575 Ops/s 1.6030 Ops/s $\color{#35bf28}+3.40\%$
test_serial 1.3586s 1.3493s 0.7411 Ops/s 0.7327 Ops/s $\color{#35bf28}+1.15\%$
test_parallel 1.3833s 1.3096s 0.7636 Ops/s 0.7537 Ops/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[True-True-True-True-True] 0.2784ms 30.3537μs 32.9450 KOps/s 33.1016 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[True-True-True-True-False] 65.8520μs 17.9420μs 55.7350 KOps/s 57.7777 KOps/s $\color{#d91a1a}-3.54\%$
test_step_mdp_speed[True-True-True-False-True] 75.2510μs 17.9393μs 55.7434 KOps/s 59.9593 KOps/s $\textbf{\color{#d91a1a}-7.03\%}$
test_step_mdp_speed[True-True-True-False-False] 40.2350μs 10.3472μs 96.6449 KOps/s 102.6501 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_step_mdp_speed[True-True-False-True-True] 80.5610μs 32.9410μs 30.3573 KOps/s 31.7608 KOps/s $\color{#d91a1a}-4.42\%$
test_step_mdp_speed[True-True-False-True-False] 56.7960μs 19.9413μs 50.1473 KOps/s 51.4821 KOps/s $\color{#d91a1a}-2.59\%$
test_step_mdp_speed[True-True-False-False-True] 80.4800μs 19.5801μs 51.0721 KOps/s 53.6267 KOps/s $\color{#d91a1a}-4.76\%$
test_step_mdp_speed[True-True-False-False-False] 42.8300μs 12.1791μs 82.1079 KOps/s 85.7782 KOps/s $\color{#d91a1a}-4.28\%$
test_step_mdp_speed[True-False-True-True-True] 85.6370μs 34.7014μs 28.8172 KOps/s 29.7823 KOps/s $\color{#d91a1a}-3.24\%$
test_step_mdp_speed[True-False-True-True-False] 49.4330μs 21.5497μs 46.4044 KOps/s 47.7034 KOps/s $\color{#d91a1a}-2.72\%$
test_step_mdp_speed[True-False-True-False-True] 50.6250μs 19.5627μs 51.1176 KOps/s 54.5500 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_step_mdp_speed[True-False-True-False-False] 61.1240μs 12.1634μs 82.2137 KOps/s 85.5975 KOps/s $\color{#d91a1a}-3.95\%$
test_step_mdp_speed[True-False-False-True-True] 73.1260μs 36.2882μs 27.5571 KOps/s 28.6430 KOps/s $\color{#d91a1a}-3.79\%$
test_step_mdp_speed[True-False-False-True-False] 75.1810μs 23.3680μs 42.7935 KOps/s 43.4264 KOps/s $\color{#d91a1a}-1.46\%$
test_step_mdp_speed[True-False-False-False-True] 77.2150μs 21.3505μs 46.8372 KOps/s 49.3139 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_step_mdp_speed[True-False-False-False-False] 43.7820μs 14.1393μs 70.7247 KOps/s 74.7793 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_step_mdp_speed[False-True-True-True-True] 84.6380μs 34.9565μs 28.6070 KOps/s 29.8117 KOps/s $\color{#d91a1a}-4.04\%$
test_step_mdp_speed[False-True-True-True-False] 52.3880μs 21.9121μs 45.6369 KOps/s 47.0606 KOps/s $\color{#d91a1a}-3.03\%$
test_step_mdp_speed[False-True-True-False-True] 67.5760μs 22.0399μs 45.3723 KOps/s 47.2608 KOps/s $\color{#d91a1a}-4.00\%$
test_step_mdp_speed[False-True-True-False-False] 71.0530μs 13.5890μs 73.5890 KOps/s 77.1805 KOps/s $\color{#d91a1a}-4.65\%$
test_step_mdp_speed[False-True-False-True-True] 81.1810μs 36.7294μs 27.2262 KOps/s 28.4397 KOps/s $\color{#d91a1a}-4.27\%$
test_step_mdp_speed[False-True-False-True-False] 65.4030μs 23.5729μs 42.4215 KOps/s 43.8539 KOps/s $\color{#d91a1a}-3.27\%$
test_step_mdp_speed[False-True-False-False-True] 2.6584ms 23.7697μs 42.0704 KOps/s 43.6018 KOps/s $\color{#d91a1a}-3.51\%$
test_step_mdp_speed[False-True-False-False-False] 58.1690μs 15.2717μs 65.4806 KOps/s 68.1025 KOps/s $\color{#d91a1a}-3.85\%$
test_step_mdp_speed[False-False-True-True-True] 91.2600μs 38.3705μs 26.0617 KOps/s 27.3390 KOps/s $\color{#d91a1a}-4.67\%$
test_step_mdp_speed[False-False-True-True-False] 52.6590μs 25.5694μs 39.1093 KOps/s 40.9971 KOps/s $\color{#d91a1a}-4.60\%$
test_step_mdp_speed[False-False-True-False-True] 78.1860μs 23.7391μs 42.1246 KOps/s 43.0975 KOps/s $\color{#d91a1a}-2.26\%$
test_step_mdp_speed[False-False-True-False-False] 42.1390μs 15.5297μs 64.3927 KOps/s 68.5318 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_step_mdp_speed[False-False-False-True-True] 90.6900μs 39.7362μs 25.1660 KOps/s 26.2830 KOps/s $\color{#d91a1a}-4.25\%$
test_step_mdp_speed[False-False-False-True-False] 74.9500μs 27.0001μs 37.0369 KOps/s 38.0871 KOps/s $\color{#d91a1a}-2.76\%$
test_step_mdp_speed[False-False-False-False-True] 72.5750μs 25.3414μs 39.4611 KOps/s 41.7162 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_step_mdp_speed[False-False-False-False-False] 69.6400μs 16.9340μs 59.0526 KOps/s 61.9346 KOps/s $\color{#d91a1a}-4.65\%$
test_values[generalized_advantage_estimate-True-True] 9.7586ms 9.4040ms 106.3381 Ops/s 104.1806 Ops/s $\color{#35bf28}+2.07\%$
test_values[vec_generalized_advantage_estimate-True-True] 41.6512ms 35.7554ms 27.9678 Ops/s 28.3363 Ops/s $\color{#d91a1a}-1.30\%$
test_values[td0_return_estimate-False-False] 0.2361ms 0.1756ms 5.6942 KOps/s 5.6521 KOps/s $\color{#35bf28}+0.74\%$
test_values[td1_return_estimate-False-False] 25.3518ms 23.3900ms 42.7532 Ops/s 41.8992 Ops/s $\color{#35bf28}+2.04\%$
test_values[vec_td1_return_estimate-False-False] 37.5981ms 35.4486ms 28.2099 Ops/s 28.2792 Ops/s $\color{#d91a1a}-0.25\%$
test_values[td_lambda_return_estimate-True-False] 33.9778ms 33.3614ms 29.9748 Ops/s 29.4673 Ops/s $\color{#35bf28}+1.72\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.5957ms 35.4368ms 28.2193 Ops/s 28.2957 Ops/s $\color{#d91a1a}-0.27\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.1271ms 8.2584ms 121.0883 Ops/s 120.1054 Ops/s $\color{#35bf28}+0.82\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2608ms 1.8891ms 529.3463 Ops/s 493.2744 Ops/s $\textbf{\color{#35bf28}+7.31\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4323ms 0.3530ms 2.8328 KOps/s 2.8334 KOps/s $\color{#d91a1a}-0.02\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 51.2291ms 49.7552ms 20.0984 Ops/s 20.5342 Ops/s $\color{#d91a1a}-2.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9391ms 3.0341ms 329.5901 Ops/s 332.2278 Ops/s $\color{#d91a1a}-0.79\%$
test_dqn_speed[False-None] 5.7144ms 1.3786ms 725.3752 Ops/s 714.2096 Ops/s $\color{#35bf28}+1.56\%$
test_dqn_speed[False-backward] 1.8982ms 1.8363ms 544.5755 Ops/s 538.5346 Ops/s $\color{#35bf28}+1.12\%$
test_dqn_speed[True-None] 0.7061ms 0.4595ms 2.1763 KOps/s 2.1254 KOps/s $\color{#35bf28}+2.39\%$
test_dqn_speed[True-backward] 0.9081ms 0.8694ms 1.1502 KOps/s 944.5591 Ops/s $\textbf{\color{#35bf28}+21.77\%}$
test_dqn_speed[reduce-overhead-None] 0.8386ms 0.4688ms 2.1331 KOps/s 2.1419 KOps/s $\color{#d91a1a}-0.41\%$
test_dqn_speed[reduce-overhead-backward] 0.9171ms 0.8698ms 1.1496 KOps/s 1.0621 KOps/s $\textbf{\color{#35bf28}+8.24\%}$
test_ddpg_speed[False-None] 3.6306ms 2.8475ms 351.1836 Ops/s 352.4430 Ops/s $\color{#d91a1a}-0.36\%$
test_ddpg_speed[False-backward] 4.1727ms 3.9486ms 253.2575 Ops/s 254.4411 Ops/s $\color{#d91a1a}-0.47\%$
test_ddpg_speed[True-None] 1.4036ms 0.9973ms 1.0027 KOps/s 991.5278 Ops/s $\color{#35bf28}+1.13\%$
test_ddpg_speed[True-backward] 2.0337ms 1.9019ms 525.7820 Ops/s 527.9660 Ops/s $\color{#d91a1a}-0.41\%$
test_ddpg_speed[reduce-overhead-None] 1.4912ms 0.9943ms 1.0058 KOps/s 1.0002 KOps/s $\color{#35bf28}+0.55\%$
test_ddpg_speed[reduce-overhead-backward] 1.9317ms 1.8684ms 535.2261 Ops/s 532.4894 Ops/s $\color{#35bf28}+0.51\%$
test_sac_speed[False-None] 9.3214ms 7.9116ms 126.3968 Ops/s 125.4418 Ops/s $\color{#35bf28}+0.76\%$
test_sac_speed[False-backward] 11.3794ms 10.5655ms 94.6478 Ops/s 94.3066 Ops/s $\color{#35bf28}+0.36\%$
test_sac_speed[True-None] 2.2609ms 1.8171ms 550.3249 Ops/s 549.0342 Ops/s $\color{#35bf28}+0.24\%$
test_sac_speed[True-backward] 3.5869ms 3.5083ms 285.0408 Ops/s 286.5594 Ops/s $\color{#d91a1a}-0.53\%$
test_sac_speed[reduce-overhead-None] 2.1048ms 1.8214ms 549.0280 Ops/s 547.1113 Ops/s $\color{#35bf28}+0.35\%$
test_sac_speed[reduce-overhead-backward] 3.5554ms 3.5121ms 284.7298 Ops/s 284.6936 Ops/s $\color{#35bf28}+0.01\%$
test_redq_speed[False-None] 19.0289ms 13.1681ms 75.9410 Ops/s 77.3692 Ops/s $\color{#d91a1a}-1.85\%$
test_redq_speed[False-backward] 22.6823ms 21.9627ms 45.5317 Ops/s 45.5941 Ops/s $\color{#d91a1a}-0.14\%$
test_redq_speed[True-None] 7.3203ms 4.7047ms 212.5526 Ops/s 211.1052 Ops/s $\color{#35bf28}+0.69\%$
test_redq_speed[True-backward] 13.3957ms 12.0901ms 82.7126 Ops/s 82.6410 Ops/s $\color{#35bf28}+0.09\%$
test_redq_speed[reduce-overhead-None] 5.1626ms 4.6268ms 216.1340 Ops/s 218.3299 Ops/s $\color{#d91a1a}-1.01\%$
test_redq_speed[reduce-overhead-backward] 13.5099ms 12.1995ms 81.9707 Ops/s 84.0497 Ops/s $\color{#d91a1a}-2.47\%$
test_redq_deprec_speed[False-None] 15.2157ms 12.8228ms 77.9861 Ops/s 78.2808 Ops/s $\color{#d91a1a}-0.38\%$
test_redq_deprec_speed[False-backward] 20.9873ms 18.3527ms 54.4878 Ops/s 53.9009 Ops/s $\color{#35bf28}+1.09\%$
test_redq_deprec_speed[True-None] 4.4031ms 3.5998ms 277.7961 Ops/s 271.9453 Ops/s $\color{#35bf28}+2.15\%$
test_redq_deprec_speed[True-backward] 9.6101ms 8.0989ms 123.4731 Ops/s 126.4057 Ops/s $\color{#d91a1a}-2.32\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9541ms 3.5738ms 279.8109 Ops/s 279.5495 Ops/s $\color{#35bf28}+0.09\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.3378ms 8.3631ms 119.5735 Ops/s 122.7054 Ops/s $\color{#d91a1a}-2.55\%$
test_td3_speed[False-None] 8.5173ms 7.9353ms 126.0193 Ops/s 125.4494 Ops/s $\color{#35bf28}+0.45\%$
test_td3_speed[False-backward] 10.4554ms 10.2381ms 97.6741 Ops/s 96.2591 Ops/s $\color{#35bf28}+1.47\%$
test_td3_speed[True-None] 1.9589ms 1.7305ms 577.8739 Ops/s 583.2181 Ops/s $\color{#d91a1a}-0.92\%$
test_td3_speed[True-backward] 3.4269ms 3.3446ms 298.9881 Ops/s 295.5955 Ops/s $\color{#35bf28}+1.15\%$
test_td3_speed[reduce-overhead-None] 1.9722ms 1.7220ms 580.7201 Ops/s 578.7236 Ops/s $\color{#35bf28}+0.34\%$
test_td3_speed[reduce-overhead-backward] 4.0684ms 3.3854ms 295.3857 Ops/s 302.1489 Ops/s $\color{#d91a1a}-2.24\%$
test_cql_speed[False-None] 37.6304ms 36.3771ms 27.4898 Ops/s 27.2670 Ops/s $\color{#35bf28}+0.82\%$
test_cql_speed[False-backward] 51.1760ms 46.9825ms 21.2845 Ops/s 21.2623 Ops/s $\color{#35bf28}+0.10\%$
test_cql_speed[True-None] 16.7465ms 15.5162ms 64.4488 Ops/s 64.5508 Ops/s $\color{#d91a1a}-0.16\%$
test_cql_speed[True-backward] 24.6187ms 22.5266ms 44.3920 Ops/s 44.8001 Ops/s $\color{#d91a1a}-0.91\%$
test_cql_speed[reduce-overhead-None] 16.5074ms 15.6320ms 63.9714 Ops/s 64.1055 Ops/s $\color{#d91a1a}-0.21\%$
test_cql_speed[reduce-overhead-backward] 23.5807ms 22.2528ms 44.9382 Ops/s 45.2745 Ops/s $\color{#d91a1a}-0.74\%$
test_a2c_speed[False-None] 8.9002ms 7.1478ms 139.9025 Ops/s 137.8062 Ops/s $\color{#35bf28}+1.52\%$
test_a2c_speed[False-backward] 15.5898ms 14.1319ms 70.7620 Ops/s 70.6830 Ops/s $\color{#35bf28}+0.11\%$
test_a2c_speed[True-None] 4.9885ms 4.1996ms 238.1152 Ops/s 236.8043 Ops/s $\color{#35bf28}+0.55\%$
test_a2c_speed[True-backward] 11.0087ms 10.6641ms 93.7723 Ops/s 93.3858 Ops/s $\color{#35bf28}+0.41\%$
test_a2c_speed[reduce-overhead-None] 4.8438ms 4.1746ms 239.5431 Ops/s 237.9633 Ops/s $\color{#35bf28}+0.66\%$
test_a2c_speed[reduce-overhead-backward] 11.9167ms 10.6636ms 93.7767 Ops/s 93.5333 Ops/s $\color{#35bf28}+0.26\%$
test_ppo_speed[False-None] 8.8359ms 7.3839ms 135.4291 Ops/s 132.7441 Ops/s $\color{#35bf28}+2.02\%$
test_ppo_speed[False-backward] 17.6107ms 14.6310ms 68.3482 Ops/s 67.3652 Ops/s $\color{#35bf28}+1.46\%$
test_ppo_speed[True-None] 4.2888ms 3.6590ms 273.2961 Ops/s 270.0923 Ops/s $\color{#35bf28}+1.19\%$
test_ppo_speed[True-backward] 9.9065ms 9.5431ms 104.7877 Ops/s 103.1501 Ops/s $\color{#35bf28}+1.59\%$
test_ppo_speed[reduce-overhead-None] 4.1229ms 3.6663ms 272.7510 Ops/s 269.6226 Ops/s $\color{#35bf28}+1.16\%$
test_ppo_speed[reduce-overhead-backward] 9.8839ms 9.5314ms 104.9163 Ops/s 104.1510 Ops/s $\color{#35bf28}+0.73\%$
test_reinforce_speed[False-None] 7.2091ms 6.4522ms 154.9857 Ops/s 153.2483 Ops/s $\color{#35bf28}+1.13\%$
test_reinforce_speed[False-backward] 11.7979ms 9.7018ms 103.0731 Ops/s 102.4100 Ops/s $\color{#35bf28}+0.65\%$
test_reinforce_speed[True-None] 3.2507ms 2.6193ms 381.7785 Ops/s 373.5560 Ops/s $\color{#35bf28}+2.20\%$
test_reinforce_speed[True-backward] 8.8965ms 8.5185ms 117.3911 Ops/s 116.8503 Ops/s $\color{#35bf28}+0.46\%$
test_reinforce_speed[reduce-overhead-None] 4.6518ms 2.6250ms 380.9596 Ops/s 375.0899 Ops/s $\color{#35bf28}+1.56\%$
test_reinforce_speed[reduce-overhead-backward] 9.0479ms 8.5045ms 117.5845 Ops/s 115.9844 Ops/s $\color{#35bf28}+1.38\%$
test_iql_speed[False-None] 33.8947ms 31.8649ms 31.3825 Ops/s 31.5225 Ops/s $\color{#d91a1a}-0.44\%$
test_iql_speed[False-backward] 46.5626ms 44.5409ms 22.4513 Ops/s 22.5097 Ops/s $\color{#d91a1a}-0.26\%$
test_iql_speed[True-None] 12.0401ms 10.3972ms 96.1795 Ops/s 94.7125 Ops/s $\color{#35bf28}+1.55\%$
test_iql_speed[True-backward] 23.5188ms 21.4639ms 46.5898 Ops/s 46.6036 Ops/s $\color{#d91a1a}-0.03\%$
test_iql_speed[reduce-overhead-None] 11.6082ms 10.3995ms 96.1588 Ops/s 94.2232 Ops/s $\color{#35bf28}+2.05\%$
test_iql_speed[reduce-overhead-backward] 22.3536ms 21.3359ms 46.8695 Ops/s 46.3433 Ops/s $\color{#35bf28}+1.14\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0903ms 4.7897ms 208.7811 Ops/s 200.8242 Ops/s $\color{#35bf28}+3.96\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8853ms 0.5028ms 1.9889 KOps/s 1.9536 KOps/s $\color{#35bf28}+1.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6976ms 0.4729ms 2.1148 KOps/s 2.0768 KOps/s $\color{#35bf28}+1.83\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3515ms 4.5716ms 218.7402 Ops/s 210.1908 Ops/s $\color{#35bf28}+4.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3484ms 0.4932ms 2.0275 KOps/s 2.0199 KOps/s $\color{#35bf28}+0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8310ms 0.4638ms 2.1560 KOps/s 2.1305 KOps/s $\color{#35bf28}+1.20\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8994ms 1.5991ms 625.3550 Ops/s 616.0078 Ops/s $\color{#35bf28}+1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3813ms 1.5587ms 641.5664 Ops/s 642.2117 Ops/s $\color{#d91a1a}-0.10\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2880ms 4.8594ms 205.7853 Ops/s 202.0742 Ops/s $\color{#35bf28}+1.84\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.6962ms 0.6428ms 1.5558 KOps/s 1.5548 KOps/s $\color{#35bf28}+0.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9559ms 0.6092ms 1.6416 KOps/s 1.6158 KOps/s $\color{#35bf28}+1.59\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9411ms 4.7069ms 212.4561 Ops/s 211.0744 Ops/s $\color{#35bf28}+0.65\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5176ms 0.5010ms 1.9958 KOps/s 1.9790 KOps/s $\color{#35bf28}+0.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7749ms 0.4811ms 2.0785 KOps/s 2.0579 KOps/s $\color{#35bf28}+1.00\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0962ms 4.6708ms 214.0979 Ops/s 209.2573 Ops/s $\color{#35bf28}+2.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7220ms 0.4875ms 2.0513 KOps/s 2.0270 KOps/s $\color{#35bf28}+1.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8312ms 0.4753ms 2.1038 KOps/s 2.1472 KOps/s $\color{#d91a1a}-2.02\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.5779ms 4.8371ms 206.7341 Ops/s 204.4848 Ops/s $\color{#35bf28}+1.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0939ms 0.6308ms 1.5853 KOps/s 1.5474 KOps/s $\color{#35bf28}+2.45\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8976ms 0.6092ms 1.6415 KOps/s 1.6355 KOps/s $\color{#35bf28}+0.37\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4897ms 4.1754ms 239.4967 Ops/s 240.2414 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.9865ms 2.2773ms 439.1089 Ops/s 408.4470 Ops/s $\textbf{\color{#35bf28}+7.51\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.8504ms 1.2860ms 777.5893 Ops/s 767.7793 Ops/s $\color{#35bf28}+1.28\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3891s 11.9669ms 83.5636 Ops/s 242.8847 Ops/s $\textbf{\color{#d91a1a}-65.60\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.0597ms 2.3374ms 427.8170 Ops/s 434.9225 Ops/s $\color{#d91a1a}-1.63\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.7083ms 1.3059ms 765.7451 Ops/s 857.4801 Ops/s $\textbf{\color{#d91a1a}-10.70\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.9603ms 4.3815ms 228.2347 Ops/s 239.4324 Ops/s $\color{#d91a1a}-4.68\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.4252ms 2.4498ms 408.1948 Ops/s 371.9516 Ops/s $\textbf{\color{#35bf28}+9.74\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.6785ms 1.5168ms 659.3044 Ops/s 686.1491 Ops/s $\color{#d91a1a}-3.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3089ms 11.4136ms 87.6152 Ops/s 84.3439 Ops/s $\color{#35bf28}+3.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.7824ms 14.7303ms 67.8873 Ops/s 67.0359 Ops/s $\color{#35bf28}+1.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 28.3324ms 20.4524ms 48.8941 Ops/s 48.8692 Ops/s $\color{#35bf28}+0.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.9830ms 15.0912ms 66.2639 Ops/s 66.7646 Ops/s $\color{#d91a1a}-0.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.7817ms 20.0145ms 49.9639 Ops/s 47.6840 Ops/s $\color{#35bf28}+4.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.9642ms 16.2870ms 61.3986 Ops/s 60.0805 Ops/s $\color{#35bf28}+2.19\%$
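
For readers checking the table by hand, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bold entries (which match the Improved/Worsened totals above) appear to be those beyond roughly ±5%. The sketch below reproduces a few rows under those assumptions. The function names and the 5% threshold are illustrative and inferred from the table, not taken from the benchmark bot's code.

```python
# Assumption: throughput units seen in the table and their scale to Ops/s.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}


def to_ops(value: float, unit: str) -> float:
    """Normalize a throughput reading to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred from the bold rows."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# Rows copied from the CPU table: (name, ops, unit, ops_head, unit_head).
rows = [
    ("test_simple", 2.3465, "Ops/s", 2.2085, "Ops/s"),
    ("test_transformed", 1.6575, "Ops/s", 1.6030, "Ops/s"),
    ("test_dqn_speed[True-backward]", 1.1502, "KOps/s", 944.5591, "Ops/s"),
]

for name, ops, unit, ops_head, unit_head in rows:
    pct = relative_change(to_ops(ops, unit), to_ops(ops_head, unit_head))
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Note that rows mixing KOps/s and Ops/s need unit normalization before comparing, as in the `test_dqn_speed[True-backward]` row.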


github-actions bot commented Dec 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}9$.

Detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7568s 0.7505s 1.3324 Ops/s 1.2958 Ops/s $\color{#35bf28}+2.82\%$
test_transformed 1.0012s 1.0010s 0.9990 Ops/s 0.9983 Ops/s $\color{#35bf28}+0.07\%$
test_serial 2.1482s 2.1366s 0.4680 Ops/s 0.4650 Ops/s $\color{#35bf28}+0.65\%$
test_parallel 2.0366s 2.0020s 0.4995 Ops/s 0.5141 Ops/s $\color{#d91a1a}-2.83\%$
test_step_mdp_speed[True-True-True-True-True] 0.1710ms 39.3937μs 25.3848 KOps/s 25.2880 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-True-True-True-False] 56.8400μs 22.6017μs 44.2445 KOps/s 44.0014 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-True-False-True] 52.2210μs 21.4581μs 46.6023 KOps/s 46.5309 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[True-True-True-False-False] 44.8610μs 12.7711μs 78.3016 KOps/s 78.5620 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-False-True-True] 84.0410μs 43.1167μs 23.1929 KOps/s 23.7601 KOps/s $\color{#d91a1a}-2.39\%$
test_step_mdp_speed[True-True-False-True-False] 54.7410μs 24.7806μs 40.3542 KOps/s 40.7866 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[True-True-False-False-True] 55.5510μs 24.6149μs 40.6258 KOps/s 41.8898 KOps/s $\color{#d91a1a}-3.02\%$
test_step_mdp_speed[True-True-False-False-False] 45.7500μs 15.0324μs 66.5230 KOps/s 66.9499 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-True-True-True] 85.8720μs 44.8159μs 22.3135 KOps/s 22.6371 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-False-True-True-False] 57.5300μs 26.8907μs 37.1875 KOps/s 37.2949 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-False-True-False-True] 54.4800μs 24.3936μs 40.9944 KOps/s 41.7017 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[True-False-True-False-False] 49.8200μs 14.8346μs 67.4100 KOps/s 67.0157 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[True-False-False-True-True] 92.5010μs 46.7291μs 21.4000 KOps/s 21.4637 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-False-False-True-False] 59.5510μs 29.0171μs 34.4624 KOps/s 34.0914 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-False-False-False-True] 53.2010μs 26.7359μs 37.4028 KOps/s 38.3120 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-False-False-False-False] 44.5410μs 17.3543μs 57.6226 KOps/s 59.0721 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[False-True-True-True-True] 77.4910μs 44.1566μs 22.6467 KOps/s 22.5640 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-True-True-True-False] 58.8700μs 27.3517μs 36.5607 KOps/s 36.7487 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-True-False-True] 55.3710μs 27.6727μs 36.1367 KOps/s 36.5472 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[False-True-True-False-False] 57.5610μs 16.9637μs 58.9494 KOps/s 60.7277 KOps/s $\color{#d91a1a}-2.93\%$
test_step_mdp_speed[False-True-False-True-True] 90.4420μs 46.0551μs 21.7131 KOps/s 21.5566 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-True-False-True-False] 64.0510μs 29.4544μs 33.9508 KOps/s 35.5276 KOps/s $\color{#d91a1a}-4.44\%$
test_step_mdp_speed[False-True-False-False-True] 3.1874ms 30.8512μs 32.4136 KOps/s 33.3815 KOps/s $\color{#d91a1a}-2.90\%$
test_step_mdp_speed[False-True-False-False-False] 49.1100μs 19.1048μs 52.3427 KOps/s 53.7421 KOps/s $\color{#d91a1a}-2.60\%$
test_step_mdp_speed[False-False-True-True-True] 88.1410μs 49.0870μs 20.3720 KOps/s 20.1633 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[False-False-True-True-False] 0.4005ms 30.7833μs 32.4851 KOps/s 31.3467 KOps/s $\color{#35bf28}+3.63\%$
test_step_mdp_speed[False-False-True-False-True] 54.4710μs 30.3422μs 32.9574 KOps/s 33.4270 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[False-False-True-False-False] 0.3943ms 19.2764μs 51.8768 KOps/s 53.8460 KOps/s $\color{#d91a1a}-3.66\%$
test_step_mdp_speed[False-False-False-True-True] 0.4422ms 50.7197μs 19.7162 KOps/s 19.7140 KOps/s $\color{#35bf28}+0.01\%$
test_step_mdp_speed[False-False-False-True-False] 0.4098ms 33.7469μs 29.6323 KOps/s 29.6745 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[False-False-False-False-True] 0.4389ms 31.7245μs 31.5214 KOps/s 31.8297 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[False-False-False-False-False] 61.3610μs 20.6480μs 48.4309 KOps/s 48.6492 KOps/s $\color{#d91a1a}-0.45\%$
test_values[generalized_advantage_estimate-True-True] 24.3213ms 23.9827ms 41.6968 Ops/s 41.3130 Ops/s $\color{#35bf28}+0.93\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1057s 3.0064ms 332.6254 Ops/s 356.2518 Ops/s $\textbf{\color{#d91a1a}-6.63\%}$
test_values[td0_return_estimate-False-False] 0.1033ms 79.8119μs 12.5295 KOps/s 12.4863 KOps/s $\color{#35bf28}+0.35\%$
test_values[td1_return_estimate-False-False] 54.0160ms 53.6132ms 18.6521 Ops/s 18.5893 Ops/s $\color{#35bf28}+0.34\%$
test_values[vec_td1_return_estimate-False-False] 1.3351ms 1.0764ms 929.0047 Ops/s 931.4892 Ops/s $\color{#d91a1a}-0.27\%$
test_values[td_lambda_return_estimate-True-False] 87.3484ms 85.3119ms 11.7217 Ops/s 11.7005 Ops/s $\color{#35bf28}+0.18\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3147ms 1.0687ms 935.7153 Ops/s 929.3070 Ops/s $\color{#35bf28}+0.69\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.4532ms 24.6616ms 40.5488 Ops/s 41.4864 Ops/s $\color{#d91a1a}-2.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0316ms 0.7464ms 1.3398 KOps/s 1.3315 KOps/s $\color{#35bf28}+0.63\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7644ms 0.6614ms 1.5120 KOps/s 1.5040 KOps/s $\color{#35bf28}+0.53\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5626ms 1.4791ms 676.1039 Ops/s 676.0979 Ops/s $+0.00\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7525ms 0.6756ms 1.4801 KOps/s 1.4627 KOps/s $\color{#35bf28}+1.19\%$
test_dqn_speed[False-None] 6.9612ms 1.5139ms 660.5354 Ops/s 667.4846 Ops/s $\color{#d91a1a}-1.04\%$
test_dqn_speed[False-backward] 2.1828ms 2.0956ms 477.1925 Ops/s 478.4621 Ops/s $\color{#d91a1a}-0.27\%$
test_dqn_speed[True-None] 0.9570ms 0.5359ms 1.8662 KOps/s 1.8385 KOps/s $\color{#35bf28}+1.50\%$
test_dqn_speed[True-backward] 1.2196ms 1.1858ms 843.2790 Ops/s 822.2642 Ops/s $\color{#35bf28}+2.56\%$
test_dqn_speed[reduce-overhead-None] 0.9418ms 0.5510ms 1.8147 KOps/s 1.7949 KOps/s $\color{#35bf28}+1.10\%$
test_dqn_speed[reduce-overhead-backward] 1.1065ms 1.0604ms 943.0337 Ops/s 932.6142 Ops/s $\color{#35bf28}+1.12\%$
test_ddpg_speed[False-None] 3.1951ms 2.8351ms 352.7155 Ops/s 350.5953 Ops/s $\color{#35bf28}+0.60\%$
test_ddpg_speed[False-backward] 4.5272ms 4.1545ms 240.7015 Ops/s 239.6415 Ops/s $\color{#35bf28}+0.44\%$
test_ddpg_speed[True-None] 1.4682ms 1.0783ms 927.3743 Ops/s 910.1788 Ops/s $\color{#35bf28}+1.89\%$
test_ddpg_speed[True-backward] 2.3969ms 2.2840ms 437.8367 Ops/s 432.3214 Ops/s $\color{#35bf28}+1.28\%$
test_ddpg_speed[reduce-overhead-None] 1.4654ms 1.0796ms 926.3086 Ops/s 907.0254 Ops/s $\color{#35bf28}+2.13\%$
test_ddpg_speed[reduce-overhead-backward] 1.7941ms 1.7529ms 570.4848 Ops/s 559.1052 Ops/s $\color{#35bf28}+2.04\%$
test_sac_speed[False-None] 8.4687ms 7.9808ms 125.3009 Ops/s 124.7757 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[False-backward] 11.8259ms 11.1265ms 89.8758 Ops/s 89.2365 Ops/s $\color{#35bf28}+0.72\%$
test_sac_speed[True-None] 1.7037ms 1.5356ms 651.2072 Ops/s 640.8848 Ops/s $\color{#35bf28}+1.61\%$
test_sac_speed[True-backward] 3.6949ms 3.2287ms 309.7184 Ops/s 290.9764 Ops/s $\textbf{\color{#35bf28}+6.44\%}$
test_sac_speed[reduce-overhead-None] 23.3510ms 12.7090ms 78.6846 Ops/s 80.5065 Ops/s $\color{#d91a1a}-2.26\%$
test_sac_speed[reduce-overhead-backward] 1.3859ms 1.3168ms 759.3888 Ops/s 670.8325 Ops/s $\textbf{\color{#35bf28}+13.20\%}$
test_redq_speed[False-None] 8.0933ms 7.4238ms 134.7015 Ops/s 134.0026 Ops/s $\color{#35bf28}+0.52\%$
test_redq_speed[False-backward] 11.8077ms 11.0876ms 90.1911 Ops/s 86.2691 Ops/s $\color{#35bf28}+4.55\%$
test_redq_speed[True-None] 2.1374ms 2.0126ms 496.8713 Ops/s 496.5598 Ops/s $\color{#35bf28}+0.06\%$
test_redq_speed[True-backward] 4.2675ms 3.8438ms 260.1588 Ops/s 257.7831 Ops/s $\color{#35bf28}+0.92\%$
test_redq_speed[reduce-overhead-None] 2.0797ms 1.9870ms 503.2692 Ops/s 499.5020 Ops/s $\color{#35bf28}+0.75\%$
test_redq_speed[reduce-overhead-backward] 4.1531ms 3.8355ms 260.7244 Ops/s 253.2222 Ops/s $\color{#35bf28}+2.96\%$
test_redq_deprec_speed[False-None] 9.8723ms 8.9844ms 111.3037 Ops/s 110.2570 Ops/s $\color{#35bf28}+0.95\%$
test_redq_deprec_speed[False-backward] 12.6588ms 12.0888ms 82.7209 Ops/s 81.5620 Ops/s $\color{#35bf28}+1.42\%$
test_redq_deprec_speed[True-None] 2.4091ms 2.3189ms 431.2413 Ops/s 427.8462 Ops/s $\color{#35bf28}+0.79\%$
test_redq_deprec_speed[True-backward] 4.6028ms 4.1775ms 239.3770 Ops/s 248.9733 Ops/s $\color{#d91a1a}-3.85\%$
test_redq_deprec_speed[reduce-overhead-None] 2.3978ms 2.3166ms 431.6707 Ops/s 428.6456 Ops/s $\color{#35bf28}+0.71\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6336ms 4.1655ms 240.0678 Ops/s 249.5537 Ops/s $\color{#d91a1a}-3.80\%$
test_td3_speed[False-None] 8.0921ms 7.8625ms 127.1868 Ops/s 127.0621 Ops/s $\color{#35bf28}+0.10\%$
test_td3_speed[False-backward] 10.8187ms 10.3396ms 96.7152 Ops/s 98.8143 Ops/s $\color{#d91a1a}-2.12\%$
test_td3_speed[True-None] 1.6001ms 1.5815ms 632.3004 Ops/s 628.8003 Ops/s $\color{#35bf28}+0.56\%$
test_td3_speed[True-backward] 3.6880ms 3.2735ms 305.4867 Ops/s 318.3516 Ops/s $\color{#d91a1a}-4.04\%$
test_td3_speed[reduce-overhead-None] 82.4503ms 26.3227ms 37.9901 Ops/s 37.0262 Ops/s $\color{#35bf28}+2.60\%$
test_td3_speed[reduce-overhead-backward] 1.5270ms 1.4538ms 687.8379 Ops/s 780.4283 Ops/s $\textbf{\color{#d91a1a}-11.86\%}$
test_cql_speed[False-None] 17.4750ms 16.6034ms 60.2286 Ops/s 59.2413 Ops/s $\color{#35bf28}+1.67\%$
test_cql_speed[False-backward] 22.5741ms 21.9571ms 45.5433 Ops/s 45.6846 Ops/s $\color{#d91a1a}-0.31\%$
test_cql_speed[True-None] 3.0121ms 2.9393ms 340.2184 Ops/s 334.6452 Ops/s $\color{#35bf28}+1.67\%$
test_cql_speed[True-backward] 5.8240ms 5.1718ms 193.3552 Ops/s 187.0576 Ops/s $\color{#35bf28}+3.37\%$
test_cql_speed[reduce-overhead-None] 21.5983ms 13.2069ms 75.7178 Ops/s 76.3791 Ops/s $\color{#d91a1a}-0.87\%$
test_cql_speed[reduce-overhead-backward] 1.5537ms 1.4895ms 671.3709 Ops/s 658.8270 Ops/s $\color{#35bf28}+1.90\%$
test_a2c_speed[False-None] 3.3105ms 3.1582ms 316.6317 Ops/s 311.5682 Ops/s $\color{#35bf28}+1.63\%$
test_a2c_speed[False-backward] 6.6115ms 5.9759ms 167.3402 Ops/s 165.9718 Ops/s $\color{#35bf28}+0.82\%$
test_a2c_speed[True-None] 1.0606ms 0.9956ms 1.0044 KOps/s 978.5186 Ops/s $\color{#35bf28}+2.64\%$
test_a2c_speed[True-backward] 2.5991ms 2.5638ms 390.0522 Ops/s 378.3862 Ops/s $\color{#35bf28}+3.08\%$
test_a2c_speed[reduce-overhead-None] 21.3164ms 11.6414ms 85.9004 Ops/s 87.1648 Ops/s $\color{#d91a1a}-1.45\%$
test_a2c_speed[reduce-overhead-backward] 1.0178ms 0.9608ms 1.0408 KOps/s 1.0208 KOps/s $\color{#35bf28}+1.96\%$
test_ppo_speed[False-None] 3.8920ms 3.7794ms 264.5933 Ops/s 274.1093 Ops/s $\color{#d91a1a}-3.47\%$
test_ppo_speed[False-backward] 7.1573ms 6.7058ms 149.1254 Ops/s 148.8126 Ops/s $\color{#35bf28}+0.21\%$
test_ppo_speed[True-None] 1.0174ms 0.9542ms 1.0480 KOps/s 1.0457 KOps/s $\color{#35bf28}+0.22\%$
test_ppo_speed[True-backward] 2.9954ms 2.5240ms 396.1944 Ops/s 365.4184 Ops/s $\textbf{\color{#35bf28}+8.42\%}$
test_ppo_speed[reduce-overhead-None] 0.5858ms 0.5104ms 1.9593 KOps/s 1.9136 KOps/s $\color{#35bf28}+2.39\%$
test_ppo_speed[reduce-overhead-backward] 1.0990ms 1.0434ms 958.4100 Ops/s 885.0085 Ops/s $\textbf{\color{#35bf28}+8.29\%}$
test_reinforce_speed[False-None] 2.3313ms 2.2432ms 445.7980 Ops/s 441.2413 Ops/s $\color{#35bf28}+1.03\%$
test_reinforce_speed[False-backward] 3.4324ms 3.3676ms 296.9502 Ops/s 296.1399 Ops/s $\color{#35bf28}+0.27\%$
test_reinforce_speed[True-None] 0.9222ms 0.8233ms 1.2146 KOps/s 1.1895 KOps/s $\color{#35bf28}+2.11\%$
test_reinforce_speed[True-backward] 2.6345ms 2.5830ms 387.1444 Ops/s 387.2012 Ops/s $\color{#d91a1a}-0.01\%$
test_reinforce_speed[reduce-overhead-None] 22.0449ms 11.7086ms 85.4072 Ops/s 87.2810 Ops/s $\color{#d91a1a}-2.15\%$
test_reinforce_speed[reduce-overhead-backward] 1.1583ms 1.1193ms 893.4526 Ops/s 838.4192 Ops/s $\textbf{\color{#35bf28}+6.56\%}$
test_iql_speed[False-None] 9.6620ms 9.1519ms 109.2671 Ops/s 109.6785 Ops/s $\color{#d91a1a}-0.38\%$
test_iql_speed[False-backward] 13.4540ms 12.8679ms 77.7126 Ops/s 76.8612 Ops/s $\color{#35bf28}+1.11\%$
test_iql_speed[True-None] 1.9214ms 1.7662ms 566.1917 Ops/s 568.4901 Ops/s $\color{#d91a1a}-0.40\%$
test_iql_speed[True-backward] 4.5348ms 4.4229ms 226.0965 Ops/s 233.0033 Ops/s $\color{#d91a1a}-2.96\%$
test_iql_speed[reduce-overhead-None] 21.2208ms 11.7449ms 85.1431 Ops/s 89.5649 Ops/s $\color{#d91a1a}-4.94\%$
test_iql_speed[reduce-overhead-backward] 1.4574ms 1.4105ms 708.9918 Ops/s 706.5880 Ops/s $\color{#35bf28}+0.34\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9179ms 6.4624ms 154.7405 Ops/s 153.5406 Ops/s $\color{#35bf28}+0.78\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5439ms 0.3632ms 2.7536 KOps/s 3.5297 KOps/s $\textbf{\color{#d91a1a}-21.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5195ms 0.2787ms 3.5879 KOps/s 2.9559 KOps/s $\textbf{\color{#35bf28}+21.38\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4625ms 6.2047ms 161.1691 Ops/s 160.8836 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9492ms 0.2616ms 3.8224 KOps/s 2.8262 KOps/s $\textbf{\color{#35bf28}+35.25\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4446ms 0.2414ms 4.1424 KOps/s 3.3849 KOps/s $\textbf{\color{#35bf28}+22.38\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4216ms 1.2352ms 809.6011 Ops/s 816.6799 Ops/s $\color{#d91a1a}-0.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5233ms 1.3129ms 761.7012 Ops/s 850.8265 Ops/s $\textbf{\color{#d91a1a}-10.48\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5712ms 6.3812ms 156.7106 Ops/s 157.4974 Ops/s $\color{#d91a1a}-0.50\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0539ms 0.4391ms 2.2776 KOps/s 2.3887 KOps/s $\color{#d91a1a}-4.65\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7318ms 0.4462ms 2.2411 KOps/s 2.5234 KOps/s $\textbf{\color{#d91a1a}-11.19\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3591ms 6.2668ms 159.5708 Ops/s 160.6329 Ops/s $\color{#d91a1a}-0.66\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6449ms 0.3432ms 2.9140 KOps/s 2.7861 KOps/s $\color{#35bf28}+4.59\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5517ms 0.3008ms 3.3242 KOps/s 2.9426 KOps/s $\textbf{\color{#35bf28}+12.97\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5941ms 6.1967ms 161.3761 Ops/s 161.2357 Ops/s $\color{#35bf28}+0.09\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6408ms 0.3026ms 3.3046 KOps/s 3.8171 KOps/s $\textbf{\color{#d91a1a}-13.43\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5399ms 0.3171ms 3.1540 KOps/s 3.9887 KOps/s $\textbf{\color{#d91a1a}-20.93\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5462ms 6.3403ms 157.7216 Ops/s 157.5669 Ops/s $\color{#35bf28}+0.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8846ms 0.4892ms 2.0441 KOps/s 2.2332 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6896ms 0.4844ms 2.0646 KOps/s 2.5561 KOps/s $\textbf{\color{#d91a1a}-19.23\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9541ms 5.2795ms 189.4107 Ops/s 188.4397 Ops/s $\color{#35bf28}+0.52\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.5516ms 2.0494ms 487.9450 Ops/s 400.0189 Ops/s $\textbf{\color{#35bf28}+21.98\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.2697ms 1.2196ms 819.9548 Ops/s 775.9718 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1167ms 5.3126ms 188.2323 Ops/s 182.8404 Ops/s $\color{#35bf28}+2.95\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.2713ms 2.0464ms 488.6567 Ops/s 442.4940 Ops/s $\textbf{\color{#35bf28}+10.43\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.9248ms 1.1706ms 854.2307 Ops/s 803.7654 Ops/s $\textbf{\color{#35bf28}+6.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5045s 15.4681ms 64.6492 Ops/s 32.2732 Ops/s $\textbf{\color{#35bf28}+100.32\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.5647ms 2.3360ms 428.0826 Ops/s 439.1562 Ops/s $\color{#d91a1a}-2.52\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.5337ms 1.2832ms 779.2875 Ops/s 703.5832 Ops/s $\textbf{\color{#35bf28}+10.76\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.7396ms 13.6288ms 73.3738 Ops/s 74.1317 Ops/s $\color{#d91a1a}-1.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9107ms 17.4554ms 57.2889 Ops/s 56.9594 Ops/s $\color{#35bf28}+0.58\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 19.7622ms 17.9124ms 55.8272 Ops/s 53.8997 Ops/s $\color{#35bf28}+3.58\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.1944ms 17.6951ms 56.5129 Ops/s 56.5003 Ops/s $\color{#35bf28}+0.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.9445ms 17.5769ms 56.8928 Ops/s 52.9288 Ops/s $\textbf{\color{#35bf28}+7.49\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.8527ms 18.9890ms 52.6621 Ops/s 51.0833 Ops/s $\color{#35bf28}+3.09\%$

[ghstack-poisoned]
@vmoens vmoens merged commit 4c80b85 into gh/vmoens/54/base Dec 16, 2024
50 of 59 checks passed
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 19165bbfbea5cdc0a6b159493fb02571bab872f3
Pull Request resolved: #2653
@vmoens vmoens deleted the gh/vmoens/54/head branch December 16, 2024 01:22