[Feature] DT compatibility with compile #2556

Merged on Dec 14, 2024 (52 commits)


[ghstack-poisoned]

pytorch-bot bot commented Nov 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2556

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (19 Unrelated Failures)

As of commit c628d6f with merge base e2be42e:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Nov 12, 2024
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 373547e3913f56d750d6a5a0830c320b6d826f1e
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: c990285cc34f1c5c4bb4001e0ba8bf305ffc8b47
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 36a197082f4f62873b2911dc5538d70b384d75f7
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: ab975d9dec0497a46ed023ef6f6fe00d376422e4
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 864098bd2b1a8bd01e258c811535b08432254c2c
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: f3f6fb626ba887f99c230e46d543228a83162a6d
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: e5b599dede5243d774d75d86beeb5b68cdd489c6
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 54927caa63acf2adf2a71683597d76020831af7c
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: e10da2481fe53dc3e03b8f60150a1bef2d0c2b4e
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: c20aeba4a73691a824438197bc68f519d95eaba0
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 1f056fc8ff3ba08a3bfb9222a56567afe8b43794
Pull Request resolved: #2556
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 15, 2024
ghstack-source-id: 24372e26606a06a8144d7ef3c7f7252cd6194721
Pull Request resolved: #2556
[ghstack-poisoned]

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}4$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4287s 0.4270s 2.3421 Ops/s 2.2511 Ops/s $\color{#35bf28}+4.04\%$
test_transformed 0.5999s 0.5984s 1.6712 Ops/s 1.6197 Ops/s $\color{#35bf28}+3.18\%$
test_serial 1.3551s 1.3431s 0.7445 Ops/s 0.7266 Ops/s $\color{#35bf28}+2.47\%$
test_parallel 1.2895s 1.2786s 0.7821 Ops/s 0.7694 Ops/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[True-True-True-True-True] 0.1697ms 29.7709μs 33.5898 KOps/s 34.4097 KOps/s $\color{#d91a1a}-2.38\%$
test_step_mdp_speed[True-True-True-True-False] 63.4150μs 17.6534μs 56.6464 KOps/s 56.7890 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-True-True-False-True] 80.6610μs 16.9575μs 58.9710 KOps/s 60.2414 KOps/s $\color{#d91a1a}-2.11\%$
test_step_mdp_speed[True-True-True-False-False] 51.6660μs 9.9323μs 100.6816 KOps/s 100.7577 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-True-False-True-True] 73.2370μs 31.6500μs 31.5956 KOps/s 31.5425 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[True-True-False-True-False] 62.6880μs 19.3600μs 51.6529 KOps/s 51.5061 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[True-True-False-False-True] 63.4090μs 18.4130μs 54.3094 KOps/s 54.5568 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[True-True-False-False-False] 39.1030μs 11.5967μs 86.2313 KOps/s 86.3738 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-False-True-True-True] 98.6040μs 33.7154μs 29.6600 KOps/s 30.4071 KOps/s $\color{#d91a1a}-2.46\%$
test_step_mdp_speed[True-False-True-True-False] 60.5430μs 21.1307μs 47.3245 KOps/s 47.2708 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[True-False-True-False-True] 82.7350μs 18.5991μs 53.7661 KOps/s 54.8027 KOps/s $\color{#d91a1a}-1.89\%$
test_step_mdp_speed[True-False-True-False-False] 37.8010μs 11.7172μs 85.3447 KOps/s 86.2278 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[True-False-False-True-True] 0.1089ms 35.1563μs 28.4444 KOps/s 28.2556 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-False-True-False] 97.6120μs 22.8290μs 43.8040 KOps/s 43.8966 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-False-False-False-True] 59.0700μs 20.0617μs 49.8462 KOps/s 50.2537 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[True-False-False-False-False] 60.4030μs 13.3637μs 74.8295 KOps/s 76.1469 KOps/s $\color{#d91a1a}-1.73\%$
test_step_mdp_speed[False-True-True-True-True] 66.0640μs 33.2775μs 30.0503 KOps/s 30.2205 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[False-True-True-True-False] 71.8250μs 21.1384μs 47.3072 KOps/s 46.7812 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[False-True-True-False-True] 56.7960μs 21.0991μs 47.3955 KOps/s 47.9588 KOps/s $\color{#d91a1a}-1.17\%$
test_step_mdp_speed[False-True-True-False-False] 64.5410μs 12.9061μs 77.4830 KOps/s 77.1442 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-True-False-True-True] 75.4300μs 34.8524μs 28.6924 KOps/s 28.4155 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[False-True-False-True-False] 70.2810μs 22.6804μs 44.0909 KOps/s 43.7975 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[False-True-False-False-True] 2.6372ms 22.8534μs 43.7572 KOps/s 44.3809 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[False-True-False-False-False] 54.1610μs 14.6912μs 68.0679 KOps/s 68.5289 KOps/s $\color{#d91a1a}-0.67\%$
test_step_mdp_speed[False-False-True-True-True] 0.1227ms 36.9529μs 27.0615 KOps/s 27.0393 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[False-False-True-True-False] 63.6590μs 24.8089μs 40.3082 KOps/s 40.6346 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-False-True-False-True] 72.8760μs 22.9447μs 43.5831 KOps/s 44.6674 KOps/s $\color{#d91a1a}-2.43\%$
test_step_mdp_speed[False-False-True-False-False] 50.1440μs 14.6350μs 68.3295 KOps/s 68.7784 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-False-False-True-True] 94.2860μs 38.0606μs 26.2739 KOps/s 26.4324 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-False-False-True-False] 73.6380μs 26.3209μs 37.9927 KOps/s 38.5819 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-False-False-False-True] 53.2990μs 24.1374μs 41.4296 KOps/s 41.6292 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-False-False-False-False] 64.4700μs 16.0893μs 62.1531 KOps/s 62.4344 KOps/s $\color{#d91a1a}-0.45\%$
test_values[generalized_advantage_estimate-True-True] 9.6969ms 9.3819ms 106.5884 Ops/s 102.5639 Ops/s $\color{#35bf28}+3.92\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.4310ms 33.7109ms 29.6640 Ops/s 30.1320 Ops/s $\color{#d91a1a}-1.55\%$
test_values[td0_return_estimate-False-False] 0.2288ms 0.1762ms 5.6765 KOps/s 5.5367 KOps/s $\color{#35bf28}+2.52\%$
test_values[td1_return_estimate-False-False] 27.6806ms 23.7150ms 42.1674 Ops/s 42.0162 Ops/s $\color{#35bf28}+0.36\%$
test_values[vec_td1_return_estimate-False-False] 34.5607ms 33.3786ms 29.9594 Ops/s 29.9170 Ops/s $\color{#35bf28}+0.14\%$
test_values[td_lambda_return_estimate-True-False] 34.9219ms 34.2633ms 29.1858 Ops/s 28.8751 Ops/s $\color{#35bf28}+1.08\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.5549ms 33.6011ms 29.7609 Ops/s 29.7800 Ops/s $\color{#d91a1a}-0.06\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.6382ms 8.2713ms 120.9001 Ops/s 120.3950 Ops/s $\color{#35bf28}+0.42\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5492ms 1.9948ms 501.2919 Ops/s 483.9513 Ops/s $\color{#35bf28}+3.58\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5165ms 0.3605ms 2.7742 KOps/s 2.7746 KOps/s $\color{#d91a1a}-0.01\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.1016ms 43.5607ms 22.9565 Ops/s 24.3277 Ops/s $\textbf{\color{#d91a1a}-5.64\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9408ms 3.0375ms 329.2231 Ops/s 329.6454 Ops/s $\color{#d91a1a}-0.13\%$
test_dqn_speed[False-None] 5.4914ms 1.3963ms 716.1710 Ops/s 722.4351 Ops/s $\color{#d91a1a}-0.87\%$
test_dqn_speed[False-backward] 1.9439ms 1.8770ms 532.7782 Ops/s 534.6496 Ops/s $\color{#d91a1a}-0.35\%$
test_dqn_speed[True-None] 0.6252ms 0.4650ms 2.1505 KOps/s 2.1279 KOps/s $\color{#35bf28}+1.07\%$
test_dqn_speed[True-backward] 0.9607ms 0.8843ms 1.1309 KOps/s 1.1268 KOps/s $\color{#35bf28}+0.36\%$
test_dqn_speed[reduce-overhead-None] 0.8343ms 0.4797ms 2.0845 KOps/s 2.1483 KOps/s $\color{#d91a1a}-2.97\%$
test_dqn_speed[reduce-overhead-backward] 0.9426ms 0.8837ms 1.1315 KOps/s 1.1258 KOps/s $\color{#35bf28}+0.51\%$
test_ddpg_speed[False-None] 3.8219ms 2.8823ms 346.9475 Ops/s 351.1196 Ops/s $\color{#d91a1a}-1.19\%$
test_ddpg_speed[False-backward] 4.0781ms 3.9780ms 251.3847 Ops/s 251.2296 Ops/s $\color{#35bf28}+0.06\%$
test_ddpg_speed[True-None] 1.1927ms 0.9944ms 1.0056 KOps/s 978.4321 Ops/s $\color{#35bf28}+2.78\%$
test_ddpg_speed[True-backward] 1.9335ms 1.8856ms 530.3382 Ops/s 527.4531 Ops/s $\color{#35bf28}+0.55\%$
test_ddpg_speed[reduce-overhead-None] 1.2645ms 0.9942ms 1.0058 KOps/s 956.3748 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_ddpg_speed[reduce-overhead-backward] 1.9387ms 1.8802ms 531.8716 Ops/s 528.3831 Ops/s $\color{#35bf28}+0.66\%$
test_sac_speed[False-None] 10.2163ms 8.0638ms 124.0113 Ops/s 124.8896 Ops/s $\color{#d91a1a}-0.70\%$
test_sac_speed[False-backward] 11.9818ms 10.8132ms 92.4792 Ops/s 93.0899 Ops/s $\color{#d91a1a}-0.66\%$
test_sac_speed[True-None] 2.2701ms 1.8232ms 548.4966 Ops/s 545.4509 Ops/s $\color{#35bf28}+0.56\%$
test_sac_speed[True-backward] 3.5776ms 3.5333ms 283.0196 Ops/s 284.3497 Ops/s $\color{#d91a1a}-0.47\%$
test_sac_speed[reduce-overhead-None] 2.3610ms 1.8368ms 544.4131 Ops/s 545.7962 Ops/s $\color{#d91a1a}-0.25\%$
test_sac_speed[reduce-overhead-backward] 3.6732ms 3.5354ms 282.8529 Ops/s 285.2942 Ops/s $\color{#d91a1a}-0.86\%$
test_redq_speed[False-None] 15.9162ms 12.8090ms 78.0703 Ops/s 78.0657 Ops/s $+0.01\%$
test_redq_speed[False-backward] 23.5577ms 22.1119ms 45.2245 Ops/s 45.0038 Ops/s $\color{#35bf28}+0.49\%$
test_redq_speed[True-None] 5.2391ms 4.4852ms 222.9567 Ops/s 223.9597 Ops/s $\color{#d91a1a}-0.45\%$
test_redq_speed[True-backward] 13.2120ms 11.8430ms 84.4383 Ops/s 85.5408 Ops/s $\color{#d91a1a}-1.29\%$
test_redq_speed[reduce-overhead-None] 5.3869ms 4.4905ms 222.6900 Ops/s 224.4701 Ops/s $\color{#d91a1a}-0.79\%$
test_redq_speed[reduce-overhead-backward] 12.3072ms 11.8718ms 84.2330 Ops/s 84.7150 Ops/s $\color{#d91a1a}-0.57\%$
test_redq_deprec_speed[False-None] 14.5893ms 12.7606ms 78.3660 Ops/s 74.6033 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_redq_deprec_speed[False-backward] 19.2467ms 18.5401ms 53.9372 Ops/s 54.4333 Ops/s $\color{#d91a1a}-0.91\%$
test_redq_deprec_speed[True-None] 4.2325ms 3.5655ms 280.4643 Ops/s 281.2236 Ops/s $\color{#d91a1a}-0.27\%$
test_redq_deprec_speed[True-backward] 8.0249ms 7.9041ms 126.5173 Ops/s 125.6162 Ops/s $\color{#35bf28}+0.72\%$
test_redq_deprec_speed[reduce-overhead-None] 4.2600ms 3.5554ms 281.2641 Ops/s 282.3877 Ops/s $\color{#d91a1a}-0.40\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.9400ms 7.9966ms 125.0529 Ops/s 125.7077 Ops/s $\color{#d91a1a}-0.52\%$
test_td3_speed[False-None] 8.5402ms 7.9330ms 126.0560 Ops/s 124.5816 Ops/s $\color{#35bf28}+1.18\%$
test_td3_speed[False-backward] 10.8815ms 10.3888ms 96.2575 Ops/s 96.0067 Ops/s $\color{#35bf28}+0.26\%$
test_td3_speed[True-None] 1.8367ms 1.6976ms 589.0739 Ops/s 578.4113 Ops/s $\color{#35bf28}+1.84\%$
test_td3_speed[True-backward] 3.3620ms 3.3057ms 302.5101 Ops/s 294.3963 Ops/s $\color{#35bf28}+2.76\%$
test_td3_speed[reduce-overhead-None] 1.9732ms 1.7142ms 583.3541 Ops/s 576.5160 Ops/s $\color{#35bf28}+1.19\%$
test_td3_speed[reduce-overhead-backward] 3.4736ms 3.3065ms 302.4375 Ops/s 301.9400 Ops/s $\color{#35bf28}+0.16\%$
test_cql_speed[False-None] 38.1713ms 36.3209ms 27.5324 Ops/s 27.1925 Ops/s $\color{#35bf28}+1.25\%$
test_cql_speed[False-backward] 51.8595ms 47.1037ms 21.2297 Ops/s 21.5278 Ops/s $\color{#d91a1a}-1.38\%$
test_cql_speed[True-None] 16.2862ms 15.3825ms 65.0088 Ops/s 63.0558 Ops/s $\color{#35bf28}+3.10\%$
test_cql_speed[True-backward] 22.9836ms 21.9605ms 45.5363 Ops/s 45.1037 Ops/s $\color{#35bf28}+0.96\%$
test_cql_speed[reduce-overhead-None] 16.4846ms 15.3791ms 65.0233 Ops/s 64.9079 Ops/s $\color{#35bf28}+0.18\%$
test_cql_speed[reduce-overhead-backward] 22.7211ms 21.9212ms 45.6180 Ops/s 45.0987 Ops/s $\color{#35bf28}+1.15\%$
test_a2c_speed[False-None] 9.2538ms 7.1123ms 140.6021 Ops/s 137.8161 Ops/s $\color{#35bf28}+2.02\%$
test_a2c_speed[False-backward] 15.4909ms 14.1258ms 70.7924 Ops/s 70.3583 Ops/s $\color{#35bf28}+0.62\%$
test_a2c_speed[True-None] 4.8044ms 4.1526ms 240.8151 Ops/s 234.4472 Ops/s $\color{#35bf28}+2.72\%$
test_a2c_speed[True-backward] 10.9218ms 10.5524ms 94.7656 Ops/s 93.1168 Ops/s $\color{#35bf28}+1.77\%$
test_a2c_speed[reduce-overhead-None] 4.6887ms 4.1806ms 239.2028 Ops/s 237.9475 Ops/s $\color{#35bf28}+0.53\%$
test_a2c_speed[reduce-overhead-backward] 11.0135ms 10.5477ms 94.8075 Ops/s 93.0286 Ops/s $\color{#35bf28}+1.91\%$
test_ppo_speed[False-None] 8.7852ms 7.3880ms 135.3547 Ops/s 133.9662 Ops/s $\color{#35bf28}+1.04\%$
test_ppo_speed[False-backward] 15.6383ms 14.7749ms 67.6823 Ops/s 68.1605 Ops/s $\color{#d91a1a}-0.70\%$
test_ppo_speed[True-None] 4.5088ms 3.6696ms 272.5077 Ops/s 271.1783 Ops/s $\color{#35bf28}+0.49\%$
test_ppo_speed[True-backward] 9.8237ms 9.4768ms 105.5213 Ops/s 104.5773 Ops/s $\color{#35bf28}+0.90\%$
test_ppo_speed[reduce-overhead-None] 4.7837ms 3.6549ms 273.6088 Ops/s 269.4567 Ops/s $\color{#35bf28}+1.54\%$
test_ppo_speed[reduce-overhead-backward] 9.9708ms 9.5139ms 105.1093 Ops/s 104.7594 Ops/s $\color{#35bf28}+0.33\%$
test_reinforce_speed[False-None] 7.5355ms 6.4546ms 154.9291 Ops/s 153.5276 Ops/s $\color{#35bf28}+0.91\%$
test_reinforce_speed[False-backward] 10.4725ms 9.7552ms 102.5097 Ops/s 101.4007 Ops/s $\color{#35bf28}+1.09\%$
test_reinforce_speed[True-None] 3.0947ms 2.6104ms 383.0782 Ops/s 380.1345 Ops/s $\color{#35bf28}+0.77\%$
test_reinforce_speed[True-backward] 8.8884ms 8.4943ms 117.7261 Ops/s 116.5993 Ops/s $\color{#35bf28}+0.97\%$
test_reinforce_speed[reduce-overhead-None] 2.9678ms 2.6019ms 384.3377 Ops/s 378.1480 Ops/s $\color{#35bf28}+1.64\%$
test_reinforce_speed[reduce-overhead-backward] 9.4587ms 8.5270ms 117.2751 Ops/s 116.8241 Ops/s $\color{#35bf28}+0.39\%$
test_iql_speed[False-None] 33.1754ms 31.4326ms 31.8141 Ops/s 31.0523 Ops/s $\color{#35bf28}+2.45\%$
test_iql_speed[False-backward] 46.3469ms 44.4739ms 22.4851 Ops/s 22.0880 Ops/s $\color{#35bf28}+1.80\%$
test_iql_speed[True-None] 11.1077ms 10.4157ms 96.0088 Ops/s 94.7109 Ops/s $\color{#35bf28}+1.37\%$
test_iql_speed[True-backward] 22.4492ms 21.2244ms 47.1156 Ops/s 46.6123 Ops/s $\color{#35bf28}+1.08\%$
test_iql_speed[reduce-overhead-None] 11.0821ms 10.4070ms 96.0894 Ops/s 94.6422 Ops/s $\color{#35bf28}+1.53\%$
test_iql_speed[reduce-overhead-backward] 22.6054ms 21.5450ms 46.4146 Ops/s 45.6042 Ops/s $\color{#35bf28}+1.78\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.5135ms 5.0310ms 198.7664 Ops/s 198.8778 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8886ms 0.5133ms 1.9480 KOps/s 1.9106 KOps/s $\color{#35bf28}+1.96\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6979ms 0.4849ms 2.0624 KOps/s 2.0648 KOps/s $\color{#d91a1a}-0.11\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0736ms 4.6518ms 214.9698 Ops/s 205.6934 Ops/s $\color{#35bf28}+4.51\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.3284s 0.7366ms 1.3576 KOps/s 1.9940 KOps/s $\textbf{\color{#d91a1a}-31.92\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8374ms 0.4712ms 2.1223 KOps/s 2.0980 KOps/s $\color{#35bf28}+1.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.1503ms 1.6120ms 620.3491 Ops/s 609.4323 Ops/s $\color{#35bf28}+1.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3068ms 1.5678ms 637.8302 Ops/s 626.5476 Ops/s $\color{#35bf28}+1.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0920ms 4.7744ms 209.4493 Ops/s 203.0226 Ops/s $\color{#35bf28}+3.17\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3967ms 0.6295ms 1.5886 KOps/s 1.4881 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0181ms 0.6129ms 1.6315 KOps/s 1.5938 KOps/s $\color{#35bf28}+2.37\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3288ms 4.6973ms 212.8899 Ops/s 204.3797 Ops/s $\color{#35bf28}+4.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0750ms 0.5123ms 1.9521 KOps/s 1.9334 KOps/s $\color{#35bf28}+0.97\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6932ms 0.4824ms 2.0729 KOps/s 2.0208 KOps/s $\color{#35bf28}+2.57\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1510ms 4.6163ms 216.6260 Ops/s 211.7490 Ops/s $\color{#35bf28}+2.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2650ms 0.4940ms 2.0242 KOps/s 1.8796 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8420ms 0.4734ms 2.1123 KOps/s 2.0914 KOps/s $\color{#35bf28}+1.00\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1253ms 4.7819ms 209.1206 Ops/s 200.3617 Ops/s $\color{#35bf28}+4.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1271ms 0.6414ms 1.5592 KOps/s 1.5316 KOps/s $\color{#35bf28}+1.80\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9906ms 0.6182ms 1.6175 KOps/s 1.6167 KOps/s $\color{#35bf28}+0.05\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4192s 12.4952ms 80.0310 Ops/s 237.2895 Ops/s $\textbf{\color{#d91a1a}-66.27\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.9619ms 2.2599ms 442.4885 Ops/s 426.7093 Ops/s $\color{#35bf28}+3.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.8862ms 1.3886ms 720.1251 Ops/s 658.7618 Ops/s $\textbf{\color{#35bf28}+9.31\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.1829ms 4.1875ms 238.8060 Ops/s 227.6508 Ops/s $\color{#35bf28}+4.90\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.4126ms 2.2705ms 440.4394 Ops/s 431.9455 Ops/s $\color{#35bf28}+1.97\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.4817ms 1.2781ms 782.4107 Ops/s 796.1071 Ops/s $\color{#d91a1a}-1.72\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3626s 11.5199ms 86.8063 Ops/s 244.0416 Ops/s $\textbf{\color{#d91a1a}-64.43\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 13.0131ms 2.4987ms 400.2045 Ops/s 408.0192 Ops/s $\color{#d91a1a}-1.92\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.1233ms 1.4259ms 701.2961 Ops/s 571.7916 Ops/s $\textbf{\color{#35bf28}+22.65\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.9789ms 10.9116ms 91.6454 Ops/s 84.9678 Ops/s $\textbf{\color{#35bf28}+7.86\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.5762ms 14.7241ms 67.9158 Ops/s 65.9149 Ops/s $\color{#35bf28}+3.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6373ms 19.5332ms 51.1949 Ops/s 48.6721 Ops/s $\textbf{\color{#35bf28}+5.18\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.8695ms 15.0546ms 66.4249 Ops/s 61.1385 Ops/s $\textbf{\color{#35bf28}+8.65\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.8552ms 19.6956ms 50.7728 Ops/s 48.8920 Ops/s $\color{#35bf28}+3.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.0201ms 16.3649ms 61.1064 Ops/s 58.9180 Ops/s $\color{#35bf28}+3.71\%$
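
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes a few rows under that assumption. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical and not part of the benchmark tooling, and the 5% cutoff is an assumption inferred from which rows are bolded in the CPU table (9 rows at or above +5%, 4 at or below -5%, matching the Improved and Worsened counts).

```python
def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.0056 KOps/s' or '978.4321 Ops/s' to ops per second."""
    value, unit = cell.split()
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(value) * scale


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# A few rows copied from the CPU table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_simple": ("2.3421 Ops/s", "2.2511 Ops/s"),
    "test_ddpg_speed[True-None]": ("1.0056 KOps/s", "978.4321 Ops/s"),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (
        "80.0310 Ops/s",
        "237.2895 Ops/s",
    ),
}

for name, (ops_cell, head_cell) in rows.items():
    change = relative_change(parse_ops(ops_cell), parse_ops(head_cell))
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table reports rounded throughput values, recomputed percentages can differ from the reported ones in the last digit. Note that rows mixing units, such as `test_ddpg_speed[True-None]`, need to be normalized before comparison.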


github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7491s 0.7471s 1.3384 Ops/s 1.3213 Ops/s $\color{#35bf28}+1.30\%$
test_transformed 1.0057s 1.0042s 0.9958 Ops/s 0.9858 Ops/s $\color{#35bf28}+1.01\%$
test_serial 2.1616s 2.1566s 0.4637 Ops/s 0.4606 Ops/s $\color{#35bf28}+0.67\%$
test_parallel 1.9713s 1.9579s 0.5108 Ops/s 0.5129 Ops/s $\color{#d91a1a}-0.41\%$
test_step_mdp_speed[True-True-True-True-True] 0.1448ms 40.5911μs 24.6359 KOps/s 24.4419 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[True-True-True-True-False] 0.1373ms 22.9996μs 43.4789 KOps/s 42.7232 KOps/s $\color{#35bf28}+1.77\%$
test_step_mdp_speed[True-True-True-False-True] 51.2210μs 21.9728μs 45.5107 KOps/s 45.0901 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[True-True-True-False-False] 42.7700μs 13.0746μs 76.4839 KOps/s 75.5346 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[True-True-False-True-True] 73.1610μs 42.8316μs 23.3473 KOps/s 22.7935 KOps/s $\color{#35bf28}+2.43\%$
test_step_mdp_speed[True-True-False-True-False] 67.1710μs 25.1253μs 39.8006 KOps/s 38.4985 KOps/s $\color{#35bf28}+3.38\%$
test_step_mdp_speed[True-True-False-False-True] 53.8010μs 24.3408μs 41.0833 KOps/s 40.4539 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[True-True-False-False-False] 97.8020μs 15.2238μs 65.6867 KOps/s 64.3955 KOps/s $\color{#35bf28}+2.01\%$
test_step_mdp_speed[True-False-True-True-True] 83.3410μs 45.1014μs 22.1722 KOps/s 22.1946 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-False-True-True-False] 0.1276ms 27.6298μs 36.1929 KOps/s 35.4327 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-False-True-False-True] 65.8610μs 24.8358μs 40.2644 KOps/s 39.7074 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-False-True-False-False] 45.6210μs 15.1831μs 65.8628 KOps/s 64.2474 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[True-False-False-True-True] 85.0910μs 48.1494μs 20.7687 KOps/s 21.1506 KOps/s $\color{#d91a1a}-1.81\%$
test_step_mdp_speed[True-False-False-True-False] 60.3810μs 29.6446μs 33.7330 KOps/s 33.1921 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-False-False-True] 66.3010μs 26.6277μs 37.5549 KOps/s 37.1707 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[True-False-False-False-False] 58.5510μs 17.0526μs 58.6422 KOps/s 57.1307 KOps/s $\color{#35bf28}+2.65\%$
test_step_mdp_speed[False-True-True-True-True] 82.0510μs 45.0772μs 22.1841 KOps/s 22.1127 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[False-True-True-True-False] 58.4620μs 27.8553μs 35.8998 KOps/s 35.8946 KOps/s $\color{#35bf28}+0.01\%$
test_step_mdp_speed[False-True-True-False-True] 67.2610μs 28.6395μs 34.9168 KOps/s 34.8217 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-True-True-False-False] 48.5610μs 17.0585μs 58.6219 KOps/s 57.3333 KOps/s $\color{#35bf28}+2.25\%$
test_step_mdp_speed[False-True-False-True-True] 0.1023ms 47.5326μs 21.0382 KOps/s 20.9150 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-True-False-True-False] 56.4410μs 29.5853μs 33.8005 KOps/s 32.8258 KOps/s $\color{#35bf28}+2.97\%$
test_step_mdp_speed[False-True-False-False-True] 3.1310ms 30.7082μs 32.5646 KOps/s 32.4733 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-True-False-False-False] 48.3810μs 19.1847μs 52.1248 KOps/s 51.5125 KOps/s $\color{#35bf28}+1.19\%$
test_step_mdp_speed[False-False-True-True-True] 0.2338ms 49.4318μs 20.2299 KOps/s 20.0761 KOps/s $\color{#35bf28}+0.77\%$
test_step_mdp_speed[False-False-True-True-False] 0.2181ms 31.7269μs 31.5190 KOps/s 30.9591 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[False-False-True-False-True] 70.6410μs 30.4979μs 32.7891 KOps/s 33.1065 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[False-False-True-False-False] 54.4210μs 19.0958μs 52.3677 KOps/s 51.2894 KOps/s $\color{#35bf28}+2.10\%$
test_step_mdp_speed[False-False-False-True-True] 79.1020μs 51.0012μs 19.6074 KOps/s 19.5973 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-False-True-False] 61.0220μs 33.8934μs 29.5043 KOps/s 28.9118 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[False-False-False-False-True] 61.1210μs 32.5589μs 30.7136 KOps/s 31.1817 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-False-False-False-False] 51.1110μs 21.0568μs 47.4906 KOps/s 47.4115 KOps/s $\color{#35bf28}+0.17\%$
test_values[generalized_advantage_estimate-True-True] 26.0130ms 25.2240ms 39.6447 Ops/s 40.0297 Ops/s $\color{#d91a1a}-0.96\%$
test_values[vec_generalized_advantage_estimate-True-True] 99.5686ms 2.8861ms 346.4881 Ops/s 309.2673 Ops/s $\textbf{\color{#35bf28}+12.04\%}$
test_values[td0_return_estimate-False-False] 0.1080ms 81.3974μs 12.2854 KOps/s 12.3734 KOps/s $\color{#d91a1a}-0.71\%$
test_values[td1_return_estimate-False-False] 56.3510ms 55.6857ms 17.9579 Ops/s 18.0751 Ops/s $\color{#d91a1a}-0.65\%$
test_values[vec_td1_return_estimate-False-False] 1.3820ms 1.0859ms 920.8868 Ops/s 918.3526 Ops/s $\color{#35bf28}+0.28\%$
test_values[td_lambda_return_estimate-True-False] 89.4735ms 88.4601ms 11.3045 Ops/s 11.3790 Ops/s $\color{#d91a1a}-0.65\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4393ms 1.0883ms 918.8433 Ops/s 920.4003 Ops/s $\color{#d91a1a}-0.17\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.2306ms 24.8578ms 40.2288 Ops/s 37.9301 Ops/s $\textbf{\color{#35bf28}+6.06\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0660ms 0.7611ms 1.3139 KOps/s 1.3240 KOps/s $\color{#d91a1a}-0.76\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8261ms 0.6752ms 1.4810 KOps/s 1.4816 KOps/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6730ms 1.4860ms 672.9597 Ops/s 674.8272 Ops/s $\color{#d91a1a}-0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8896ms 0.6877ms 1.4542 KOps/s 1.4574 KOps/s $\color{#d91a1a}-0.22\%$
test_dqn_speed[False-None] 8.2631ms 1.5418ms 648.6056 Ops/s 658.5992 Ops/s $\color{#d91a1a}-1.52\%$
test_dqn_speed[False-backward] 2.3036ms 2.1532ms 464.4161 Ops/s 467.4378 Ops/s $\color{#d91a1a}-0.65\%$
test_dqn_speed[True-None] 0.6929ms 0.5302ms 1.8861 KOps/s 1.8555 KOps/s $\color{#35bf28}+1.65\%$
test_dqn_speed[True-backward] 1.3516ms 1.1995ms 833.6804 Ops/s 824.8029 Ops/s $\color{#35bf28}+1.08\%$
test_dqn_speed[reduce-overhead-None] 0.7134ms 0.5478ms 1.8253 KOps/s 1.7372 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_dqn_speed[reduce-overhead-backward] 1.2179ms 1.0634ms 940.3555 Ops/s 927.4808 Ops/s $\color{#35bf28}+1.39\%$
test_ddpg_speed[False-None] 3.3063ms 2.8758ms 347.7278 Ops/s 342.2360 Ops/s $\color{#35bf28}+1.60\%$
test_ddpg_speed[False-backward] 4.5792ms 4.2941ms 232.8786 Ops/s 233.8561 Ops/s $\color{#d91a1a}-0.42\%$
test_ddpg_speed[True-None] 1.2880ms 1.0614ms 942.1370 Ops/s 929.9782 Ops/s $\color{#35bf28}+1.31\%$
test_ddpg_speed[True-backward] 2.4080ms 2.2679ms 440.9433 Ops/s 433.7212 Ops/s $\color{#35bf28}+1.67\%$
test_ddpg_speed[reduce-overhead-None] 1.2427ms 1.0770ms 928.4734 Ops/s 885.5034 Ops/s $\color{#35bf28}+4.85\%$
test_ddpg_speed[reduce-overhead-backward] 1.9187ms 1.7589ms 568.5294 Ops/s 556.4368 Ops/s $\color{#35bf28}+2.17\%$
test_sac_speed[False-None] 8.5120ms 8.1216ms 123.1284 Ops/s 121.7445 Ops/s $\color{#35bf28}+1.14\%$
test_sac_speed[False-backward] 11.9759ms 11.5187ms 86.8154 Ops/s 87.5523 Ops/s $\color{#d91a1a}-0.84\%$
test_sac_speed[True-None] 1.6984ms 1.5301ms 653.5638 Ops/s 650.0693 Ops/s $\color{#35bf28}+0.54\%$
test_sac_speed[True-backward] 3.6071ms 3.4016ms 293.9758 Ops/s 291.6862 Ops/s $\color{#35bf28}+0.78\%$
test_sac_speed[reduce-overhead-None] 23.4509ms 12.9610ms 77.1547 Ops/s 77.3917 Ops/s $\color{#d91a1a}-0.31\%$
test_sac_speed[reduce-overhead-backward] 1.6357ms 1.5264ms 655.1407 Ops/s 653.0985 Ops/s $\color{#35bf28}+0.31\%$
test_redq_speed[False-None] 8.4442ms 7.6385ms 130.9161 Ops/s 129.6969 Ops/s $\color{#35bf28}+0.94\%$
test_redq_speed[False-backward] 12.9751ms 11.9491ms 83.6882 Ops/s 83.4397 Ops/s $\color{#35bf28}+0.30\%$
test_redq_speed[True-None] 2.1901ms 1.9931ms 501.7274 Ops/s 501.0488 Ops/s $\color{#35bf28}+0.14\%$
test_redq_speed[True-backward] 3.8354ms 3.6587ms 273.3196 Ops/s 272.3362 Ops/s $\color{#35bf28}+0.36\%$
test_redq_speed[reduce-overhead-None] 2.3054ms 2.0502ms 487.7470 Ops/s 498.1548 Ops/s $\color{#d91a1a}-2.09\%$
test_redq_speed[reduce-overhead-backward] 4.0437ms 3.9105ms 255.7245 Ops/s 260.6231 Ops/s $\color{#d91a1a}-1.88\%$
test_redq_deprec_speed[False-None] 9.7942ms 9.1854ms 108.8684 Ops/s 108.5790 Ops/s $\color{#35bf28}+0.27\%$
test_redq_deprec_speed[False-backward] 13.1615ms 12.5473ms 79.6981 Ops/s 80.2145 Ops/s $\color{#d91a1a}-0.64\%$
test_redq_deprec_speed[True-None] 2.7996ms 2.3881ms 418.7459 Ops/s 428.8492 Ops/s $\color{#d91a1a}-2.36\%$
test_redq_deprec_speed[True-backward] 4.4986ms 4.3273ms 231.0926 Ops/s 240.3285 Ops/s $\color{#d91a1a}-3.84\%$
test_redq_deprec_speed[reduce-overhead-None] 2.6268ms 2.3545ms 424.7207 Ops/s 428.5185 Ops/s $\color{#d91a1a}-0.89\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.2613ms 4.0497ms 246.9328 Ops/s 240.1764 Ops/s $\color{#35bf28}+2.81\%$
test_td3_speed[False-None] 8.2807ms 8.0544ms 124.1555 Ops/s 123.7659 Ops/s $\color{#35bf28}+0.31\%$
test_td3_speed[False-backward] 10.9651ms 10.4811ms 95.4101 Ops/s 47.3112 Ops/s $\textbf{\color{#35bf28}+101.67\%}$
test_td3_speed[True-None] 1.5966ms 1.5711ms 636.5084 Ops/s 642.0018 Ops/s $\color{#d91a1a}-0.86\%$
test_td3_speed[True-backward] 3.3271ms 3.1266ms 319.8338 Ops/s 304.5717 Ops/s $\textbf{\color{#35bf28}+5.01\%}$
test_td3_speed[reduce-overhead-None] 84.8360ms 26.4224ms 37.8467 Ops/s 36.4922 Ops/s $\color{#35bf28}+3.71\%$
test_td3_speed[reduce-overhead-backward] 1.4619ms 1.3253ms 754.5326 Ops/s 672.1616 Ops/s $\textbf{\color{#35bf28}+12.25\%}$
test_cql_speed[False-None] 17.5042ms 16.9635ms 58.9503 Ops/s 58.6201 Ops/s $\color{#35bf28}+0.56\%$
test_cql_speed[False-backward] 22.8027ms 22.2539ms 44.9360 Ops/s 44.2435 Ops/s $\color{#35bf28}+1.57\%$
test_cql_speed[True-None] 3.1484ms 2.9393ms 340.2206 Ops/s 339.4166 Ops/s $\color{#35bf28}+0.24\%$
test_cql_speed[True-backward] 5.5289ms 5.0991ms 196.1136 Ops/s 189.3919 Ops/s $\color{#35bf28}+3.55\%$
test_cql_speed[reduce-overhead-None] 21.9456ms 13.4104ms 74.5688 Ops/s 75.4717 Ops/s $\color{#d91a1a}-1.20\%$
test_cql_speed[reduce-overhead-backward] 1.6749ms 1.5311ms 653.1388 Ops/s 651.2911 Ops/s $\color{#35bf28}+0.28\%$
test_a2c_speed[False-None] 3.5156ms 3.2471ms 307.9711 Ops/s 308.9710 Ops/s $\color{#d91a1a}-0.32\%$
test_a2c_speed[False-backward] 6.4498ms 6.2213ms 160.7391 Ops/s 160.2120 Ops/s $\color{#35bf28}+0.33\%$
test_a2c_speed[True-None] 1.1677ms 1.0038ms 996.2633 Ops/s 988.8874 Ops/s $\color{#35bf28}+0.75\%$
test_a2c_speed[True-backward] 2.7980ms 2.5937ms 385.5435 Ops/s 360.4132 Ops/s $\textbf{\color{#35bf28}+6.97\%}$
test_a2c_speed[reduce-overhead-None] 21.9002ms 11.6763ms 85.6435 Ops/s 84.1153 Ops/s $\color{#35bf28}+1.82\%$
test_a2c_speed[reduce-overhead-backward] 1.0242ms 0.9871ms 1.0131 KOps/s 866.6152 Ops/s $\textbf{\color{#35bf28}+16.90\%}$
test_ppo_speed[False-None] 3.9630ms 3.7378ms 267.5336 Ops/s 269.6645 Ops/s $\color{#d91a1a}-0.79\%$
test_ppo_speed[False-backward] 7.5600ms 6.9798ms 143.2710 Ops/s 140.3490 Ops/s $\color{#35bf28}+2.08\%$
test_ppo_speed[True-None] 1.1470ms 0.9562ms 1.0458 KOps/s 1.0472 KOps/s $\color{#d91a1a}-0.13\%$
test_ppo_speed[True-backward] 2.6988ms 2.5437ms 393.1311 Ops/s 367.9339 Ops/s $\textbf{\color{#35bf28}+6.85\%}$
test_ppo_speed[reduce-overhead-None] 0.7295ms 0.5275ms 1.8958 KOps/s 1.8823 KOps/s $\color{#35bf28}+0.72\%$
test_ppo_speed[reduce-overhead-backward] 1.0276ms 0.9688ms 1.0322 KOps/s 990.7718 Ops/s $\color{#35bf28}+4.18\%$
test_reinforce_speed[False-None] 2.4545ms 2.2924ms 436.2322 Ops/s 434.9731 Ops/s $\color{#35bf28}+0.29\%$
test_reinforce_speed[False-backward] 3.7873ms 3.3246ms 300.7852 Ops/s 301.7383 Ops/s $\color{#d91a1a}-0.32\%$
test_reinforce_speed[True-None] 1.2135ms 0.8268ms 1.2095 KOps/s 1.1966 KOps/s $\color{#35bf28}+1.07\%$
test_reinforce_speed[True-backward] 2.8195ms 2.5421ms 393.3762 Ops/s 407.2830 Ops/s $\color{#d91a1a}-3.41\%$
test_reinforce_speed[reduce-overhead-None] 22.7948ms 11.9840ms 83.4448 Ops/s 86.4558 Ops/s $\color{#d91a1a}-3.48\%$
test_reinforce_speed[reduce-overhead-backward] 1.3335ms 1.1839ms 844.6665 Ops/s 933.4796 Ops/s $\textbf{\color{#d91a1a}-9.51\%}$
test_iql_speed[False-None] 10.0537ms 9.4305ms 106.0392 Ops/s 106.7107 Ops/s $\color{#d91a1a}-0.63\%$
test_iql_speed[False-backward] 14.2779ms 13.5347ms 73.8841 Ops/s 76.1168 Ops/s $\color{#d91a1a}-2.93\%$
test_iql_speed[True-None] 2.1584ms 1.7643ms 566.8026 Ops/s 575.8591 Ops/s $\color{#d91a1a}-1.57\%$
test_iql_speed[True-backward] 4.5083ms 4.2430ms 235.6816 Ops/s 227.0791 Ops/s $\color{#35bf28}+3.79\%$
test_iql_speed[reduce-overhead-None] 15.6377ms 8.9323ms 111.9530 Ops/s 113.1922 Ops/s $\color{#d91a1a}-1.09\%$
test_iql_speed[reduce-overhead-backward] 1.5722ms 1.4457ms 691.7143 Ops/s 623.2981 Ops/s $\textbf{\color{#35bf28}+10.98\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0038ms 6.5167ms 153.4530 Ops/s 150.2371 Ops/s $\color{#35bf28}+2.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6833ms 0.3669ms 2.7256 KOps/s 2.8940 KOps/s $\textbf{\color{#d91a1a}-5.82\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7088ms 0.3463ms 2.8878 KOps/s 3.0601 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6478ms 6.2805ms 159.2225 Ops/s 157.2043 Ops/s $\color{#35bf28}+1.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9309ms 0.3433ms 2.9126 KOps/s 3.4003 KOps/s $\textbf{\color{#d91a1a}-14.34\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5244ms 0.2936ms 3.4055 KOps/s 4.0187 KOps/s $\textbf{\color{#d91a1a}-15.26\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6566ms 1.4193ms 704.5843 Ops/s 778.6096 Ops/s $\textbf{\color{#d91a1a}-9.51\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6719ms 1.3842ms 722.4245 Ops/s 813.6792 Ops/s $\textbf{\color{#d91a1a}-11.22\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6765ms 6.4881ms 154.1281 Ops/s 154.4249 Ops/s $\color{#d91a1a}-0.19\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0893ms 0.5009ms 1.9963 KOps/s 2.3685 KOps/s $\textbf{\color{#d91a1a}-15.71\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7539ms 0.4881ms 2.0488 KOps/s 2.3162 KOps/s $\textbf{\color{#d91a1a}-11.55\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5483ms 6.3274ms 158.0437 Ops/s 158.0622 Ops/s $\color{#d91a1a}-0.01\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.6767ms 0.3438ms 2.9087 KOps/s 2.7727 KOps/s $\color{#35bf28}+4.90\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6197ms 0.2918ms 3.4274 KOps/s 2.7552 KOps/s $\textbf{\color{#35bf28}+24.40\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6217ms 6.2689ms 159.5183 Ops/s 158.7422 Ops/s $\color{#35bf28}+0.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6788ms 0.3405ms 2.9368 KOps/s 3.1328 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5494ms 0.2470ms 4.0488 KOps/s 3.3570 KOps/s $\textbf{\color{#35bf28}+20.61\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7566ms 6.4531ms 154.9640 Ops/s 154.0603 Ops/s $\color{#35bf28}+0.59\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0627ms 0.5033ms 1.9869 KOps/s 2.3357 KOps/s $\textbf{\color{#d91a1a}-14.93\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6940ms 0.4921ms 2.0322 KOps/s 2.1131 KOps/s $\color{#d91a1a}-3.83\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9606ms 5.3731ms 186.1114 Ops/s 185.0376 Ops/s $\color{#35bf28}+0.58\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.1285ms 2.0726ms 482.4854 Ops/s 426.6678 Ops/s $\textbf{\color{#35bf28}+13.08\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.5556ms 1.2243ms 816.7689 Ops/s 773.8812 Ops/s $\textbf{\color{#35bf28}+5.54\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.3945ms 5.4941ms 182.0127 Ops/s 187.6975 Ops/s $\color{#d91a1a}-3.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.7928ms 2.0516ms 487.4193 Ops/s 440.4571 Ops/s $\textbf{\color{#35bf28}+10.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.1566ms 1.2008ms 832.7648 Ops/s 982.8375 Ops/s $\textbf{\color{#d91a1a}-15.27\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5394s 16.3934ms 61.0000 Ops/s 31.7084 Ops/s $\textbf{\color{#35bf28}+92.38\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.7180ms 2.1525ms 464.5778 Ops/s 470.0380 Ops/s $\color{#d91a1a}-1.16\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.3019ms 1.2611ms 792.9821 Ops/s 713.3459 Ops/s $\textbf{\color{#35bf28}+11.16\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.9962ms 13.3187ms 75.0822 Ops/s 73.6592 Ops/s $\color{#35bf28}+1.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.2838ms 17.4246ms 57.3900 Ops/s 56.0465 Ops/s $\color{#35bf28}+2.40\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.1583ms 17.7227ms 56.4250 Ops/s 53.8068 Ops/s $\color{#35bf28}+4.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6269ms 17.5937ms 56.8386 Ops/s 55.3537 Ops/s $\color{#35bf28}+2.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.9108ms 17.4663ms 57.2530 Ops/s 54.6512 Ops/s $\color{#35bf28}+4.76\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.7089ms 18.8015ms 53.1874 Ops/s 51.2682 Ops/s $\color{#35bf28}+3.74\%$

@vmoens vmoens merged commit c628d6f into gh/vmoens/39/base Dec 14, 2024
27 of 30 checks passed
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 362b6e88bad4397f35036391729e58f4f7e4a25d
Pull Request resolved: #2556
@vmoens vmoens deleted the gh/vmoens/39/head branch December 14, 2024 00:18