[BugFix] Better check for TDModule #1248

Merged 1 commit on Mar 5, 2025

@vmoens (Contributor) commented Mar 5, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 5, 2025
ghstack-source-id: 4711140e4c0bc6f583c801cc1b8c74b862b69380
Pull Request resolved: #1248
@facebook-github-bot added the CLA Signed label Mar 5, 2025
@vmoens merged commit 304c947 into gh/vmoens/49/base Mar 5, 2025
32 of 34 checks passed
@vmoens deleted the gh/vmoens/49/head branch March 5, 2025 00:39

github-actions bot commented Mar 5, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 52.3290μs 20.3111μs 49.2342 KOps/s 48.8533 KOps/s $\color{#35bf28}+0.78\%$
test_plain_set_stack_nested 41.5680μs 20.4273μs 48.9541 KOps/s 48.8191 KOps/s $\color{#35bf28}+0.28\%$
test_plain_set_nested_inplace 54.0420μs 22.3077μs 44.8276 KOps/s 45.0328 KOps/s $\color{#d91a1a}-0.46\%$
test_plain_set_stack_nested_inplace 58.7900μs 22.3878μs 44.6672 KOps/s 44.5891 KOps/s $\color{#35bf28}+0.18\%$
test_items 27.5110μs 4.2053μs 237.7957 KOps/s 238.7460 KOps/s $\color{#d91a1a}-0.40\%$
test_items_nested 0.5815ms 0.4097ms 2.4407 KOps/s 2.4403 KOps/s $\color{#35bf28}+0.02\%$
test_items_nested_locked 0.7219ms 0.4110ms 2.4332 KOps/s 2.4261 KOps/s $\color{#35bf28}+0.29\%$
test_items_nested_leaf 0.1454ms 76.8689μs 13.0092 KOps/s 12.6366 KOps/s $\color{#35bf28}+2.95\%$
test_items_stack_nested 0.4943ms 0.4090ms 2.4448 KOps/s 2.4307 KOps/s $\color{#35bf28}+0.58\%$
test_items_stack_nested_leaf 0.1372ms 77.4353μs 12.9140 KOps/s 12.6519 KOps/s $\color{#35bf28}+2.07\%$
test_items_stack_nested_locked 0.4814ms 0.4089ms 2.4454 KOps/s 2.4285 KOps/s $\color{#35bf28}+0.70\%$
test_keys 40.9660μs 3.4496μs 289.8846 KOps/s 281.1051 KOps/s $\color{#35bf28}+3.12\%$
test_keys_nested 0.2263ms 0.1638ms 6.1033 KOps/s 6.0787 KOps/s $\color{#35bf28}+0.41\%$
test_keys_nested_locked 1.7648ms 0.1712ms 5.8408 KOps/s 5.8393 KOps/s $\color{#35bf28}+0.03\%$
test_keys_nested_leaf 0.2984ms 0.1428ms 7.0014 KOps/s 6.9367 KOps/s $\color{#35bf28}+0.93\%$
test_keys_stack_nested 0.2553ms 0.1640ms 6.0984 KOps/s 6.0468 KOps/s $\color{#35bf28}+0.85\%$
test_keys_stack_nested_leaf 0.2222ms 0.1435ms 6.9687 KOps/s 6.8369 KOps/s $\color{#35bf28}+1.93\%$
test_keys_stack_nested_locked 0.2567ms 0.1708ms 5.8562 KOps/s 5.8052 KOps/s $\color{#35bf28}+0.88\%$
test_values 5.4984μs 1.0422μs 959.5156 KOps/s 972.3919 KOps/s $\color{#d91a1a}-1.32\%$
test_values_nested 0.1257ms 62.5908μs 15.9768 KOps/s 16.1369 KOps/s $\color{#d91a1a}-0.99\%$
test_values_nested_locked 96.2110μs 62.3993μs 16.0258 KOps/s 16.1810 KOps/s $\color{#d91a1a}-0.96\%$
test_values_nested_leaf 0.1518ms 71.3401μs 14.0174 KOps/s 13.5407 KOps/s $\color{#35bf28}+3.52\%$
test_values_stack_nested 0.1089ms 61.6783μs 16.2132 KOps/s 15.9814 KOps/s $\color{#35bf28}+1.45\%$
test_values_stack_nested_leaf 0.1226ms 71.1760μs 14.0497 KOps/s 14.0328 KOps/s $\color{#35bf28}+0.12\%$
test_values_stack_nested_locked 0.2400ms 63.5881μs 15.7262 KOps/s 15.9740 KOps/s $\color{#d91a1a}-1.55\%$
test_membership 26.6200μs 0.8698μs 1.1496 MOps/s 1.1639 MOps/s $\color{#d91a1a}-1.22\%$
test_membership_nested 23.2130μs 2.9073μs 343.9562 KOps/s 346.5441 KOps/s $\color{#d91a1a}-0.75\%$
test_membership_nested_leaf 36.7690μs 2.9338μs 340.8519 KOps/s 343.6232 KOps/s $\color{#d91a1a}-0.81\%$
test_membership_stacked_nested 24.0060μs 2.8904μs 345.9676 KOps/s 341.5709 KOps/s $\color{#35bf28}+1.29\%$
test_membership_stacked_nested_leaf 16.9620μs 2.8974μs 345.1383 KOps/s 346.6222 KOps/s $\color{#d91a1a}-0.43\%$
test_membership_nested_last 25.0270μs 4.4182μs 226.3351 KOps/s 227.3534 KOps/s $\color{#d91a1a}-0.45\%$
test_membership_nested_leaf_last 22.3820μs 4.4447μs 224.9872 KOps/s 226.3181 KOps/s $\color{#d91a1a}-0.59\%$
test_membership_stacked_nested_last 30.5680μs 4.3740μs 228.6263 KOps/s 226.3402 KOps/s $\color{#35bf28}+1.01\%$
test_membership_stacked_nested_leaf_last 22.8230μs 4.3508μs 229.8451 KOps/s 226.1675 KOps/s $\color{#35bf28}+1.63\%$
test_nested_getleaf 40.4860μs 10.4861μs 95.3641 KOps/s 93.6499 KOps/s $\color{#35bf28}+1.83\%$
test_nested_get 34.5960μs 10.0095μs 99.9055 KOps/s 99.2220 KOps/s $\color{#35bf28}+0.69\%$
test_stacked_getleaf 28.3940μs 10.6652μs 93.7631 KOps/s 94.3540 KOps/s $\color{#d91a1a}-0.63\%$
test_stacked_get 35.5070μs 10.0456μs 99.5457 KOps/s 99.1040 KOps/s $\color{#35bf28}+0.45\%$
test_nested_getitemleaf 38.8530μs 11.3874μs 87.8167 KOps/s 88.5485 KOps/s $\color{#d91a1a}-0.83\%$
test_nested_getitem 30.6570μs 10.8606μs 92.0756 KOps/s 93.0829 KOps/s $\color{#d91a1a}-1.08\%$
test_stacked_getitemleaf 34.8660μs 11.3986μs 87.7300 KOps/s 89.5973 KOps/s $\color{#d91a1a}-2.08\%$
test_stacked_getitem 33.3330μs 10.7206μs 93.2780 KOps/s 93.0942 KOps/s $\color{#35bf28}+0.20\%$
test_lock_nested 0.6160ms 0.4158ms 2.4052 KOps/s 2.4410 KOps/s $\color{#d91a1a}-1.47\%$
test_lock_stack_nested 0.5492ms 0.4236ms 2.3608 KOps/s 2.2639 KOps/s $\color{#35bf28}+4.28\%$
test_unlock_nested 0.9532ms 0.3420ms 2.9237 KOps/s 2.9448 KOps/s $\color{#d91a1a}-0.72\%$
test_unlock_stack_nested 0.4782ms 0.3444ms 2.9038 KOps/s 2.7994 KOps/s $\color{#35bf28}+3.73\%$
test_flatten_speed 0.1972ms 0.1007ms 9.9272 KOps/s 9.7934 KOps/s $\color{#35bf28}+1.37\%$
test_unflatten_speed 0.7731ms 0.5269ms 1.8977 KOps/s 1.8997 KOps/s $\color{#d91a1a}-0.10\%$
test_common_ops 0.9291ms 0.7894ms 1.2667 KOps/s 1.2411 KOps/s $\color{#35bf28}+2.07\%$
test_creation 32.2410μs 2.5433μs 393.1922 KOps/s 397.8078 KOps/s $\color{#d91a1a}-1.16\%$
test_creation_empty 40.9370μs 11.3222μs 88.3223 KOps/s 87.7560 KOps/s $\color{#35bf28}+0.65\%$
test_creation_nested_1 0.1502ms 14.4026μs 69.4320 KOps/s 70.1836 KOps/s $\color{#d91a1a}-1.07\%$
test_creation_nested_2 46.8980μs 18.7086μs 53.4515 KOps/s 53.1064 KOps/s $\color{#35bf28}+0.65\%$
test_clone 52.7490μs 13.9154μs 71.8630 KOps/s 74.3327 KOps/s $\color{#d91a1a}-3.32\%$
test_getitem[int] 0.8527ms 12.8600μs 77.7605 KOps/s 75.2293 KOps/s $\color{#35bf28}+3.36\%$
test_getitem[slice_int] 0.1296ms 24.5209μs 40.7815 KOps/s 39.5017 KOps/s $\color{#35bf28}+3.24\%$
test_getitem[range] 0.1597ms 48.8126μs 20.4865 KOps/s 19.8825 KOps/s $\color{#35bf28}+3.04\%$
test_getitem[tuple] 0.1413ms 20.0762μs 49.8103 KOps/s 48.9942 KOps/s $\color{#35bf28}+1.67\%$
test_getitem[list] 0.1634ms 44.1293μs 22.6607 KOps/s 21.9217 KOps/s $\color{#35bf28}+3.37\%$
test_setitem_dim[int] 50.8360μs 24.9406μs 40.0953 KOps/s 39.6437 KOps/s $\color{#35bf28}+1.14\%$
test_setitem_dim[slice_int] 75.1710μs 51.0542μs 19.5870 KOps/s 19.0705 KOps/s $\color{#35bf28}+2.71\%$
test_setitem_dim[range] 0.1228ms 75.9069μs 13.1740 KOps/s 12.9809 KOps/s $\color{#35bf28}+1.49\%$
test_setitem_dim[tuple] 76.5330μs 40.3213μs 24.8008 KOps/s 24.4948 KOps/s $\color{#35bf28}+1.25\%$
test_setitem 78.5370μs 20.6993μs 48.3109 KOps/s 50.3734 KOps/s $\color{#d91a1a}-4.09\%$
test_set 70.8930μs 19.9893μs 50.0267 KOps/s 51.5494 KOps/s $\color{#d91a1a}-2.95\%$
test_set_shared 4.3900ms 0.1842ms 5.4285 KOps/s 5.3614 KOps/s $\color{#35bf28}+1.25\%$
test_update 0.1207ms 25.6543μs 38.9798 KOps/s 40.4934 KOps/s $\color{#d91a1a}-3.74\%$
test_update_nested 0.4130ms 40.3145μs 24.8050 KOps/s 25.0297 KOps/s $\color{#d91a1a}-0.90\%$
test_update__nested 0.1143ms 34.1678μs 29.2673 KOps/s 29.9516 KOps/s $\color{#d91a1a}-2.28\%$
test_set_nested 74.0400μs 22.2028μs 45.0394 KOps/s 46.7948 KOps/s $\color{#d91a1a}-3.75\%$
test_set_nested_new 75.4520μs 26.6751μs 37.4882 KOps/s 38.1038 KOps/s $\color{#d91a1a}-1.62\%$
test_select 0.1347ms 42.8962μs 23.3121 KOps/s 23.5897 KOps/s $\color{#d91a1a}-1.18\%$
test_select_nested 0.1241ms 63.9127μs 15.6463 KOps/s 15.5445 KOps/s $\color{#35bf28}+0.66\%$
test_exclude_nested 0.1517ms 83.1895μs 12.0208 KOps/s 12.1694 KOps/s $\color{#d91a1a}-1.22\%$
test_empty[True] 0.6107ms 0.4121ms 2.4268 KOps/s 2.4280 KOps/s $\color{#d91a1a}-0.05\%$
test_empty[False] 9.3525μs 1.4661μs 682.0940 KOps/s 710.3566 KOps/s $\color{#d91a1a}-3.98\%$
test_unbind_speed 0.3472ms 0.2711ms 3.6889 KOps/s 3.7123 KOps/s $\color{#d91a1a}-0.63\%$
test_unbind_speed_stack0 0.4587ms 0.2700ms 3.7038 KOps/s 3.7707 KOps/s $\color{#d91a1a}-1.77\%$
test_unbind_speed_stack1 0.1030s 0.7359ms 1.3589 KOps/s 1.2412 KOps/s $\textbf{\color{#35bf28}+9.48\%}$
test_split 0.1021s 1.7626ms 567.3568 Ops/s 624.2375 Ops/s $\textbf{\color{#d91a1a}-9.11\%}$
test_chunk 0.1084s 1.7796ms 561.9342 Ops/s 545.7290 Ops/s $\color{#35bf28}+2.97\%$
test_consolidate_njt[False-None] 9.1405ms 8.0254ms 124.6039 Ops/s 110.3505 Ops/s $\textbf{\color{#35bf28}+12.92\%}$
test_creation[device0] 0.2210ms 92.7132μs 10.7859 KOps/s 10.7493 KOps/s $\color{#35bf28}+0.34\%$
test_creation_from_tensor 4.2370ms 95.6709μs 10.4525 KOps/s 10.5760 KOps/s $\color{#d91a1a}-1.17\%$
test_add_one[memmap_tensor0] 71.6840μs 5.2358μs 190.9929 KOps/s 197.4902 KOps/s $\color{#d91a1a}-3.29\%$
test_contiguous[memmap_tensor0] 19.4760μs 0.5154μs 1.9402 MOps/s 1.9605 MOps/s $\color{#d91a1a}-1.04\%$
test_stack[memmap_tensor0] 20.6480μs 3.4711μs 288.0923 KOps/s 296.6551 KOps/s $\color{#d91a1a}-2.89\%$
test_memmaptd_index 1.3446ms 0.2259ms 4.4277 KOps/s 4.2980 KOps/s $\color{#35bf28}+3.02\%$
test_memmaptd_index_astensor 0.4740ms 0.3121ms 3.2037 KOps/s 3.1222 KOps/s $\color{#35bf28}+2.61\%$
test_memmaptd_index_op 1.1401ms 0.5742ms 1.7415 KOps/s 1.6675 KOps/s $\color{#35bf28}+4.44\%$
test_serialize_model 0.2169s 0.1313s 7.6167 Ops/s 8.7069 Ops/s $\textbf{\color{#d91a1a}-12.52\%}$
test_serialize_model_pickle 0.4408s 0.3995s 2.5031 Ops/s 2.4909 Ops/s $\color{#35bf28}+0.49\%$
test_serialize_weights 0.1195s 0.1132s 8.8333 Ops/s 8.7334 Ops/s $\color{#35bf28}+1.14\%$
test_serialize_weights_returnearly 0.2480s 0.1699s 5.8870 Ops/s 6.4686 Ops/s $\textbf{\color{#d91a1a}-8.99\%}$
test_serialize_weights_pickle 0.4793s 0.3867s 2.5859 Ops/s 2.5314 Ops/s $\color{#35bf28}+2.15\%$
test_serialize_weights_filesystem 0.2483s 0.1599s 6.2545 Ops/s 6.2426 Ops/s $\color{#35bf28}+0.19\%$
test_serialize_model_filesystem 0.1549s 0.1496s 6.6829 Ops/s 6.5685 Ops/s $\color{#35bf28}+1.74\%$
test_reshape_pytree 79.5000μs 25.7515μs 38.8327 KOps/s 37.8460 KOps/s $\color{#35bf28}+2.61\%$
test_reshape_td 68.4090μs 33.3171μs 30.0146 KOps/s 28.3298 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_view_pytree 55.0530μs 25.4174μs 39.3431 KOps/s 37.9140 KOps/s $\color{#35bf28}+3.77\%$
test_view_td 0.1199ms 39.3729μs 25.3982 KOps/s 23.2585 KOps/s $\textbf{\color{#35bf28}+9.20\%}$
test_unbind_pytree 65.7740μs 28.9824μs 34.5036 KOps/s 33.8165 KOps/s $\color{#35bf28}+2.03\%$
test_unbind_td 0.3459ms 40.2621μs 24.8373 KOps/s 24.8438 KOps/s $\color{#d91a1a}-0.03\%$
test_split_pytree 59.7120μs 28.3652μs 35.2545 KOps/s 34.0023 KOps/s $\color{#35bf28}+3.68\%$
test_split_td 0.2008ms 45.7883μs 21.8396 KOps/s 21.4267 KOps/s $\color{#35bf28}+1.93\%$
test_add_pytree 0.1274ms 34.9887μs 28.5806 KOps/s 27.3858 KOps/s $\color{#35bf28}+4.36\%$
test_add_td 0.1080ms 54.8541μs 18.2302 KOps/s 17.5835 KOps/s $\color{#35bf28}+3.68\%$
test_compile_add_one_nested[tensordict-compile] 0.1558ms 68.8770μs 14.5186 KOps/s 15.0346 KOps/s $\color{#d91a1a}-3.43\%$
test_compile_add_one_nested[tensordict-eager] 0.3262ms 0.1733ms 5.7715 KOps/s 5.7760 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_add_one_nested[pytree-compile] 0.1077ms 46.6753μs 21.4246 KOps/s 22.2858 KOps/s $\color{#d91a1a}-3.86\%$
test_compile_add_one_nested[pytree-eager] 0.2285ms 0.1192ms 8.3890 KOps/s 8.3465 KOps/s $\color{#35bf28}+0.51\%$
test_compile_copy_nested[tensordict-compile] 63.5800μs 28.2833μs 35.3566 KOps/s 35.9120 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_copy_nested[tensordict-eager] 0.1266ms 58.2295μs 17.1734 KOps/s 17.2652 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_copy_nested[pytree-compile] 0.1597ms 78.0164μs 12.8178 KOps/s 12.5295 KOps/s $\color{#35bf28}+2.30\%$
test_compile_copy_nested[pytree-eager] 0.1837ms 66.0144μs 15.1482 KOps/s 14.8407 KOps/s $\color{#35bf28}+2.07\%$
test_compile_add_one_flat[tensordict-compile] 0.2644ms 0.1084ms 9.2232 KOps/s 9.4660 KOps/s $\color{#d91a1a}-2.57\%$
test_compile_add_one_flat[tensordict-eager] 0.3540ms 0.2177ms 4.5933 KOps/s 4.6549 KOps/s $\color{#d91a1a}-1.32\%$
test_compile_add_one_flat[tensorclass-compile] 0.1009ms 48.1004μs 20.7899 KOps/s 21.5920 KOps/s $\color{#d91a1a}-3.71\%$
test_compile_add_one_flat[tensorclass-eager] 0.1948ms 67.2410μs 14.8719 KOps/s 14.9099 KOps/s $\color{#d91a1a}-0.25\%$
test_compile_add_one_flat[pytree-compile] 0.1845ms 0.1015ms 9.8517 KOps/s 9.7069 KOps/s $\color{#35bf28}+1.49\%$
test_compile_add_one_flat[pytree-eager] 0.3262ms 0.2040ms 4.9021 KOps/s 4.9217 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_add_self_flat[tensordict-eager] 0.5096ms 0.2354ms 4.2486 KOps/s 4.2823 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_add_self_flat[tensordict-compile] 0.1904ms 0.1079ms 9.2640 KOps/s 9.4378 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_add_self_flat[tensorclass-eager] 0.6815ms 65.0018μs 15.3842 KOps/s 15.9142 KOps/s $\color{#d91a1a}-3.33\%$
test_compile_add_self_flat[tensorclass-compile] 0.1115ms 49.5502μs 20.1816 KOps/s 20.6279 KOps/s $\color{#d91a1a}-2.16\%$
test_compile_add_self_flat[pytree-eager] 0.3563ms 0.1582ms 6.3214 KOps/s 6.3577 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_self_flat[pytree-compile] 0.2569ms 0.1025ms 9.7553 KOps/s 9.9359 KOps/s $\color{#d91a1a}-1.82\%$
test_compile_copy_flat[tensordict-compile] 55.7140μs 21.6685μs 46.1499 KOps/s 47.6143 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_copy_flat[tensordict-eager] 0.1348ms 66.9235μs 14.9424 KOps/s 15.0730 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_copy_flat[pytree-compile] 0.1674ms 80.9027μs 12.3605 KOps/s 12.3052 KOps/s $\color{#35bf28}+0.45\%$
test_compile_copy_flat[pytree-eager] 0.1297ms 67.5564μs 14.8024 KOps/s 14.6371 KOps/s $\color{#35bf28}+1.13\%$
test_compile_assign_and_add[tensordict-compile] 0.3195ms 0.2181ms 4.5849 KOps/s 4.6285 KOps/s $\color{#d91a1a}-0.94\%$
test_compile_assign_and_add[tensordict-eager] 1.7921ms 1.3739ms 727.8399 Ops/s 737.6546 Ops/s $\color{#d91a1a}-1.33\%$
test_compile_assign_and_add[pytree-compile] 0.4236ms 0.2137ms 4.6794 KOps/s 4.6864 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_assign_and_add[pytree-eager] 1.4612ms 0.8368ms 1.1950 KOps/s 1.2043 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_assign_and_add_stack[compile] 0.6143ms 0.4700ms 2.1275 KOps/s 2.2258 KOps/s $\color{#d91a1a}-4.41\%$
test_compile_assign_and_add_stack[eager] 4.0377ms 2.6225ms 381.3139 Ops/s 367.2013 Ops/s $\color{#35bf28}+3.84\%$
test_compile_indexing[tensor-tensordict-compile] 89.1270μs 39.5485μs 25.2854 KOps/s 26.1493 KOps/s $\color{#d91a1a}-3.30\%$
test_compile_indexing[tensor-tensordict-eager] 0.6156ms 32.4956μs 30.7734 KOps/s 30.9850 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[tensor-tensorclass-compile] 78.9490μs 30.9280μs 32.3332 KOps/s 33.3140 KOps/s $\color{#d91a1a}-2.94\%$
test_compile_indexing[tensor-tensorclass-eager] 74.2490μs 22.8054μs 43.8492 KOps/s 42.4077 KOps/s $\color{#35bf28}+3.40\%$
test_compile_indexing[tensor-pytree-compile] 76.2030μs 32.0305μs 31.2202 KOps/s 32.7956 KOps/s $\color{#d91a1a}-4.80\%$
test_compile_indexing[tensor-pytree-eager] 83.8000μs 22.2434μs 44.9571 KOps/s 42.1066 KOps/s $\textbf{\color{#35bf28}+6.77\%}$
test_compile_indexing[slice-tensordict-compile] 0.3768ms 54.6970μs 18.2825 KOps/s 19.3308 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_compile_indexing[slice-tensordict-eager] 0.3784ms 20.2070μs 49.4877 KOps/s 48.0563 KOps/s $\color{#35bf28}+2.98\%$
test_compile_indexing[slice-tensorclass-compile] 0.1040ms 46.3936μs 21.5547 KOps/s 22.1224 KOps/s $\color{#d91a1a}-2.57\%$
test_compile_indexing[slice-tensorclass-eager] 63.0090μs 18.2741μs 54.7223 KOps/s 53.0918 KOps/s $\color{#35bf28}+3.07\%$
test_compile_indexing[slice-pytree-compile] 0.1111ms 46.7204μs 21.4039 KOps/s 21.8382 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_indexing[slice-pytree-eager] 63.7500μs 18.2026μs 54.9373 KOps/s 53.1051 KOps/s $\color{#35bf28}+3.45\%$
test_compile_indexing[int-tensordict-compile] 0.1690ms 56.9489μs 17.5596 KOps/s 19.1505 KOps/s $\textbf{\color{#d91a1a}-8.31\%}$
test_compile_indexing[int-tensordict-eager] 1.0518ms 20.1064μs 49.7354 KOps/s 50.3732 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_indexing[int-tensorclass-compile] 0.1489ms 47.7848μs 20.9272 KOps/s 21.5614 KOps/s $\color{#d91a1a}-2.94\%$
test_compile_indexing[int-tensorclass-eager] 0.6183ms 18.3183μs 54.5903 KOps/s 53.4115 KOps/s $\color{#35bf28}+2.21\%$
test_compile_indexing[int-pytree-compile] 0.1112ms 46.8999μs 21.3220 KOps/s 21.7827 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_indexing[int-pytree-eager] 63.7700μs 18.4013μs 54.3439 KOps/s 53.5394 KOps/s $\color{#35bf28}+1.50\%$
test_mod_add[eager] 87.6640μs 35.0573μs 28.5247 KOps/s 27.9442 KOps/s $\color{#35bf28}+2.08\%$
test_mod_add[compile] 0.1319ms 66.4087μs 15.0583 KOps/s 15.2168 KOps/s $\color{#d91a1a}-1.04\%$
test_mod_add[compile-overhead] 0.1342ms 64.9696μs 15.3918 KOps/s 15.1240 KOps/s $\color{#35bf28}+1.77\%$
test_mod_wrap[eager] 0.3681ms 0.2274ms 4.3984 KOps/s 4.5333 KOps/s $\color{#d91a1a}-2.97\%$
test_mod_wrap[compile] 2.0651ms 0.2293ms 4.3607 KOps/s 4.4225 KOps/s $\color{#d91a1a}-1.40\%$
test_mod_wrap[compile-overhead] 0.3354ms 0.2224ms 4.4964 KOps/s 4.2919 KOps/s $\color{#35bf28}+4.77\%$
test_mod_wrap_and_backward[eager] 12.4193ms 11.1348ms 89.8089 Ops/s 90.9877 Ops/s $\color{#d91a1a}-1.30\%$
test_mod_wrap_and_backward[compile] 13.0898ms 10.9797ms 91.0769 Ops/s 89.2385 Ops/s $\color{#35bf28}+2.06\%$
test_mod_wrap_and_backward[compile-overhead] 12.4797ms 10.9011ms 91.7336 Ops/s 92.3891 Ops/s $\color{#d91a1a}-0.71\%$
test_seq_add[eager] 0.2510ms 0.1169ms 8.5528 KOps/s 8.5955 KOps/s $\color{#d91a1a}-0.50\%$
test_seq_add[compile] 0.1400ms 75.9588μs 13.1650 KOps/s 13.1764 KOps/s $\color{#d91a1a}-0.09\%$
test_seq_add[compile-overhead] 0.1772ms 74.7958μs 13.3697 KOps/s 13.3564 KOps/s $\color{#35bf28}+0.10\%$
test_seq_wrap[eager] 0.5419ms 0.4412ms 2.2664 KOps/s 2.2635 KOps/s $\color{#35bf28}+0.13\%$
test_seq_wrap[compile] 0.7977ms 0.2457ms 4.0707 KOps/s 4.0482 KOps/s $\color{#35bf28}+0.56\%$
test_seq_wrap[compile-overhead] 0.3681ms 0.2410ms 4.1489 KOps/s 4.1482 KOps/s $\color{#35bf28}+0.02\%$
test_func_call_runtime[False-eager] 0.9321ms 0.5428ms 1.8422 KOps/s 1.9049 KOps/s $\color{#d91a1a}-3.29\%$
test_func_call_runtime[False-compile] 1.0179ms 0.4483ms 2.2304 KOps/s 2.2596 KOps/s $\color{#d91a1a}-1.29\%$
test_func_call_runtime[False-compile-overhead] 0.9108ms 0.4467ms 2.2385 KOps/s 2.2547 KOps/s $\color{#d91a1a}-0.72\%$
test_func_call_runtime[True-eager] 1.1049ms 0.7684ms 1.3013 KOps/s 1.3421 KOps/s $\color{#d91a1a}-3.04\%$
test_func_call_runtime[True-compile] 0.6612ms 0.4688ms 2.1333 KOps/s 2.1379 KOps/s $\color{#d91a1a}-0.22\%$
test_func_call_runtime[True-compile-overhead] 0.6622ms 0.4682ms 2.1358 KOps/s 2.1400 KOps/s $\color{#d91a1a}-0.20\%$
test_func_call_cm_runtime[False-eager] 0.6545ms 0.5456ms 1.8327 KOps/s 1.9128 KOps/s $\color{#d91a1a}-4.19\%$
test_func_call_cm_runtime[False-compile] 0.7746ms 0.4488ms 2.2282 KOps/s 2.2431 KOps/s $\color{#d91a1a}-0.67\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7715ms 0.4479ms 2.2328 KOps/s 2.2593 KOps/s $\color{#d91a1a}-1.17\%$
test_func_call_cm_runtime[True-eager] 2.7454ms 0.9207ms 1.0861 KOps/s 1.1301 KOps/s $\color{#d91a1a}-3.89\%$
test_func_call_cm_runtime[True-compile] 0.9387ms 0.8026ms 1.2460 KOps/s 1.2591 KOps/s $\color{#d91a1a}-1.05\%$
test_func_call_cm_runtime[True-compile-overhead] 1.3289ms 0.8197ms 1.2200 KOps/s 1.2600 KOps/s $\color{#d91a1a}-3.18\%$
test_vmap_func_call_cm_runtime[eager] 2.6924ms 1.9431ms 514.6297 Ops/s 521.6021 Ops/s $\color{#d91a1a}-1.34\%$
test_vmap_func_call_cm_runtime[compile] 1.0685ms 0.5394ms 1.8538 KOps/s 1.8695 KOps/s $\color{#d91a1a}-0.84\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6984ms 0.5387ms 1.8562 KOps/s 1.8764 KOps/s $\color{#d91a1a}-1.08\%$
test_distributed 0.2941ms 0.1273ms 7.8547 KOps/s 7.9221 KOps/s $\color{#d91a1a}-0.85\%$
test_tdmodule 63.8400μs 26.2651μs 38.0733 KOps/s 36.7497 KOps/s $\color{#35bf28}+3.60\%$
test_tdmodule_dispatch 86.4320μs 48.7629μs 20.5074 KOps/s 20.3732 KOps/s $\color{#35bf28}+0.66\%$
test_tdseq 68.5180μs 28.0967μs 35.5913 KOps/s 34.7015 KOps/s $\color{#35bf28}+2.56\%$
test_tdseq_dispatch 86.0720μs 53.0820μs 18.8388 KOps/s 18.6419 KOps/s $\color{#35bf28}+1.06\%$
test_instantiation_functorch 1.7565ms 1.5518ms 644.4286 Ops/s 654.3876 Ops/s $\color{#d91a1a}-1.52\%$
test_exec_functorch 0.2569ms 0.1771ms 5.6463 KOps/s 5.6465 KOps/s $-0.00\%$
test_exec_functional_call 0.3329ms 0.1751ms 5.7106 KOps/s 5.8753 KOps/s $\color{#d91a1a}-2.80\%$
test_exec_td_decorator 0.4935ms 0.2366ms 4.2272 KOps/s 4.3207 KOps/s $\color{#d91a1a}-2.16\%$
test_vmap_mlp_speed_decorator[True-True] 0.9247ms 0.6712ms 1.4899 KOps/s 1.5294 KOps/s $\color{#d91a1a}-2.58\%$
test_vmap_mlp_speed_decorator[True-False] 0.9648ms 0.6848ms 1.4602 KOps/s 1.5169 KOps/s $\color{#d91a1a}-3.74\%$
test_vmap_mlp_speed_decorator[False-True] 0.8414ms 0.5458ms 1.8322 KOps/s 1.8757 KOps/s $\color{#d91a1a}-2.32\%$
test_vmap_mlp_speed_decorator[False-False] 0.8022ms 0.5440ms 1.8383 KOps/s 1.8776 KOps/s $\color{#d91a1a}-2.09\%$
test_to_module_speed[True] 1.8474ms 1.3536ms 738.7540 Ops/s 755.3086 Ops/s $\color{#d91a1a}-2.19\%$
test_to_module_speed[False] 1.7910ms 1.3057ms 765.8803 Ops/s 769.0847 Ops/s $\color{#d91a1a}-0.42\%$
test_tc_init 85.3400μs 46.0708μs 21.7057 KOps/s 22.1106 KOps/s $\color{#d91a1a}-1.83\%$
test_tc_init_nested 0.1455ms 90.2342μs 11.0823 KOps/s 10.8933 KOps/s $\color{#35bf28}+1.73\%$
test_tc_first_layer_tensor 16.2500μs 1.5459μs 646.8518 KOps/s 654.8060 KOps/s $\color{#d91a1a}-1.21\%$
test_tc_first_layer_nontensor 21.8910μs 4.6621μs 214.4938 KOps/s 205.9891 KOps/s $\color{#35bf28}+4.13\%$
test_tc_second_layer_tensor 41.8480μs 2.8705μs 348.3661 KOps/s 351.5389 KOps/s $\color{#d91a1a}-0.90\%$
test_tc_second_layer_nontensor 44.3030μs 5.9937μs 166.8412 KOps/s 159.6052 KOps/s $\color{#35bf28}+4.53\%$
test_unbind 0.2376s 13.4895ms 74.1316 Ops/s 81.3182 Ops/s $\textbf{\color{#d91a1a}-8.84\%}$
test_full_like 10.2405ms 7.5458ms 132.5246 Ops/s 129.6398 Ops/s $\color{#35bf28}+2.23\%$
test_zeros_like 5.4139ms 4.4040ms 227.0678 Ops/s 217.8449 Ops/s $\color{#35bf28}+4.23\%$
test_ones_like 3.8437ms 3.1109ms 321.4537 Ops/s 194.8564 Ops/s $\textbf{\color{#35bf28}+64.97\%}$
test_clone 6.0568ms 4.8785ms 204.9809 Ops/s 197.6984 Ops/s $\color{#35bf28}+3.68\%$
test_squeeze 60.7840μs 12.4568μs 80.2773 KOps/s 79.1883 KOps/s $\color{#35bf28}+1.38\%$
test_unsqueeze 0.3025ms 95.6735μs 10.4522 KOps/s 10.6116 KOps/s $\color{#d91a1a}-1.50\%$
test_split 0.3120ms 0.1955ms 5.1154 KOps/s 5.1710 KOps/s $\color{#d91a1a}-1.08\%$
test_permute 0.3327ms 0.1989ms 5.0268 KOps/s 5.0110 KOps/s $\color{#35bf28}+0.32\%$
test_stack 29.1951ms 24.3494ms 41.0688 Ops/s 40.0692 Ops/s $\color{#35bf28}+2.49\%$
test_cat 26.9454ms 24.1835ms 41.3505 Ops/s 40.5290 Ops/s $\color{#35bf28}+2.03\%$
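
For readers checking the table by hand: the Change column appears to be the relative difference between the Ops column and Ops on Repo HEAD. The sketch below recomputes it for three rows copied from the CPU table. The function names are hypothetical, and the 5% cutoff for counting a row as improved or worsened is an assumption inferred from which rows the bot prints in bold, not something the report states.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD.

    Both arguments must be in the same unit (e.g. both KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not taken from the bot."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (49.2342, 48.8533),  # KOps/s
    "test_serialize_model": (7.6167, 8.7069),     # Ops/s
    "test_ones_like": (321.4537, 194.8564),       # Ops/s
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed cutoff, the six bold positive rows and six bold negative rows in the CPU table line up with the reported "Improved: 6" and "Worsened: 6".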

github-actions bot commented Mar 5, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}27$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.7010μs 11.6116μs 86.1211 KOps/s 76.6752 KOps/s $\textbf{\color{#35bf28}+12.32\%}$
test_plain_set_stack_nested 60.6420μs 11.6621μs 85.7480 KOps/s 76.4804 KOps/s $\textbf{\color{#35bf28}+12.12\%}$
test_plain_set_nested_inplace 38.7110μs 12.7374μs 78.5087 KOps/s 71.2184 KOps/s $\textbf{\color{#35bf28}+10.24\%}$
test_plain_set_stack_nested_inplace 36.8100μs 12.6286μs 79.1854 KOps/s 71.5294 KOps/s $\textbf{\color{#35bf28}+10.70\%}$
test_items 22.3200μs 2.8653μs 348.9998 KOps/s 346.3076 KOps/s $\color{#35bf28}+0.78\%$
test_items_nested 0.4300ms 0.3764ms 2.6565 KOps/s 2.7225 KOps/s $\color{#d91a1a}-2.43\%$
test_items_nested_locked 0.4329ms 0.3764ms 2.6565 KOps/s 2.6975 KOps/s $\color{#d91a1a}-1.52\%$
test_items_nested_leaf 0.1029ms 60.2129μs 16.6077 KOps/s 16.4152 KOps/s $\color{#35bf28}+1.17\%$
test_items_stack_nested 0.4286ms 0.3721ms 2.6873 KOps/s 2.7324 KOps/s $\color{#d91a1a}-1.65\%$
test_items_stack_nested_leaf 94.0120μs 61.0668μs 16.3755 KOps/s 16.5214 KOps/s $\color{#d91a1a}-0.88\%$
test_items_stack_nested_locked 0.4440ms 0.3762ms 2.6583 KOps/s 2.7311 KOps/s $\color{#d91a1a}-2.67\%$
test_keys 25.7900μs 3.4219μs 292.2323 KOps/s 289.5877 KOps/s $\color{#35bf28}+0.91\%$
test_keys_nested 0.1176ms 89.7166μs 11.1462 KOps/s 11.1345 KOps/s $\color{#35bf28}+0.11\%$
test_keys_nested_locked 0.7810ms 95.2798μs 10.4954 KOps/s 10.4793 KOps/s $\color{#35bf28}+0.15\%$
test_keys_nested_leaf 0.1087ms 80.5427μs 12.4158 KOps/s 12.4536 KOps/s $\color{#d91a1a}-0.30\%$
test_keys_stack_nested 0.1122ms 89.6281μs 11.1572 KOps/s 11.2202 KOps/s $\color{#d91a1a}-0.56\%$
test_keys_stack_nested_leaf 0.1120ms 80.5940μs 12.4079 KOps/s 12.5477 KOps/s $\color{#d91a1a}-1.11\%$
test_keys_stack_nested_locked 0.1329ms 94.8870μs 10.5389 KOps/s 10.4335 KOps/s $\color{#35bf28}+1.01\%$
test_values 4.6502μs 0.8512μs 1.1748 MOps/s 1.1716 MOps/s $\color{#35bf28}+0.28\%$
test_values_nested 62.0110μs 38.0065μs 26.3113 KOps/s 26.6004 KOps/s $\color{#d91a1a}-1.09\%$
test_values_nested_locked 63.1610μs 39.6263μs 25.2357 KOps/s 25.3146 KOps/s $\color{#d91a1a}-0.31\%$
test_values_nested_leaf 67.7910μs 43.5217μs 22.9770 KOps/s 23.3639 KOps/s $\color{#d91a1a}-1.66\%$
test_values_stack_nested 0.1067ms 38.0776μs 26.2622 KOps/s 26.3205 KOps/s $\color{#d91a1a}-0.22\%$
test_values_stack_nested_leaf 77.2820μs 43.8688μs 22.7952 KOps/s 23.0705 KOps/s $\color{#d91a1a}-1.19\%$
test_values_stack_nested_locked 70.2020μs 39.9714μs 25.0179 KOps/s 25.3137 KOps/s $\color{#d91a1a}-1.17\%$
test_membership 1.6100μs 0.5010μs 1.9958 MOps/s 1.9934 MOps/s $\color{#35bf28}+0.12\%$
test_membership_nested 18.0305μs 2.0081μs 497.9815 KOps/s 477.5331 KOps/s $\color{#35bf28}+4.28\%$
test_membership_nested_leaf 15.5505μs 2.0169μs 495.8054 KOps/s 493.1493 KOps/s $\color{#35bf28}+0.54\%$
test_membership_stacked_nested 35.4600μs 2.1011μs 475.9402 KOps/s 464.0376 KOps/s $\color{#35bf28}+2.57\%$
test_membership_stacked_nested_leaf 28.9510μs 2.0840μs 479.8375 KOps/s 465.1463 KOps/s $\color{#35bf28}+3.16\%$
test_membership_nested_last 23.9710μs 3.1171μs 320.8080 KOps/s 315.4772 KOps/s $\color{#35bf28}+1.69\%$
test_membership_nested_leaf_last 36.5810μs 3.1290μs 319.5919 KOps/s 319.6677 KOps/s $\color{#d91a1a}-0.02\%$
test_membership_stacked_nested_last 25.7310μs 3.1087μs 321.6799 KOps/s 325.7841 KOps/s $\color{#d91a1a}-1.26\%$
test_membership_stacked_nested_leaf_last 29.1400μs 3.1004μs 322.5357 KOps/s 320.7879 KOps/s $\color{#35bf28}+0.54\%$
test_nested_getleaf 28.4710μs 6.2129μs 160.9549 KOps/s 160.6112 KOps/s $\color{#35bf28}+0.21\%$
test_nested_get 29.3300μs 5.9319μs 168.5801 KOps/s 165.0711 KOps/s $\color{#35bf28}+2.13\%$
test_stacked_getleaf 27.6200μs 6.1082μs 163.7152 KOps/s 161.2054 KOps/s $\color{#35bf28}+1.56\%$
test_stacked_get 30.7410μs 5.7508μs 173.8899 KOps/s 171.2580 KOps/s $\color{#35bf28}+1.54\%$
test_nested_getitemleaf 35.1810μs 6.3784μs 156.7799 KOps/s 152.3824 KOps/s $\color{#35bf28}+2.89\%$
test_nested_getitem 30.9200μs 6.0056μs 166.5117 KOps/s 161.8846 KOps/s $\color{#35bf28}+2.86\%$
test_stacked_getitemleaf 30.7210μs 6.3793μs 156.7581 KOps/s 155.5627 KOps/s $\color{#35bf28}+0.77\%$
test_stacked_getitem 40.8800μs 5.9341μs 168.5182 KOps/s 164.5026 KOps/s $\color{#35bf28}+2.44\%$
test_lock_nested 9.1417ms 0.3564ms 2.8061 KOps/s 2.8922 KOps/s $\color{#d91a1a}-2.98\%$
test_lock_stack_nested 0.4054ms 0.3535ms 2.8289 KOps/s 2.8483 KOps/s $\color{#d91a1a}-0.68\%$
test_unlock_nested 0.4210ms 0.2946ms 3.3943 KOps/s 3.4574 KOps/s $\color{#d91a1a}-1.82\%$
test_unlock_stack_nested 0.3355ms 0.2905ms 3.4419 KOps/s 3.4392 KOps/s $\color{#35bf28}+0.08\%$
test_flatten_speed 0.1103ms 77.7566μs 12.8606 KOps/s 12.6976 KOps/s $\color{#35bf28}+1.28\%$
test_unflatten_speed 0.3892ms 0.3275ms 3.0538 KOps/s 3.0954 KOps/s $\color{#d91a1a}-1.34\%$
test_common_ops 0.7307ms 0.5923ms 1.6884 KOps/s 1.5603 KOps/s $\textbf{\color{#35bf28}+8.21\%}$
test_creation 0.1215ms 1.7349μs 576.3896 KOps/s 577.5001 KOps/s $\color{#d91a1a}-0.19\%$
test_creation_empty 38.3110μs 6.4441μs 155.1803 KOps/s 111.5667 KOps/s $\textbf{\color{#35bf28}+39.09\%}$
test_creation_nested_1 34.3210μs 8.0500μs 124.2237 KOps/s 94.7314 KOps/s $\textbf{\color{#35bf28}+31.13\%}$
test_creation_nested_2 33.7910μs 10.7026μs 93.4352 KOps/s 74.3410 KOps/s $\textbf{\color{#35bf28}+25.68\%}$
test_clone 44.3210μs 11.0953μs 90.1281 KOps/s 89.6506 KOps/s $\color{#35bf28}+0.53\%$
test_getitem[int] 1.2519ms 10.7434μs 93.0807 KOps/s 90.7623 KOps/s $\color{#35bf28}+2.55\%$
test_getitem[slice_int] 0.1111ms 21.4925μs 46.5278 KOps/s 46.0558 KOps/s $\color{#35bf28}+1.02\%$
test_getitem[range] 0.1361ms 38.5444μs 25.9441 KOps/s 26.5714 KOps/s $\color{#d91a1a}-2.36\%$
test_getitem[tuple] 0.1169ms 18.4196μs 54.2899 KOps/s 53.6793 KOps/s $\color{#35bf28}+1.14\%$
test_getitem[list] 0.1273ms 33.0418μs 30.2647 KOps/s 30.1278 KOps/s $\color{#35bf28}+0.45\%$
test_setitem_dim[int] 43.3510μs 19.3978μs 51.5522 KOps/s 49.9502 KOps/s $\color{#35bf28}+3.21\%$
test_setitem_dim[slice_int] 62.7410μs 38.9340μs 25.6845 KOps/s 25.5408 KOps/s $\color{#35bf28}+0.56\%$
test_setitem_dim[range] 79.2810μs 52.2995μs 19.1206 KOps/s 18.8262 KOps/s $\color{#35bf28}+1.56\%$
test_setitem_dim[tuple] 52.8210μs 31.9115μs 31.3366 KOps/s 31.0041 KOps/s $\color{#35bf28}+1.07\%$
test_setitem 50.9310μs 14.5632μs 68.6662 KOps/s 60.6221 KOps/s $\textbf{\color{#35bf28}+13.27\%}$
test_set 56.0810μs 13.8587μs 72.1567 KOps/s 62.3675 KOps/s $\textbf{\color{#35bf28}+15.70\%}$
test_set_shared 0.5138ms 0.1593ms 6.2756 KOps/s 6.2590 KOps/s $\color{#35bf28}+0.27\%$
test_update 0.4135ms 17.3837μs 57.5250 KOps/s 47.4522 KOps/s $\textbf{\color{#35bf28}+21.23\%}$
test_update_nested 76.7710μs 26.5433μs 37.6744 KOps/s 33.2422 KOps/s $\textbf{\color{#35bf28}+13.33\%}$
test_update__nested 0.4823ms 25.8514μs 38.6826 KOps/s 38.3309 KOps/s $\color{#35bf28}+0.92\%$
test_set_nested 64.2810μs 15.4240μs 64.8340 KOps/s 57.3906 KOps/s $\textbf{\color{#35bf28}+12.97\%}$
test_set_nested_new 52.5910μs 17.6263μs 56.7333 KOps/s 50.5187 KOps/s $\textbf{\color{#35bf28}+12.30\%}$
test_select 59.7910μs 29.1847μs 34.2646 KOps/s 31.3049 KOps/s $\textbf{\color{#35bf28}+9.45\%}$
test_select_nested 72.4410μs 44.1547μs 22.6477 KOps/s 22.9345 KOps/s $\color{#d91a1a}-1.25\%$
test_exclude_nested 89.8020μs 63.7760μs 15.6799 KOps/s 15.6641 KOps/s $\color{#35bf28}+0.10\%$
test_empty[True] 0.3981ms 0.2954ms 3.3847 KOps/s 3.3453 KOps/s $\color{#35bf28}+1.18\%$
test_empty[False] 3.6471μs 0.8203μs 1.2190 MOps/s 1.2116 MOps/s $\color{#35bf28}+0.61\%$
test_to 91.4620μs 56.9141μs 17.5703 KOps/s 18.0445 KOps/s $\color{#d91a1a}-2.63\%$
test_to_nonblocking 79.9120μs 47.5494μs 21.0308 KOps/s 21.2231 KOps/s $\color{#d91a1a}-0.91\%$
test_unbind_speed 0.2867ms 0.2504ms 3.9938 KOps/s 4.0393 KOps/s $\color{#d91a1a}-1.13\%$
test_unbind_speed_stack0 0.3192ms 0.2459ms 4.0659 KOps/s 4.0337 KOps/s $\color{#35bf28}+0.80\%$
test_unbind_speed_stack1 93.3927ms 0.7566ms 1.3217 KOps/s 1.3329 KOps/s $\color{#d91a1a}-0.84\%$
test_split 96.0199ms 1.6145ms 619.3939 Ops/s 625.2970 Ops/s $\color{#d91a1a}-0.94\%$
test_chunk 95.4910ms 1.6085ms 621.6920 Ops/s 620.2804 Ops/s $\color{#35bf28}+0.23\%$
test_consolidate[False-None] 3.3884ms 2.7195ms 367.7152 Ops/s 365.6155 Ops/s $\color{#35bf28}+0.57\%$
test_consolidate[default-None] 1.7867ms 1.6872ms 592.6961 Ops/s 588.3386 Ops/s $\color{#35bf28}+0.74\%$
test_consolidate[reduce-overhead-None] 1.8074ms 1.7264ms 579.2349 Ops/s 572.9935 Ops/s $\color{#35bf28}+1.09\%$
test_consolidate_njt[False-None] 6.8981ms 6.5231ms 153.3012 Ops/s 154.3583 Ops/s $\color{#d91a1a}-0.68\%$
test_to[False-False-None] 1.8352ms 1.7370ms 575.7026 Ops/s 582.3525 Ops/s $\color{#d91a1a}-1.14\%$
test_to[True-False-None] 1.6006ms 1.3857ms 721.6372 Ops/s 745.9979 Ops/s $\color{#d91a1a}-3.27\%$
test_to[within-False-None] 4.4632ms 4.2200ms 236.9649 Ops/s 238.1173 Ops/s $\color{#d91a1a}-0.48\%$
test_to[True-default-None] 5.5349ms 5.1750ms 193.2370 Ops/s 189.1101 Ops/s $\color{#35bf28}+2.18\%$
test_to_njt[False-False-None] 7.1052ms 6.9211ms 144.4852 Ops/s 145.6563 Ops/s $\color{#d91a1a}-0.80\%$
test_to_njt[True-False-None] 6.1730ms 5.5530ms 180.0837 Ops/s 184.0407 Ops/s $\color{#d91a1a}-2.15\%$
test_to_njt[within-False-None] 12.4064ms 12.0953ms 82.6767 Ops/s 82.7111 Ops/s $\color{#d91a1a}-0.04\%$
test_creation[device0] 0.4607ms 79.4542μs 12.5859 KOps/s 12.5805 KOps/s $\color{#35bf28}+0.04\%$
test_creation_from_tensor 0.5387ms 83.1333μs 12.0289 KOps/s 11.8738 KOps/s $\color{#35bf28}+1.31\%$
test_add_one[memmap_tensor0] 0.3374ms 7.1251μs 140.3480 KOps/s 142.0584 KOps/s $\color{#d91a1a}-1.20\%$
test_contiguous[memmap_tensor0] 1.7526μs 0.4043μs 2.4733 MOps/s 2.4071 MOps/s $\color{#35bf28}+2.75\%$
test_stack[memmap_tensor0] 41.2610μs 4.6812μs 213.6210 KOps/s 211.1546 KOps/s $\color{#35bf28}+1.17\%$
test_memmaptd_index 1.4697ms 0.2448ms 4.0845 KOps/s 4.0891 KOps/s $\color{#d91a1a}-0.11\%$
test_memmaptd_index_astensor 0.4573ms 0.3084ms 3.2422 KOps/s 3.2489 KOps/s $\color{#d91a1a}-0.21\%$
test_memmaptd_index_op 0.8230ms 0.5597ms 1.7868 KOps/s 1.6702 KOps/s $\textbf{\color{#35bf28}+6.98\%}$
test_serialize_model 0.1321s 0.1309s 7.6377 Ops/s 7.6407 Ops/s $\color{#d91a1a}-0.04\%$
test_serialize_model_pickle 1.3505s 1.2145s 0.8234 Ops/s 0.8434 Ops/s $\color{#d91a1a}-2.37\%$
test_serialize_weights 0.1319s 0.1310s 7.6318 Ops/s 7.6691 Ops/s $\color{#d91a1a}-0.49\%$
test_serialize_weights_returnearly 0.3358s 53.8080ms 18.5846 Ops/s 23.2255 Ops/s $\textbf{\color{#d91a1a}-19.98\%}$
test_serialize_weights_pickle 1.3881s 1.2211s 0.8189 Ops/s 0.8188 Ops/s $\color{#35bf28}+0.01\%$
test_reshape_pytree 66.2710μs 21.9634μs 45.5304 KOps/s 44.9075 KOps/s $\color{#35bf28}+1.39\%$
test_reshape_td 65.3920μs 26.6172μs 37.5697 KOps/s 36.9911 KOps/s $\color{#35bf28}+1.56\%$
test_view_pytree 54.0510μs 21.8391μs 45.7894 KOps/s 45.2590 KOps/s $\color{#35bf28}+1.17\%$
test_view_td 70.1210μs 32.0206μs 31.2299 KOps/s 31.7138 KOps/s $\color{#d91a1a}-1.53\%$
test_unbind_pytree 75.1620μs 28.8606μs 34.6493 KOps/s 34.4112 KOps/s $\color{#35bf28}+0.69\%$
test_unbind_td 0.7805ms 37.8676μs 26.4078 KOps/s 26.3323 KOps/s $\color{#35bf28}+0.29\%$
test_split_pytree 63.5810μs 29.4403μs 33.9670 KOps/s 32.7729 KOps/s $\color{#35bf28}+3.64\%$
test_split_td 0.9251ms 40.1517μs 24.9056 KOps/s 24.9784 KOps/s $\color{#d91a1a}-0.29\%$
test_add_pytree 0.1227ms 36.0347μs 27.7510 KOps/s 27.6656 KOps/s $\color{#35bf28}+0.31\%$
test_add_td 90.9820μs 47.7264μs 20.9528 KOps/s 19.6334 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_compile_add_one_nested[tensordict-compile] 0.1763ms 0.1238ms 8.0792 KOps/s 7.6959 KOps/s $\color{#35bf28}+4.98\%$
test_compile_add_one_nested[tensordict-eager] 0.2268ms 0.1338ms 7.4758 KOps/s 7.4623 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_one_nested[pytree-compile] 0.2014ms 97.1515μs 10.2932 KOps/s 10.2413 KOps/s $\color{#35bf28}+0.51\%$
test_compile_add_one_nested[pytree-eager] 1.3573ms 0.1523ms 6.5640 KOps/s 6.6637 KOps/s $\color{#d91a1a}-1.50\%$
test_compile_copy_nested[tensordict-compile] 64.2610μs 25.7476μs 38.8386 KOps/s 36.1531 KOps/s $\textbf{\color{#35bf28}+7.43\%}$
test_compile_copy_nested[tensordict-eager] 62.8220μs 29.1327μs 34.3256 KOps/s 34.3197 KOps/s $\color{#35bf28}+0.02\%$
test_compile_copy_nested[pytree-compile] 0.4254ms 64.6143μs 15.4765 KOps/s 15.4526 KOps/s $\color{#35bf28}+0.15\%$
test_compile_copy_nested[pytree-eager] 0.1203ms 48.9423μs 20.4322 KOps/s 20.1198 KOps/s $\color{#35bf28}+1.55\%$
test_compile_add_one_flat[tensordict-compile] 0.1909ms 0.1490ms 6.7110 KOps/s 6.9606 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_add_one_flat[tensordict-eager] 0.3345ms 0.2200ms 4.5462 KOps/s 4.5734 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_add_one_flat[tensorclass-compile] 0.1460ms 98.8178μs 10.1196 KOps/s 9.9978 KOps/s $\color{#35bf28}+1.22\%$
test_compile_add_one_flat[tensorclass-eager] 0.1121ms 55.0241μs 18.1739 KOps/s 17.6660 KOps/s $\color{#35bf28}+2.87\%$
test_compile_add_one_flat[pytree-compile] 0.2498ms 0.1377ms 7.2614 KOps/s 7.1908 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_flat[pytree-eager] 0.5507ms 0.4932ms 2.0277 KOps/s 2.0858 KOps/s $\color{#d91a1a}-2.78\%$
test_compile_add_self_flat[tensordict-eager] 0.4105ms 0.2696ms 3.7086 KOps/s 3.7893 KOps/s $\color{#d91a1a}-2.13\%$
test_compile_add_self_flat[tensordict-compile] 0.2067ms 0.1523ms 6.5649 KOps/s 6.9695 KOps/s $\textbf{\color{#d91a1a}-5.81\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1706ms 69.9526μs 14.2954 KOps/s 13.9427 KOps/s $\color{#35bf28}+2.53\%$
test_compile_add_self_flat[tensorclass-compile] 0.2329ms 0.1029ms 9.7205 KOps/s 10.0610 KOps/s $\color{#d91a1a}-3.38\%$
test_compile_add_self_flat[pytree-eager] 0.5525ms 0.4224ms 2.3673 KOps/s 2.4678 KOps/s $\color{#d91a1a}-4.07\%$
test_compile_add_self_flat[pytree-compile] 0.2191ms 0.1426ms 7.0123 KOps/s 7.2865 KOps/s $\color{#d91a1a}-3.76\%$
test_compile_copy_flat[tensordict-compile] 0.1184ms 19.5879μs 51.0520 KOps/s 52.4616 KOps/s $\color{#d91a1a}-2.69\%$
test_compile_copy_flat[tensordict-eager] 0.1224ms 31.8128μs 31.4338 KOps/s 31.5588 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_copy_flat[pytree-compile] 0.2057ms 70.0114μs 14.2834 KOps/s 14.2812 KOps/s $\color{#35bf28}+0.02\%$
test_compile_copy_flat[pytree-eager] 0.1594ms 51.9882μs 19.2351 KOps/s 19.0330 KOps/s $\color{#35bf28}+1.06\%$
test_compile_assign_and_add[tensordict-compile] 1.6148ms 0.3897ms 2.5662 KOps/s 2.2266 KOps/s $\textbf{\color{#35bf28}+15.25\%}$
test_compile_assign_and_add[tensordict-eager] 3.1067ms 2.7187ms 367.8164 Ops/s 373.5853 Ops/s $\color{#d91a1a}-1.54\%$
test_compile_assign_and_add[pytree-compile] 1.5885ms 0.4366ms 2.2902 KOps/s 2.2854 KOps/s $\color{#35bf28}+0.21\%$
test_compile_assign_and_add[pytree-eager] 2.9587ms 2.8179ms 354.8720 Ops/s 379.5435 Ops/s $\textbf{\color{#d91a1a}-6.50\%}$
test_compile_indexing[tensor-tensordict-compile] 0.5172ms 0.1183ms 8.4538 KOps/s 8.4052 KOps/s $\color{#35bf28}+0.58\%$
test_compile_indexing[tensor-tensordict-eager] 0.5817ms 80.1653μs 12.4742 KOps/s 11.8649 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.6739ms 0.1109ms 9.0170 KOps/s 8.9853 KOps/s $\color{#35bf28}+0.35\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1235ms 74.1089μs 13.4937 KOps/s 13.7949 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_indexing[tensor-pytree-compile] 0.2334ms 0.1126ms 8.8788 KOps/s 9.3671 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_compile_indexing[tensor-pytree-eager] 0.1304ms 72.6901μs 13.7570 KOps/s 14.3702 KOps/s $\color{#d91a1a}-4.27\%$
test_compile_indexing[slice-tensordict-compile] 0.1641ms 0.1025ms 9.7574 KOps/s 9.8996 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_indexing[slice-tensordict-eager] 0.1421ms 18.3233μs 54.5753 KOps/s 57.3264 KOps/s $\color{#d91a1a}-4.80\%$
test_compile_indexing[slice-tensorclass-compile] 0.1473ms 97.7321μs 10.2320 KOps/s 10.2676 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[slice-tensorclass-eager] 48.7310μs 17.1538μs 58.2960 KOps/s 58.8353 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_indexing[slice-pytree-compile] 0.1370ms 96.5605μs 10.3562 KOps/s 10.2001 KOps/s $\color{#35bf28}+1.53\%$
test_compile_indexing[slice-pytree-eager] 52.4410μs 16.0795μs 62.1912 KOps/s 62.3874 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_indexing[int-tensordict-compile] 0.1557ms 0.1062ms 9.4137 KOps/s 9.2498 KOps/s $\color{#35bf28}+1.77\%$
test_compile_indexing[int-tensordict-eager] 0.5544ms 17.3702μs 57.5699 KOps/s 56.7432 KOps/s $\color{#35bf28}+1.46\%$
test_compile_indexing[int-tensorclass-compile] 0.1719ms 99.3987μs 10.0605 KOps/s 10.2162 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_indexing[int-tensorclass-eager] 52.6710μs 17.2276μs 58.0466 KOps/s 62.5678 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_compile_indexing[int-pytree-compile] 0.2050ms 96.5931μs 10.3527 KOps/s 10.1721 KOps/s $\color{#35bf28}+1.78\%$
test_compile_indexing[int-pytree-eager] 53.0710μs 16.0894μs 62.1527 KOps/s 62.7430 KOps/s $\color{#d91a1a}-0.94\%$
test_mod_add[eager] 74.8920μs 37.9080μs 26.3797 KOps/s 25.0197 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_mod_add[compile] 0.3844ms 81.2656μs 12.3053 KOps/s 12.0955 KOps/s $\color{#35bf28}+1.74\%$
test_mod_add[compile-overhead] 0.3272ms 0.1692ms 5.9094 KOps/s 5.6476 KOps/s $\color{#35bf28}+4.64\%$
test_mod_wrap[eager] 0.3568ms 0.2524ms 3.9622 KOps/s 3.8070 KOps/s $\color{#35bf28}+4.08\%$
test_mod_wrap[compile] 0.4221ms 0.3008ms 3.3244 KOps/s 3.4163 KOps/s $\color{#d91a1a}-2.69\%$
test_mod_wrap[compile-overhead] 7.4274ms 3.8560ms 259.3381 Ops/s 262.7778 Ops/s $\color{#d91a1a}-1.31\%$
test_mod_wrap_and_backward[eager] 1.6029ms 1.5141ms 660.4627 Ops/s 676.8315 Ops/s $\color{#d91a1a}-2.42\%$
test_mod_wrap_and_backward[compile] 1.5246ms 1.4274ms 700.5638 Ops/s 712.8089 Ops/s $\color{#d91a1a}-1.72\%$
test_mod_wrap_and_backward[compile-overhead] 1.4755ms 1.0047ms 995.3573 Ops/s 968.5917 Ops/s $\color{#35bf28}+2.76\%$
test_seq_add[eager] 0.1630ms 0.1164ms 8.5924 KOps/s 8.1408 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_seq_add[compile] 0.1346ms 88.8012μs 11.2611 KOps/s 10.5511 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_seq_add[compile-overhead] 0.1828ms 0.1299ms 7.6988 KOps/s 7.5934 KOps/s $\color{#35bf28}+1.39\%$
test_seq_wrap[eager] 0.4869ms 0.4237ms 2.3600 KOps/s 2.2661 KOps/s $\color{#35bf28}+4.14\%$
test_seq_wrap[compile] 0.3759ms 0.3034ms 3.2954 KOps/s 3.2382 KOps/s $\color{#35bf28}+1.77\%$
test_seq_wrap[compile-overhead] 0.2838ms 0.2239ms 4.4661 KOps/s 4.3696 KOps/s $\color{#35bf28}+2.21\%$
test_func_call_runtime[False-eager] 0.8024ms 0.7514ms 1.3308 KOps/s 1.3277 KOps/s $\color{#35bf28}+0.23\%$
test_func_call_runtime[False-compile] 0.8130ms 0.7437ms 1.3446 KOps/s 1.3200 KOps/s $\color{#35bf28}+1.86\%$
test_func_call_runtime[False-compile-overhead] 0.4129ms 0.3649ms 2.7407 KOps/s 2.7162 KOps/s $\color{#35bf28}+0.90\%$
test_func_call_runtime[True-eager] 0.9768ms 0.9178ms 1.0896 KOps/s 1.0906 KOps/s $\color{#d91a1a}-0.10\%$
test_func_call_runtime[True-compile] 0.8659ms 0.7800ms 1.2821 KOps/s 1.2865 KOps/s $\color{#d91a1a}-0.34\%$
test_func_call_runtime[True-compile-overhead] 0.5250ms 0.3844ms 2.6017 KOps/s 2.5845 KOps/s $\color{#35bf28}+0.66\%$
test_func_call_cm_runtime[False-eager] 0.8461ms 0.7454ms 1.3415 KOps/s 1.3449 KOps/s $\color{#d91a1a}-0.25\%$
test_func_call_cm_runtime[False-compile] 0.8005ms 0.7565ms 1.3220 KOps/s 1.3148 KOps/s $\color{#35bf28}+0.55\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4386ms 0.3676ms 2.7203 KOps/s 2.7222 KOps/s $\color{#d91a1a}-0.07\%$
test_func_call_cm_runtime[True-eager] 1.1157ms 1.0136ms 986.5453 Ops/s 973.2682 Ops/s $\color{#35bf28}+1.36\%$
test_func_call_cm_runtime[True-compile] 1.1016ms 1.0050ms 995.0644 Ops/s 995.0395 Ops/s $+0.00\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0768ms 1.0033ms 996.7254 Ops/s 988.2776 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_func_call_cm_runtime[eager] 2.5282ms 2.1203ms 471.6352 Ops/s 473.4547 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_func_call_cm_runtime[compile] 0.8735ms 0.8161ms 1.2253 KOps/s 1.2093 KOps/s $\color{#35bf28}+1.33\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4547ms 0.4146ms 2.4120 KOps/s 2.3748 KOps/s $\color{#35bf28}+1.57\%$
test_distributed 3.1668ms 0.1867ms 5.3556 KOps/s 8.3825 KOps/s $\textbf{\color{#d91a1a}-36.11\%}$
test_tdmodule 0.3969ms 20.1480μs 49.6328 KOps/s 46.3040 KOps/s $\textbf{\color{#35bf28}+7.19\%}$
test_tdmodule_dispatch 56.0910μs 34.6251μs 28.8807 KOps/s 26.2127 KOps/s $\textbf{\color{#35bf28}+10.18\%}$
test_tdseq 27.8110μs 19.9936μs 50.0161 KOps/s 46.2918 KOps/s $\textbf{\color{#35bf28}+8.05\%}$
test_tdseq_dispatch 67.4110μs 38.0974μs 26.2485 KOps/s 24.8509 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_instantiation_functorch 1.7202ms 1.6266ms 614.7885 Ops/s 637.2970 Ops/s $\color{#d91a1a}-3.53\%$
test_exec_functorch 0.1953ms 0.1472ms 6.7920 KOps/s 6.8060 KOps/s $\color{#d91a1a}-0.21\%$
test_exec_functional_call 0.1827ms 0.1402ms 7.1305 KOps/s 7.0663 KOps/s $\color{#35bf28}+0.91\%$
test_exec_td_decorator 0.3752ms 0.1912ms 5.2293 KOps/s 5.2085 KOps/s $\color{#35bf28}+0.40\%$
test_vmap_mlp_speed_decorator[True-True] 0.8430ms 0.7056ms 1.4173 KOps/s 1.4469 KOps/s $\color{#d91a1a}-2.05\%$
test_vmap_mlp_speed_decorator[True-False] 0.8094ms 0.6852ms 1.4594 KOps/s 1.4512 KOps/s $\color{#35bf28}+0.57\%$
test_vmap_mlp_speed_decorator[False-True] 0.7161ms 0.6005ms 1.6652 KOps/s 1.6765 KOps/s $\color{#d91a1a}-0.68\%$
test_vmap_mlp_speed_decorator[False-False] 0.7369ms 0.6078ms 1.6453 KOps/s 1.6748 KOps/s $\color{#d91a1a}-1.77\%$
test_vmap_transformer_speed_decorator[True-True] 19.5983ms 19.4726ms 51.3541 Ops/s 52.0957 Ops/s $\color{#d91a1a}-1.42\%$
test_vmap_transformer_speed_decorator[True-False] 19.9331ms 19.4627ms 51.3804 Ops/s 52.0437 Ops/s $\color{#d91a1a}-1.27\%$
test_vmap_transformer_speed_decorator[False-True] 20.1067ms 19.3688ms 51.6293 Ops/s 52.4486 Ops/s $\color{#d91a1a}-1.56\%$
test_vmap_transformer_speed_decorator[False-False] 19.4235ms 19.3152ms 51.7726 Ops/s 52.5416 Ops/s $\color{#d91a1a}-1.46\%$
test_to_module_speed[True] 1.4507ms 0.9747ms 1.0260 KOps/s 1.0286 KOps/s $\color{#d91a1a}-0.26\%$
test_to_module_speed[False] 1.1046ms 0.9493ms 1.0534 KOps/s 1.0529 KOps/s $\color{#35bf28}+0.05\%$
test_tc_init 63.7210μs 34.4924μs 28.9919 KOps/s 28.6459 KOps/s $\color{#35bf28}+1.21\%$
test_tc_init_nested 0.1149ms 69.8295μs 14.3206 KOps/s 13.8096 KOps/s $\color{#35bf28}+3.70\%$
test_tc_first_layer_tensor 19.1010μs 0.8033μs 1.2448 MOps/s 1.4283 MOps/s $\textbf{\color{#d91a1a}-12.84\%}$
test_tc_first_layer_nontensor 24.6610μs 2.2056μs 453.3862 KOps/s 448.3332 KOps/s $\color{#35bf28}+1.13\%$
test_tc_second_layer_tensor 26.3310μs 1.4999μs 666.7122 KOps/s 699.2673 KOps/s $\color{#d91a1a}-4.66\%$
test_tc_second_layer_nontensor 24.1210μs 2.9339μs 340.8459 KOps/s 333.4364 KOps/s $\color{#35bf28}+2.22\%$
test_unbind 7.2074ms 6.9710ms 143.4506 Ops/s 142.4282 Ops/s $\color{#35bf28}+0.72\%$
test_full_like 12.1539ms 9.3104ms 107.4070 Ops/s 107.7254 Ops/s $\color{#d91a1a}-0.30\%$
test_zeros_like 5.9012ms 4.2578ms 234.8613 Ops/s 230.8062 Ops/s $\color{#35bf28}+1.76\%$
test_ones_like 4.4968ms 4.3213ms 231.4142 Ops/s 230.6611 Ops/s $\color{#35bf28}+0.33\%$
test_clone 11.3713ms 9.0947ms 109.9539 Ops/s 156.4913 Ops/s $\textbf{\color{#d91a1a}-29.74\%}$
test_squeeze 47.1410μs 9.7577μs 102.4835 KOps/s 103.0415 KOps/s $\color{#d91a1a}-0.54\%$
test_unsqueeze 0.1137ms 73.9509μs 13.5225 KOps/s 13.1322 KOps/s $\color{#35bf28}+2.97\%$
test_split 0.2099s 0.2211ms 4.5222 KOps/s 6.0173 KOps/s $\textbf{\color{#d91a1a}-24.85\%}$
test_permute 0.2338ms 0.1855ms 5.3915 KOps/s 5.2793 KOps/s $\color{#35bf28}+2.13\%$
test_stack 51.7408ms 50.2445ms 19.9027 Ops/s 19.8132 Ops/s $\color{#35bf28}+0.45\%$
test_cat 50.6412ms 49.8984ms 20.0407 Ops/s 19.8848 Ops/s $\color{#35bf28}+0.78\%$
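
For reading the table, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below assumes that formula; it is inferred from the numbers in the rows above, not taken from the benchmark bot's source. The helper name `relative_change` is hypothetical.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD.

    Assumed formula (inferred from the table, not from the bot's code):
    (ops_pr - ops_head) / ops_head * 100
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from two rows of the table; both columns of a row share a unit.
rows = {
    "test_setitem": (68.6662, 60.6221),    # KOps/s
    "test_distributed": (5.3556, 8.3825),  # KOps/s
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Under that assumption the two rows give +13.27% and -36.11%, which matches the values the table reports. Because the max and mean timings vary widely between runs, single-run swings of this size may reflect CI noise rather than an effect of the change itself.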
