[Feature] as_tensordict_module #1251

Merged
merged 1 commit into from
Mar 5, 2025

@vmoens vmoens commented Mar 5, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 5, 2025
ghstack-source-id: 403a5fe0e4d71b40865c6502c2df59927da879ec
Pull Request resolved: #1251
@facebook-github-bot added the CLA Signed label Mar 5, 2025

vmoens commented Mar 5, 2025

cc @Darktex

@vmoens added the enhancement label Mar 5, 2025
@vmoens vmoens merged commit f50702f into gh/vmoens/48/base Mar 5, 2025
34 checks passed
vmoens added a commit that referenced this pull request Mar 5, 2025
ghstack-source-id: 403a5fe0e4d71b40865c6502c2df59927da879ec
Pull Request resolved: #1251
@vmoens vmoens deleted the gh/vmoens/48/head branch March 5, 2025 05:15
github-actions bot commented Mar 5, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 42.3790μs 20.8198μs 48.0312 KOps/s 49.0613 KOps/s $\color{#d91a1a}-2.10\%$
test_plain_set_stack_nested 48.6210μs 20.9855μs 47.6519 KOps/s 48.3971 KOps/s $\color{#d91a1a}-1.54\%$
test_plain_set_nested_inplace 62.5580μs 22.5795μs 44.2879 KOps/s 44.9180 KOps/s $\color{#d91a1a}-1.40\%$
test_plain_set_stack_nested_inplace 64.0210μs 22.3721μs 44.6986 KOps/s 45.0439 KOps/s $\color{#d91a1a}-0.77\%$
test_items 19.2860μs 4.1810μs 239.1749 KOps/s 240.3092 KOps/s $\color{#d91a1a}-0.47\%$
test_items_nested 0.7219ms 0.4096ms 2.4416 KOps/s 2.4294 KOps/s $\color{#35bf28}+0.51\%$
test_items_nested_locked 0.5274ms 0.4056ms 2.4654 KOps/s 2.4312 KOps/s $\color{#35bf28}+1.41\%$
test_items_nested_leaf 0.1497ms 76.4670μs 13.0775 KOps/s 12.8728 KOps/s $\color{#35bf28}+1.59\%$
test_items_stack_nested 0.8833ms 0.4070ms 2.4567 KOps/s 2.4157 KOps/s $\color{#35bf28}+1.70\%$
test_items_stack_nested_leaf 0.1573ms 76.7183μs 13.0347 KOps/s 12.8048 KOps/s $\color{#35bf28}+1.80\%$
test_items_stack_nested_locked 0.6805ms 0.4069ms 2.4574 KOps/s 2.4170 KOps/s $\color{#35bf28}+1.67\%$
test_keys 26.5400μs 3.4329μs 291.2965 KOps/s 278.4061 KOps/s $\color{#35bf28}+4.63\%$
test_keys_nested 0.2676ms 0.1676ms 5.9656 KOps/s 6.0577 KOps/s $\color{#d91a1a}-1.52\%$
test_keys_nested_locked 1.8287ms 0.1728ms 5.7866 KOps/s 5.8202 KOps/s $\color{#d91a1a}-0.58\%$
test_keys_nested_leaf 0.2423ms 0.1461ms 6.8429 KOps/s 6.9242 KOps/s $\color{#d91a1a}-1.17\%$
test_keys_stack_nested 0.3175ms 0.1686ms 5.9313 KOps/s 6.0386 KOps/s $\color{#d91a1a}-1.78\%$
test_keys_stack_nested_leaf 0.2378ms 0.1464ms 6.8319 KOps/s 6.9013 KOps/s $\color{#d91a1a}-1.01\%$
test_keys_stack_nested_locked 0.2947ms 0.1728ms 5.7855 KOps/s 5.8401 KOps/s $\color{#d91a1a}-0.93\%$
test_values 8.3342μs 1.0604μs 943.0625 KOps/s 968.2519 KOps/s $\color{#d91a1a}-2.60\%$
test_values_nested 0.1137ms 63.8775μs 15.6550 KOps/s 16.0240 KOps/s $\color{#d91a1a}-2.30\%$
test_values_nested_locked 0.1203ms 63.7184μs 15.6941 KOps/s 15.5996 KOps/s $\color{#35bf28}+0.61\%$
test_values_nested_leaf 0.1280ms 73.2681μs 13.6485 KOps/s 13.8740 KOps/s $\color{#d91a1a}-1.63\%$
test_values_stack_nested 0.1447ms 63.4918μs 15.7501 KOps/s 16.0635 KOps/s $\color{#d91a1a}-1.95\%$
test_values_stack_nested_leaf 0.1157ms 72.8718μs 13.7227 KOps/s 13.8373 KOps/s $\color{#d91a1a}-0.83\%$
test_values_stack_nested_locked 0.1123ms 63.4344μs 15.7643 KOps/s 16.0678 KOps/s $\color{#d91a1a}-1.89\%$
test_membership 5.7380μs 0.7103μs 1.4079 MOps/s 1.3835 MOps/s $\color{#35bf28}+1.76\%$
test_membership_nested 43.2110μs 2.9123μs 343.3677 KOps/s 338.0551 KOps/s $\color{#35bf28}+1.57\%$
test_membership_nested_leaf 29.7960μs 2.9144μs 343.1272 KOps/s 344.4264 KOps/s $\color{#d91a1a}-0.38\%$
test_membership_stacked_nested 33.0920μs 2.9194μs 342.5393 KOps/s 345.3221 KOps/s $\color{#d91a1a}-0.81\%$
test_membership_stacked_nested_leaf 23.8250μs 2.8684μs 348.6205 KOps/s 345.4461 KOps/s $\color{#35bf28}+0.92\%$
test_membership_nested_last 45.0450μs 4.3025μs 232.4221 KOps/s 225.3442 KOps/s $\color{#35bf28}+3.14\%$
test_membership_nested_leaf_last 30.0460μs 4.3007μs 232.5184 KOps/s 227.0469 KOps/s $\color{#35bf28}+2.41\%$
test_membership_stacked_nested_last 22.7530μs 4.3316μs 230.8608 KOps/s 230.9583 KOps/s $\color{#d91a1a}-0.04\%$
test_membership_stacked_nested_leaf_last 43.0810μs 4.2868μs 233.2769 KOps/s 233.0552 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getleaf 44.0130μs 10.5575μs 94.7197 KOps/s 93.2355 KOps/s $\color{#35bf28}+1.59\%$
test_nested_get 38.0420μs 10.0399μs 99.6027 KOps/s 98.8938 KOps/s $\color{#35bf28}+0.72\%$
test_stacked_getleaf 67.0360μs 10.4325μs 95.8544 KOps/s 93.0505 KOps/s $\color{#35bf28}+3.01\%$
test_stacked_get 36.2580μs 10.0420μs 99.5817 KOps/s 97.2122 KOps/s $\color{#35bf28}+2.44\%$
test_nested_getitemleaf 30.3470μs 11.1560μs 89.6379 KOps/s 88.0993 KOps/s $\color{#35bf28}+1.75\%$
test_nested_getitem 55.3640μs 10.7687μs 92.8615 KOps/s 93.1746 KOps/s $\color{#d91a1a}-0.34\%$
test_stacked_getitemleaf 56.9470μs 11.0563μs 90.4459 KOps/s 87.4369 KOps/s $\color{#35bf28}+3.44\%$
test_stacked_getitem 43.6020μs 10.5733μs 94.5774 KOps/s 92.5563 KOps/s $\color{#35bf28}+2.18\%$
test_lock_nested 0.5029ms 0.4189ms 2.3870 KOps/s 2.4282 KOps/s $\color{#d91a1a}-1.70\%$
test_lock_stack_nested 0.5393ms 0.4254ms 2.3506 KOps/s 2.3346 KOps/s $\color{#35bf28}+0.68\%$
test_unlock_nested 0.6453ms 0.3433ms 2.9133 KOps/s 2.9626 KOps/s $\color{#d91a1a}-1.66\%$
test_unlock_stack_nested 0.7599ms 0.3479ms 2.8744 KOps/s 2.9072 KOps/s $\color{#d91a1a}-1.13\%$
test_flatten_speed 0.2024ms 99.8813μs 10.0119 KOps/s 9.8281 KOps/s $\color{#35bf28}+1.87\%$
test_unflatten_speed 0.7216ms 0.5217ms 1.9169 KOps/s 1.9119 KOps/s $\color{#35bf28}+0.26\%$
test_common_ops 4.5445ms 0.8168ms 1.2243 KOps/s 1.2305 KOps/s $\color{#d91a1a}-0.50\%$
test_creation 38.3430μs 2.5100μs 398.4093 KOps/s 398.9303 KOps/s $\color{#d91a1a}-0.13\%$
test_creation_empty 73.5680μs 11.6438μs 85.8828 KOps/s 86.7272 KOps/s $\color{#d91a1a}-0.97\%$
test_creation_nested_1 52.0890μs 14.4942μs 68.9930 KOps/s 68.4961 KOps/s $\color{#35bf28}+0.73\%$
test_creation_nested_2 54.1310μs 19.1193μs 52.3031 KOps/s 52.6584 KOps/s $\color{#d91a1a}-0.67\%$
test_clone 60.6130μs 13.4954μs 74.0994 KOps/s 74.3512 KOps/s $\color{#d91a1a}-0.34\%$
test_getitem[int] 0.9517ms 12.8783μs 77.6497 KOps/s 79.0296 KOps/s $\color{#d91a1a}-1.75\%$
test_getitem[slice_int] 0.1353ms 24.8564μs 40.2311 KOps/s 40.7985 KOps/s $\color{#d91a1a}-1.39\%$
test_getitem[range] 0.1607ms 51.6307μs 19.3683 KOps/s 19.4012 KOps/s $\color{#d91a1a}-0.17\%$
test_getitem[tuple] 0.1276ms 20.2473μs 49.3893 KOps/s 50.3430 KOps/s $\color{#d91a1a}-1.89\%$
test_getitem[list] 0.1601ms 46.6489μs 21.4367 KOps/s 21.5288 KOps/s $\color{#d91a1a}-0.43\%$
test_setitem_dim[int] 59.3710μs 26.1430μs 38.2512 KOps/s 39.1798 KOps/s $\color{#d91a1a}-2.37\%$
test_setitem_dim[slice_int] 83.5470μs 51.2977μs 19.4940 KOps/s 20.0348 KOps/s $\color{#d91a1a}-2.70\%$
test_setitem_dim[range] 0.1733ms 77.4545μs 12.9108 KOps/s 13.0424 KOps/s $\color{#d91a1a}-1.01\%$
test_setitem_dim[tuple] 75.0710μs 41.3883μs 24.1614 KOps/s 24.6102 KOps/s $\color{#d91a1a}-1.82\%$
test_setitem 79.4290μs 20.7480μs 48.1975 KOps/s 49.4758 KOps/s $\color{#d91a1a}-2.58\%$
test_set 65.0220μs 20.1018μs 49.7469 KOps/s 50.4027 KOps/s $\color{#d91a1a}-1.30\%$
test_set_shared 5.7429ms 0.1842ms 5.4295 KOps/s 5.3138 KOps/s $\color{#35bf28}+2.18\%$
test_update 0.1166ms 26.5332μs 37.6887 KOps/s 38.0456 KOps/s $\color{#d91a1a}-0.94\%$
test_update_nested 0.1055ms 42.0214μs 23.7974 KOps/s 24.0669 KOps/s $\color{#d91a1a}-1.12\%$
test_update__nested 0.4675ms 34.1447μs 29.2872 KOps/s 28.9518 KOps/s $\color{#35bf28}+1.16\%$
test_set_nested 72.6370μs 22.2328μs 44.9786 KOps/s 45.7970 KOps/s $\color{#d91a1a}-1.79\%$
test_set_nested_new 66.2550μs 26.8929μs 37.1846 KOps/s 37.7412 KOps/s $\color{#d91a1a}-1.47\%$
test_select 0.1123ms 43.6941μs 22.8864 KOps/s 23.5531 KOps/s $\color{#d91a1a}-2.83\%$
test_select_nested 0.1333ms 62.8998μs 15.8983 KOps/s 16.0267 KOps/s $\color{#d91a1a}-0.80\%$
test_exclude_nested 0.1682ms 80.0823μs 12.4872 KOps/s 12.3464 KOps/s $\color{#35bf28}+1.14\%$
test_empty[True] 0.5420ms 0.4065ms 2.4602 KOps/s 2.4285 KOps/s $\color{#35bf28}+1.31\%$
test_empty[False] 10.9305μs 1.3799μs 724.6938 KOps/s 734.9685 KOps/s $\color{#d91a1a}-1.40\%$
test_unbind_speed 0.4278ms 0.2726ms 3.6687 KOps/s 3.6734 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_speed_stack0 0.3957ms 0.2703ms 3.6990 KOps/s 3.7476 KOps/s $\color{#d91a1a}-1.30\%$
test_unbind_speed_stack1 97.8840ms 0.7379ms 1.3552 KOps/s 1.2454 KOps/s $\textbf{\color{#35bf28}+8.82\%}$
test_split 0.1037s 1.7638ms 566.9441 Ops/s 576.1274 Ops/s $\color{#d91a1a}-1.59\%$
test_chunk 0.1044s 1.7629ms 567.2404 Ops/s 632.1457 Ops/s $\textbf{\color{#d91a1a}-10.27\%}$
test_consolidate_njt[False-None] 11.4246ms 8.1997ms 121.9555 Ops/s 121.4303 Ops/s $\color{#35bf28}+0.43\%$
test_creation[device0] 0.2318ms 90.5973μs 11.0379 KOps/s 10.9268 KOps/s $\color{#35bf28}+1.02\%$
test_creation_from_tensor 4.4688ms 93.6483μs 10.6783 KOps/s 10.4455 KOps/s $\color{#35bf28}+2.23\%$
test_add_one[memmap_tensor0] 73.1470μs 4.8385μs 206.6760 KOps/s 202.9703 KOps/s $\color{#35bf28}+1.83\%$
test_contiguous[memmap_tensor0] 15.1080μs 0.5094μs 1.9630 MOps/s 1.9321 MOps/s $\color{#35bf28}+1.60\%$
test_stack[memmap_tensor0] 20.4980μs 3.3646μs 297.2117 KOps/s 299.0890 KOps/s $\color{#d91a1a}-0.63\%$
test_memmaptd_index 1.1375ms 0.2267ms 4.4104 KOps/s 4.3073 KOps/s $\color{#35bf28}+2.39\%$
test_memmaptd_index_astensor 0.4607ms 0.3131ms 3.1935 KOps/s 3.1173 KOps/s $\color{#35bf28}+2.45\%$
test_memmaptd_index_op 1.0852ms 0.5742ms 1.7415 KOps/s 1.6996 KOps/s $\color{#35bf28}+2.46\%$
test_serialize_model 0.2120s 0.1297s 7.7091 Ops/s 8.2569 Ops/s $\textbf{\color{#d91a1a}-6.63\%}$
test_serialize_model_pickle 0.4616s 0.3942s 2.5368 Ops/s 2.4811 Ops/s $\color{#35bf28}+2.25\%$
test_serialize_weights 0.1186s 0.1114s 8.9792 Ops/s 8.6337 Ops/s $\color{#35bf28}+4.00\%$
test_serialize_weights_returnearly 0.1839s 0.1625s 6.1549 Ops/s 6.5698 Ops/s $\textbf{\color{#d91a1a}-6.32\%}$
test_serialize_weights_pickle 0.5303s 0.4724s 2.1167 Ops/s 1.1032 Ops/s $\textbf{\color{#35bf28}+91.87\%}$
test_serialize_weights_filesystem 0.1509s 0.1416s 7.0617 Ops/s 6.3516 Ops/s $\textbf{\color{#35bf28}+11.18\%}$
test_serialize_model_filesystem 0.2478s 0.1636s 6.1121 Ops/s 6.9891 Ops/s $\textbf{\color{#d91a1a}-12.55\%}$
test_reshape_pytree 63.9110μs 27.4025μs 36.4930 KOps/s 37.6231 KOps/s $\color{#d91a1a}-3.00\%$
test_reshape_td 73.5680μs 33.6546μs 29.7137 KOps/s 29.3684 KOps/s $\color{#35bf28}+1.18\%$
test_view_pytree 59.2010μs 27.0343μs 36.9900 KOps/s 37.8824 KOps/s $\color{#d91a1a}-2.36\%$
test_view_td 84.1980μs 41.0428μs 24.3648 KOps/s 24.5428 KOps/s $\color{#d91a1a}-0.73\%$
test_unbind_pytree 78.4980μs 29.7228μs 33.6442 KOps/s 33.5966 KOps/s $\color{#35bf28}+0.14\%$
test_unbind_td 0.3597ms 40.3738μs 24.7685 KOps/s 25.1517 KOps/s $\color{#d91a1a}-1.52\%$
test_split_pytree 67.0560μs 29.5418μs 33.8503 KOps/s 34.4191 KOps/s $\color{#d91a1a}-1.65\%$
test_split_td 0.2015ms 45.8814μs 21.7953 KOps/s 21.7775 KOps/s $\color{#35bf28}+0.08\%$
test_add_pytree 96.0300μs 37.3050μs 26.8061 KOps/s 27.8433 KOps/s $\color{#d91a1a}-3.73\%$
test_add_td 0.2739ms 58.6709μs 17.0442 KOps/s 17.5874 KOps/s $\color{#d91a1a}-3.09\%$
test_compile_add_one_nested[tensordict-compile] 0.1530ms 69.9746μs 14.2909 KOps/s 14.2598 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_one_nested[tensordict-eager] 0.3826ms 0.1750ms 5.7148 KOps/s 5.7594 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_add_one_nested[pytree-compile] 0.1213ms 46.1720μs 21.6581 KOps/s 21.7780 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_add_one_nested[pytree-eager] 0.2983ms 0.1217ms 8.2185 KOps/s 8.3065 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_copy_nested[tensordict-compile] 70.7630μs 29.0747μs 34.3941 KOps/s 35.4868 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_copy_nested[tensordict-eager] 0.1138ms 59.3591μs 16.8466 KOps/s 17.1961 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_copy_nested[pytree-compile] 0.1887ms 80.7808μs 12.3792 KOps/s 12.5633 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_copy_nested[pytree-eager] 0.1273ms 68.0432μs 14.6966 KOps/s 14.8903 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_add_one_flat[tensordict-compile] 0.2314ms 0.1093ms 9.1488 KOps/s 9.3402 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_one_flat[tensordict-eager] 0.3457ms 0.2176ms 4.5949 KOps/s 4.6297 KOps/s $\color{#d91a1a}-0.75\%$
test_compile_add_one_flat[tensorclass-compile] 0.1228ms 49.3895μs 20.2472 KOps/s 21.0201 KOps/s $\color{#d91a1a}-3.68\%$
test_compile_add_one_flat[tensorclass-eager] 0.1380ms 67.7745μs 14.7548 KOps/s 14.5109 KOps/s $\color{#35bf28}+1.68\%$
test_compile_add_one_flat[pytree-compile] 0.1847ms 0.1014ms 9.8663 KOps/s 9.8710 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_add_one_flat[pytree-eager] 0.3215ms 0.2062ms 4.8506 KOps/s 4.9173 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_add_self_flat[tensordict-eager] 0.8012ms 0.2351ms 4.2531 KOps/s 4.2976 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_add_self_flat[tensordict-compile] 0.2265ms 0.1166ms 8.5755 KOps/s 9.3758 KOps/s $\textbf{\color{#d91a1a}-8.54\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1502ms 64.9731μs 15.3910 KOps/s 15.8225 KOps/s $\color{#d91a1a}-2.73\%$
test_compile_add_self_flat[tensorclass-compile] 0.1247ms 50.1825μs 19.9273 KOps/s 20.8333 KOps/s $\color{#d91a1a}-4.35\%$
test_compile_add_self_flat[pytree-eager] 0.2552ms 0.1607ms 6.2244 KOps/s 6.3245 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_add_self_flat[pytree-compile] 0.1743ms 0.1022ms 9.7880 KOps/s 9.8384 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_copy_flat[tensordict-compile] 63.5290μs 21.8205μs 45.8284 KOps/s 46.0904 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_flat[tensordict-eager] 0.1316ms 66.0772μs 15.1338 KOps/s 15.0253 KOps/s $\color{#35bf28}+0.72\%$
test_compile_copy_flat[pytree-compile] 0.2065ms 87.5091μs 11.4274 KOps/s 11.5762 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_copy_flat[pytree-eager] 0.1343ms 71.8640μs 13.9152 KOps/s 14.8656 KOps/s $\textbf{\color{#d91a1a}-6.39\%}$
test_compile_assign_and_add[tensordict-compile] 0.3055ms 0.2173ms 4.6023 KOps/s 4.6340 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_assign_and_add[tensordict-eager] 1.6369ms 1.3881ms 720.3936 Ops/s 728.9236 Ops/s $\color{#d91a1a}-1.17\%$
test_compile_assign_and_add[pytree-compile] 0.2899ms 0.2123ms 4.7105 KOps/s 4.7453 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_assign_and_add[pytree-eager] 1.0134ms 0.8347ms 1.1981 KOps/s 1.2157 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_assign_and_add_stack[compile] 0.5765ms 0.4578ms 2.1845 KOps/s 2.1819 KOps/s $\color{#35bf28}+0.12\%$
test_compile_assign_and_add_stack[eager] 4.5282ms 2.7003ms 370.3227 Ops/s 364.6304 Ops/s $\color{#35bf28}+1.56\%$
test_compile_indexing[tensor-tensordict-compile] 95.4990μs 40.4745μs 24.7069 KOps/s 26.0375 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_compile_indexing[tensor-tensordict-eager] 0.6070ms 33.5641μs 29.7937 KOps/s 30.3834 KOps/s $\color{#d91a1a}-1.94\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1122ms 31.7266μs 31.5193 KOps/s 32.3137 KOps/s $\color{#d91a1a}-2.46\%$
test_compile_indexing[tensor-tensorclass-eager] 64.4310μs 23.1473μs 43.2016 KOps/s 43.6877 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_indexing[tensor-pytree-compile] 75.1010μs 32.0599μs 31.1916 KOps/s 31.2119 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_indexing[tensor-pytree-eager] 75.0110μs 23.2937μs 42.9301 KOps/s 42.9650 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[slice-tensordict-compile] 0.1147ms 53.9793μs 18.5256 KOps/s 18.8609 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_indexing[slice-tensordict-eager] 0.4027ms 20.4224μs 48.9658 KOps/s 48.9926 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[slice-tensorclass-compile] 98.3050μs 45.9262μs 21.7741 KOps/s 21.5214 KOps/s $\color{#35bf28}+1.17\%$
test_compile_indexing[slice-tensorclass-eager] 68.7690μs 18.9865μs 52.6689 KOps/s 53.3224 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[slice-pytree-compile] 96.0210μs 45.9351μs 21.7698 KOps/s 21.1281 KOps/s $\color{#35bf28}+3.04\%$
test_compile_indexing[slice-pytree-eager] 86.8340μs 18.7820μs 53.2424 KOps/s 53.3678 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_indexing[int-tensordict-compile] 0.1275ms 55.4872μs 18.0222 KOps/s 18.1411 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_indexing[int-tensordict-eager] 1.0475ms 20.3501μs 49.1399 KOps/s 49.2289 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[int-tensorclass-compile] 98.5150μs 46.6892μs 21.4182 KOps/s 21.1524 KOps/s $\color{#35bf28}+1.26\%$
test_compile_indexing[int-tensorclass-eager] 54.4120μs 19.0312μs 52.5453 KOps/s 53.4570 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[int-pytree-compile] 0.1038ms 46.3698μs 21.5658 KOps/s 21.1860 KOps/s $\color{#35bf28}+1.79\%$
test_compile_indexing[int-pytree-eager] 57.8190μs 19.0913μs 52.3799 KOps/s 53.7745 KOps/s $\color{#d91a1a}-2.59\%$
test_mod_add[eager] 81.6630μs 36.8916μs 27.1065 KOps/s 26.6944 KOps/s $\color{#35bf28}+1.54\%$
test_mod_add[compile] 0.1202ms 66.3902μs 15.0625 KOps/s 15.1598 KOps/s $\color{#d91a1a}-0.64\%$
test_mod_add[compile-overhead] 0.1320ms 65.5845μs 15.2475 KOps/s 15.0375 KOps/s $\color{#35bf28}+1.40\%$
test_mod_wrap[eager] 0.4210ms 0.2175ms 4.5973 KOps/s 4.4467 KOps/s $\color{#35bf28}+3.39\%$
test_mod_wrap[compile] 1.9987ms 0.2285ms 4.3757 KOps/s 4.3625 KOps/s $\color{#35bf28}+0.30\%$
test_mod_wrap[compile-overhead] 0.4394ms 0.2243ms 4.4585 KOps/s 4.3591 KOps/s $\color{#35bf28}+2.28\%$
test_mod_wrap_and_backward[eager] 12.1795ms 10.7036ms 93.4269 Ops/s 81.4091 Ops/s $\textbf{\color{#35bf28}+14.76\%}$
test_mod_wrap_and_backward[compile] 12.2484ms 10.6486ms 93.9095 Ops/s 73.6011 Ops/s $\textbf{\color{#35bf28}+27.59\%}$
test_mod_wrap_and_backward[compile-overhead] 12.0009ms 10.7525ms 93.0016 Ops/s 88.1780 Ops/s $\textbf{\color{#35bf28}+5.47\%}$
test_seq_add[eager] 0.2580ms 0.1210ms 8.2636 KOps/s 8.2895 KOps/s $\color{#d91a1a}-0.31\%$
test_seq_add[compile] 0.1861ms 79.7859μs 12.5335 KOps/s 12.6660 KOps/s $\color{#d91a1a}-1.05\%$
test_seq_add[compile-overhead] 0.1716ms 77.5841μs 12.8892 KOps/s 13.1632 KOps/s $\color{#d91a1a}-2.08\%$
test_seq_wrap[eager] 0.7544ms 0.4534ms 2.2055 KOps/s 2.2189 KOps/s $\color{#d91a1a}-0.60\%$
test_seq_wrap[compile] 0.4411ms 0.2471ms 4.0476 KOps/s 4.1517 KOps/s $\color{#d91a1a}-2.51\%$
test_seq_wrap[compile-overhead] 0.3637ms 0.2437ms 4.1028 KOps/s 4.1394 KOps/s $\color{#d91a1a}-0.89\%$
test_func_call_runtime[False-eager] 0.9748ms 0.5398ms 1.8525 KOps/s 1.8817 KOps/s $\color{#d91a1a}-1.55\%$
test_func_call_runtime[False-compile] 0.5917ms 0.4433ms 2.2556 KOps/s 2.3071 KOps/s $\color{#d91a1a}-2.23\%$
test_func_call_runtime[False-compile-overhead] 0.7765ms 0.4449ms 2.2477 KOps/s 2.2892 KOps/s $\color{#d91a1a}-1.81\%$
test_func_call_runtime[True-eager] 0.9686ms 0.7547ms 1.3251 KOps/s 1.3386 KOps/s $\color{#d91a1a}-1.01\%$
test_func_call_runtime[True-compile] 0.5615ms 0.4634ms 2.1579 KOps/s 2.1917 KOps/s $\color{#d91a1a}-1.54\%$
test_func_call_runtime[True-compile-overhead] 0.6107ms 0.4657ms 2.1471 KOps/s 2.1939 KOps/s $\color{#d91a1a}-2.13\%$
test_func_call_cm_runtime[False-eager] 0.7310ms 0.5412ms 1.8476 KOps/s 1.8980 KOps/s $\color{#d91a1a}-2.66\%$
test_func_call_cm_runtime[False-compile] 0.7885ms 0.4415ms 2.2650 KOps/s 2.2963 KOps/s $\color{#d91a1a}-1.36\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8070ms 0.4427ms 2.2590 KOps/s 2.2818 KOps/s $\color{#d91a1a}-1.00\%$
test_func_call_cm_runtime[True-eager] 1.4799ms 0.9108ms 1.0979 KOps/s 1.1133 KOps/s $\color{#d91a1a}-1.38\%$
test_func_call_cm_runtime[True-compile] 1.1266ms 0.8038ms 1.2441 KOps/s 1.2470 KOps/s $\color{#d91a1a}-0.24\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9863ms 0.8041ms 1.2437 KOps/s 1.2454 KOps/s $\color{#d91a1a}-0.14\%$
test_vmap_func_call_cm_runtime[eager] 2.4973ms 1.9058ms 524.7040 Ops/s 522.3619 Ops/s $\color{#35bf28}+0.45\%$
test_vmap_func_call_cm_runtime[compile] 0.9860ms 0.5395ms 1.8537 KOps/s 1.8478 KOps/s $\color{#35bf28}+0.32\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7350ms 0.5360ms 1.8658 KOps/s 1.8508 KOps/s $\color{#35bf28}+0.81\%$
test_distributed 0.2674ms 0.1242ms 8.0539 KOps/s 7.7565 KOps/s $\color{#35bf28}+3.83\%$
test_tdmodule 82.9860μs 28.0472μs 35.6542 KOps/s 35.4075 KOps/s $\color{#35bf28}+0.70\%$
test_tdmodule_dispatch 0.1250ms 50.4308μs 19.8291 KOps/s 19.6491 KOps/s $\color{#35bf28}+0.92\%$
test_tdseq 53.3710μs 29.9053μs 33.4389 KOps/s 33.3241 KOps/s $\color{#35bf28}+0.34\%$
test_tdseq_dispatch 93.1550μs 54.9752μs 18.1900 KOps/s 17.4974 KOps/s $\color{#35bf28}+3.96\%$
test_instantiation_functorch 1.7514ms 1.5428ms 648.1594 Ops/s 644.0340 Ops/s $\color{#35bf28}+0.64\%$
test_exec_functorch 0.3326ms 0.1776ms 5.6313 KOps/s 5.6270 KOps/s $\color{#35bf28}+0.08\%$
test_exec_functional_call 0.3507ms 0.1732ms 5.7750 KOps/s 5.7487 KOps/s $\color{#35bf28}+0.46\%$
test_exec_td_decorator 0.5039ms 0.2358ms 4.2417 KOps/s 4.0663 KOps/s $\color{#35bf28}+4.31\%$
test_vmap_mlp_speed_decorator[True-True] 1.1340ms 0.6533ms 1.5308 KOps/s 1.4945 KOps/s $\color{#35bf28}+2.43\%$
test_vmap_mlp_speed_decorator[True-False] 0.9633ms 0.6531ms 1.5312 KOps/s 1.4926 KOps/s $\color{#35bf28}+2.58\%$
test_vmap_mlp_speed_decorator[False-True] 0.8627ms 0.5268ms 1.8982 KOps/s 1.8464 KOps/s $\color{#35bf28}+2.81\%$
test_vmap_mlp_speed_decorator[False-False] 0.8012ms 0.5258ms 1.9019 KOps/s 1.8520 KOps/s $\color{#35bf28}+2.69\%$
test_to_module_speed[True] 2.2113ms 1.3580ms 736.3529 Ops/s 751.2175 Ops/s $\color{#d91a1a}-1.98\%$
test_to_module_speed[False] 2.1309ms 1.3102ms 763.2395 Ops/s 768.2586 Ops/s $\color{#d91a1a}-0.65\%$
test_tc_init 84.2080μs 45.6023μs 21.9287 KOps/s 21.1934 KOps/s $\color{#35bf28}+3.47\%$
test_tc_init_nested 0.1529ms 91.9610μs 10.8742 KOps/s 10.8031 KOps/s $\color{#35bf28}+0.66\%$
test_tc_first_layer_tensor 23.7840μs 1.6068μs 622.3584 KOps/s 648.5649 KOps/s $\color{#d91a1a}-4.04\%$
test_tc_first_layer_nontensor 44.5240μs 4.7157μs 212.0564 KOps/s 213.7545 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_second_layer_tensor 23.0040μs 2.8286μs 353.5286 KOps/s 350.4411 KOps/s $\color{#35bf28}+0.88\%$
test_tc_second_layer_nontensor 27.2310μs 6.1689μs 162.1024 KOps/s 164.2549 KOps/s $\color{#d91a1a}-1.31\%$
test_unbind 0.2238s 15.1539ms 65.9895 Ops/s 66.8702 Ops/s $\color{#d91a1a}-1.32\%$
test_full_like 9.4450ms 7.5359ms 132.6978 Ops/s 144.1254 Ops/s $\textbf{\color{#d91a1a}-7.93\%}$
test_zeros_like 5.4726ms 4.4099ms 226.7647 Ops/s 368.7062 Ops/s $\textbf{\color{#d91a1a}-38.50\%}$
test_ones_like 3.7799ms 3.2317ms 309.4317 Ops/s 318.9434 Ops/s $\color{#d91a1a}-2.98\%$
test_clone 5.4082ms 4.8456ms 206.3730 Ops/s 206.6602 Ops/s $\color{#d91a1a}-0.14\%$
test_squeeze 59.7220μs 12.5072μs 79.9541 KOps/s 78.5400 KOps/s $\color{#35bf28}+1.80\%$
test_unsqueeze 0.2832ms 94.0433μs 10.6334 KOps/s 10.3080 KOps/s $\color{#35bf28}+3.16\%$
test_split 0.3381ms 0.1935ms 5.1687 KOps/s 5.1011 KOps/s $\color{#35bf28}+1.33\%$
test_permute 0.2805ms 0.1998ms 5.0055 KOps/s 4.7842 KOps/s $\color{#35bf28}+4.62\%$
test_stack 31.3010ms 25.1849ms 39.7063 Ops/s 41.8659 Ops/s $\textbf{\color{#d91a1a}-5.16\%}$
test_cat 30.4385ms 25.0856ms 39.8636 Ops/s 41.6260 Ops/s $\color{#d91a1a}-4.23\%$
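
The Change column appears to be the relative difference between the Ops column (this PR) and Ops on Repo HEAD. The sketch below reproduces that arithmetic for three rows from the table above. The function names and the 5% significance threshold are assumptions: the threshold is inferred from which rows the bot prints in bold (for example, -5.11% is bold while -4.23% is not), and the bot does not state it anywhere in this comment.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Both values must be in the same unit (Ops/s, KOps/s, ...), which holds
    within a single table row.
    """
    return (ops / ops_head - 1.0) * 100.0


# Assumed threshold, inferred from the bold rows in the table (not documented).
SIGNIFICANT_PCT = 5.0


def classify(ops: float, ops_head: float, threshold: float = SIGNIFICANT_PCT) -> str:
    """Label a benchmark as improved, worsened, or unchanged."""
    change = percent_change(ops, ops_head)
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from the CPU table above.
rows = {
    "test_chunk": (567.2404, 632.1457),
    "test_serialize_weights_pickle": (2.1167, 1.1032),
    "test_cat": (39.8636, 41.6260),
}

for name, (ops, head) in rows.items():
    print(f"{name}: {percent_change(ops, head):+.2f}% ({classify(ops, head)})")
```

Under this reading, the "Improved: 6" and "Worsened: 10" totals in the CPU summary match the number of bold rows in each direction, so most of the small red and green entries are likely within run-to-run noise.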

github-actions bot commented Mar 5, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}14$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 38.4310μs 11.6411μs 85.9025 KOps/s 76.7682 KOps/s $\textbf{\color{#35bf28}+11.90\%}$
test_plain_set_stack_nested 37.1800μs 11.6480μs 85.8513 KOps/s 76.3308 KOps/s $\textbf{\color{#35bf28}+12.47\%}$
test_plain_set_nested_inplace 88.2210μs 12.6238μs 79.2157 KOps/s 70.6369 KOps/s $\textbf{\color{#35bf28}+12.14\%}$
test_plain_set_stack_nested_inplace 40.9700μs 12.5432μs 79.7247 KOps/s 71.9049 KOps/s $\textbf{\color{#35bf28}+10.88\%}$
test_items 28.9510μs 2.8633μs 349.2413 KOps/s 345.0211 KOps/s $\color{#35bf28}+1.22\%$
test_items_nested 0.3862ms 0.3602ms 2.7762 KOps/s 2.7468 KOps/s $\color{#35bf28}+1.07\%$
test_items_nested_locked 0.5564ms 0.3654ms 2.7364 KOps/s 2.7331 KOps/s $\color{#35bf28}+0.12\%$
test_items_nested_leaf 87.3210μs 60.4825μs 16.5337 KOps/s 16.5600 KOps/s $\color{#d91a1a}-0.16\%$
test_items_stack_nested 0.4067ms 0.3574ms 2.7982 KOps/s 2.7588 KOps/s $\color{#35bf28}+1.43\%$
test_items_stack_nested_leaf 98.8720μs 60.4187μs 16.5512 KOps/s 16.5080 KOps/s $\color{#35bf28}+0.26\%$
test_items_stack_nested_locked 0.4112ms 0.3572ms 2.7995 KOps/s 2.7357 KOps/s $\color{#35bf28}+2.33\%$
test_keys 31.6000μs 3.4103μs 293.2282 KOps/s 294.1511 KOps/s $\color{#d91a1a}-0.31\%$
test_keys_nested 0.1608ms 87.7121μs 11.4009 KOps/s 11.3636 KOps/s $\color{#35bf28}+0.33\%$
test_keys_nested_locked 0.8647ms 92.6874μs 10.7890 KOps/s 10.6624 KOps/s $\color{#35bf28}+1.19\%$
test_keys_nested_leaf 0.1163ms 78.4087μs 12.7537 KOps/s 12.6300 KOps/s $\color{#35bf28}+0.98\%$
test_keys_stack_nested 0.1172ms 86.8270μs 11.5172 KOps/s 11.3543 KOps/s $\color{#35bf28}+1.43\%$
test_keys_stack_nested_leaf 0.1092ms 78.3908μs 12.7566 KOps/s 12.6660 KOps/s $\color{#35bf28}+0.71\%$
test_keys_stack_nested_locked 0.2669ms 92.5341μs 10.8068 KOps/s 10.6287 KOps/s $\color{#35bf28}+1.68\%$
test_values 28.5637μs 0.8529μs 1.1724 MOps/s 1.1221 MOps/s $\color{#35bf28}+4.49\%$
test_values_nested 68.8700μs 36.9496μs 27.0639 KOps/s 26.7420 KOps/s $\color{#35bf28}+1.20\%$
test_values_nested_locked 0.2247ms 39.0545μs 25.6053 KOps/s 25.4741 KOps/s $\color{#35bf28}+0.51\%$
test_values_nested_leaf 68.2910μs 41.8297μs 23.9065 KOps/s 23.3651 KOps/s $\color{#35bf28}+2.32\%$
test_values_stack_nested 62.3010μs 37.0176μs 27.0142 KOps/s 26.6454 KOps/s $\color{#35bf28}+1.38\%$
test_values_stack_nested_leaf 78.4520μs 42.2917μs 23.6453 KOps/s 23.5485 KOps/s $\color{#35bf28}+0.41\%$
test_values_stack_nested_locked 63.6110μs 38.9715μs 25.6598 KOps/s 25.5704 KOps/s $\color{#35bf28}+0.35\%$
test_membership 1.6775μs 0.5034μs 1.9864 MOps/s 1.9955 MOps/s $\color{#d91a1a}-0.46\%$
test_membership_nested 16.4455μs 1.9658μs 508.6982 KOps/s 519.4168 KOps/s $\color{#d91a1a}-2.06\%$
test_membership_nested_leaf 15.9255μs 2.0273μs 493.2651 KOps/s 513.0461 KOps/s $\color{#d91a1a}-3.86\%$
test_membership_stacked_nested 0.1206ms 2.0401μs 490.1807 KOps/s 487.0673 KOps/s $\color{#35bf28}+0.64\%$
test_membership_stacked_nested_leaf 32.6110μs 2.0415μs 489.8368 KOps/s 501.9906 KOps/s $\color{#d91a1a}-2.42\%$
test_membership_nested_last 31.4700μs 2.9572μs 338.1575 KOps/s 333.2827 KOps/s $\color{#35bf28}+1.46\%$
test_membership_nested_leaf_last 29.5600μs 2.9777μs 335.8270 KOps/s 334.2296 KOps/s $\color{#35bf28}+0.48\%$
test_membership_stacked_nested_last 56.2210μs 2.9967μs 333.6958 KOps/s 333.3653 KOps/s $\color{#35bf28}+0.10\%$
test_membership_stacked_nested_leaf_last 58.0310μs 2.9340μs 340.8352 KOps/s 337.6922 KOps/s $\color{#35bf28}+0.93\%$
test_nested_getleaf 34.6210μs 6.2122μs 160.9742 KOps/s 161.5307 KOps/s $\color{#d91a1a}-0.34\%$
test_nested_get 34.3010μs 5.8699μs 170.3619 KOps/s 168.1903 KOps/s $\color{#35bf28}+1.29\%$
test_stacked_getleaf 45.0810μs 6.1187μs 163.4331 KOps/s 164.1820 KOps/s $\color{#d91a1a}-0.46\%$
test_stacked_get 31.7300μs 5.7674μs 173.3889 KOps/s 173.0973 KOps/s $\color{#35bf28}+0.17\%$
test_nested_getitemleaf 31.3900μs 6.3591μs 157.2560 KOps/s 157.9735 KOps/s $\color{#d91a1a}-0.45\%$
test_nested_getitem 28.5700μs 6.0250μs 165.9741 KOps/s 164.9380 KOps/s $\color{#35bf28}+0.63\%$
test_stacked_getitemleaf 40.9600μs 6.2739μs 159.3898 KOps/s 157.6746 KOps/s $\color{#35bf28}+1.09\%$
test_stacked_getitem 37.1800μs 5.8964μs 169.5950 KOps/s 167.8031 KOps/s $\color{#35bf28}+1.07\%$
test_lock_nested 9.2536ms 0.3475ms 2.8774 KOps/s 2.9931 KOps/s $\color{#d91a1a}-3.86\%$
test_lock_stack_nested 0.4122ms 0.3425ms 2.9194 KOps/s 2.8897 KOps/s $\color{#35bf28}+1.03\%$
test_unlock_nested 0.3811ms 0.2810ms 3.5582 KOps/s 3.5901 KOps/s $\color{#d91a1a}-0.89\%$
test_unlock_stack_nested 0.3783ms 0.2809ms 3.5603 KOps/s 3.5693 KOps/s $\color{#d91a1a}-0.25\%$
test_flatten_speed 0.1125ms 76.6332μs 13.0492 KOps/s 12.8236 KOps/s $\color{#35bf28}+1.76\%$
test_unflatten_speed 0.3621ms 0.3178ms 3.1463 KOps/s 3.0909 KOps/s $\color{#35bf28}+1.79\%$
test_common_ops 0.8018ms 0.5955ms 1.6792 KOps/s 1.5730 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_creation 98.9610μs 1.6805μs 595.0614 KOps/s 589.9554 KOps/s $\color{#35bf28}+0.87\%$
test_creation_empty 30.1010μs 6.3442μs 157.6242 KOps/s 106.2976 KOps/s $\textbf{\color{#35bf28}+48.29\%}$
test_creation_nested_1 37.2100μs 7.9826μs 125.2729 KOps/s 91.0539 KOps/s $\textbf{\color{#35bf28}+37.58\%}$
test_creation_nested_2 47.6110μs 10.6483μs 93.9113 KOps/s 74.5617 KOps/s $\textbf{\color{#35bf28}+25.95\%}$
test_clone 0.1587ms 11.7752μs 84.9241 KOps/s 94.6448 KOps/s $\textbf{\color{#d91a1a}-10.27\%}$
test_getitem[int] 1.2685ms 10.5738μs 94.5734 KOps/s 96.5272 KOps/s $\color{#d91a1a}-2.02\%$
test_getitem[slice_int] 0.1201ms 20.0526μs 49.8690 KOps/s 49.5686 KOps/s $\color{#35bf28}+0.61\%$
test_getitem[range] 0.2186ms 37.8803μs 26.3989 KOps/s 27.2188 KOps/s $\color{#d91a1a}-3.01\%$
test_getitem[tuple] 0.1096ms 17.2672μs 57.9132 KOps/s 56.0649 KOps/s $\color{#35bf28}+3.30\%$
test_getitem[list] 0.1533ms 32.1247μs 31.1287 KOps/s 31.4723 KOps/s $\color{#d91a1a}-1.09\%$
test_setitem_dim[int] 64.2910μs 18.2983μs 54.6500 KOps/s 55.5395 KOps/s $\color{#d91a1a}-1.60\%$
test_setitem_dim[slice_int] 57.6400μs 36.5982μs 27.3238 KOps/s 26.8631 KOps/s $\color{#35bf28}+1.71\%$
test_setitem_dim[range] 0.2035ms 51.3949μs 19.4572 KOps/s 19.3474 KOps/s $\color{#35bf28}+0.57\%$
test_setitem_dim[tuple] 50.1710μs 31.6459μs 31.5997 KOps/s 31.5746 KOps/s $\color{#35bf28}+0.08\%$
test_setitem 0.1437ms 13.9944μs 71.4572 KOps/s 64.2192 KOps/s $\textbf{\color{#35bf28}+11.27\%}$
test_set 0.1174ms 13.1543μs 76.0208 KOps/s 64.9240 KOps/s $\textbf{\color{#35bf28}+17.09\%}$
test_set_shared 0.5112ms 0.1578ms 6.3383 KOps/s 6.4510 KOps/s $\color{#d91a1a}-1.75\%$
test_update 0.4148ms 16.9931μs 58.8475 KOps/s 48.5779 KOps/s $\textbf{\color{#35bf28}+21.14\%}$
test_update_nested 0.1491ms 25.0711μs 39.8865 KOps/s 32.9007 KOps/s $\textbf{\color{#35bf28}+21.23\%}$
test_update__nested 0.4783ms 25.0321μs 39.9487 KOps/s 36.7181 KOps/s $\textbf{\color{#35bf28}+8.80\%}$
test_set_nested 0.1044ms 14.7785μs 67.6659 KOps/s 56.0831 KOps/s $\textbf{\color{#35bf28}+20.65\%}$
test_set_nested_new 49.2410μs 16.9631μs 58.9513 KOps/s 49.5967 KOps/s $\textbf{\color{#35bf28}+18.86\%}$
test_select 0.1633ms 29.3000μs 34.1296 KOps/s 32.2621 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_select_nested 71.5010μs 42.9363μs 23.2903 KOps/s 23.2761 KOps/s $\color{#35bf28}+0.06\%$
test_exclude_nested 0.2447ms 61.3955μs 16.2878 KOps/s 16.1328 KOps/s $\color{#35bf28}+0.96\%$
test_empty[True] 0.3217ms 0.2926ms 3.4181 KOps/s 3.4156 KOps/s $\color{#35bf28}+0.07\%$
test_empty[False] 20.0243μs 0.8165μs 1.2247 MOps/s 1.2166 MOps/s $\color{#35bf28}+0.67\%$
test_to 83.9210μs 55.2066μs 18.1138 KOps/s 18.3570 KOps/s $\color{#d91a1a}-1.33\%$
test_to_nonblocking 0.2473ms 46.3134μs 21.5920 KOps/s 21.5466 KOps/s $\color{#35bf28}+0.21\%$
test_unbind_speed 0.2765ms 0.2416ms 4.1398 KOps/s 4.2030 KOps/s $\color{#d91a1a}-1.50\%$
test_unbind_speed_stack0 0.3966ms 0.2386ms 4.1918 KOps/s 4.1931 KOps/s $\color{#d91a1a}-0.03\%$
test_unbind_speed_stack1 93.6520ms 0.8226ms 1.2157 KOps/s 1.3513 KOps/s $\textbf{\color{#d91a1a}-10.04\%}$
test_split 95.0784ms 1.5935ms 627.5485 Ops/s 633.5928 Ops/s $\color{#d91a1a}-0.95\%$
test_chunk 97.2196ms 1.6178ms 618.1201 Ops/s 630.5744 Ops/s $\color{#d91a1a}-1.98\%$
test_consolidate[False-None] 2.8552ms 2.6728ms 374.1377 Ops/s 332.5694 Ops/s $\textbf{\color{#35bf28}+12.50\%}$
test_consolidate[default-None] 1.8143ms 1.6854ms 593.3365 Ops/s 600.3551 Ops/s $\color{#d91a1a}-1.17\%$
test_consolidate[reduce-overhead-None] 1.8941ms 1.7277ms 578.8109 Ops/s 581.9583 Ops/s $\color{#d91a1a}-0.54\%$
test_consolidate_njt[False-None] 6.7009ms 6.4961ms 153.9385 Ops/s 153.3408 Ops/s $\color{#35bf28}+0.39\%$
test_to[False-False-None] 1.9462ms 1.7631ms 567.1984 Ops/s 580.8965 Ops/s $\color{#d91a1a}-2.36\%$
test_to[True-False-None] 1.6467ms 1.3401ms 746.2277 Ops/s 746.7935 Ops/s $\color{#d91a1a}-0.08\%$
test_to[within-False-None] 4.3253ms 4.1496ms 240.9896 Ops/s 238.6859 Ops/s $\color{#35bf28}+0.97\%$
test_to[True-default-None] 5.4092ms 5.1266ms 195.0606 Ops/s 196.1591 Ops/s $\color{#d91a1a}-0.56\%$
test_to_njt[False-False-None] 7.0472ms 6.8621ms 145.7270 Ops/s 145.9664 Ops/s $\color{#d91a1a}-0.16\%$
test_to_njt[True-False-None] 5.9131ms 5.5187ms 181.2030 Ops/s 185.6768 Ops/s $\color{#d91a1a}-2.41\%$
test_to_njt[within-False-None] 12.1527ms 11.8215ms 84.5920 Ops/s 82.5157 Ops/s $\color{#35bf28}+2.52\%$
test_creation[device0] 0.5407ms 79.9439μs 12.5088 KOps/s 11.9691 KOps/s $\color{#35bf28}+4.51\%$
test_creation_from_tensor 0.5389ms 83.4759μs 11.9795 KOps/s 11.4788 KOps/s $\color{#35bf28}+4.36\%$
test_add_one[memmap_tensor0] 0.2309ms 6.8712μs 145.5342 KOps/s 152.5834 KOps/s $\color{#d91a1a}-4.62\%$
test_contiguous[memmap_tensor0] 1.7731μs 0.4304μs 2.3233 MOps/s 2.3658 MOps/s $\color{#d91a1a}-1.80\%$
test_stack[memmap_tensor0] 38.9900μs 4.2800μs 233.6471 KOps/s 236.0080 KOps/s $\color{#d91a1a}-1.00\%$
test_memmaptd_index 1.4349ms 0.2328ms 4.2956 KOps/s 4.2652 KOps/s $\color{#35bf28}+0.71\%$
test_memmaptd_index_astensor 0.4299ms 0.2959ms 3.3790 KOps/s 3.3975 KOps/s $\color{#d91a1a}-0.55\%$
test_memmaptd_index_op 0.6983ms 0.5419ms 1.8453 KOps/s 1.7101 KOps/s $\textbf{\color{#35bf28}+7.91\%}$
test_serialize_model 0.1329s 0.1319s 7.5799 Ops/s 7.5742 Ops/s $\color{#35bf28}+0.08\%$
test_serialize_model_pickle 1.3506s 1.2146s 0.8233 Ops/s 0.8200 Ops/s $\color{#35bf28}+0.40\%$
test_serialize_weights 0.1324s 0.1315s 7.6074 Ops/s 7.6109 Ops/s $\color{#d91a1a}-0.05\%$
test_serialize_weights_returnearly 0.4714s 70.0519ms 14.2751 Ops/s 15.1619 Ops/s $\textbf{\color{#d91a1a}-5.85\%}$
test_serialize_weights_pickle 1.3744s 1.2200s 0.8196 Ops/s 0.8199 Ops/s $\color{#d91a1a}-0.03\%$
test_reshape_pytree 99.8010μs 21.7272μs 46.0252 KOps/s 45.0050 KOps/s $\color{#35bf28}+2.27\%$
test_reshape_td 0.1557ms 26.2481μs 38.0979 KOps/s 37.0480 KOps/s $\color{#35bf28}+2.83\%$
test_view_pytree 0.1631ms 21.6460μs 46.1980 KOps/s 46.0444 KOps/s $\color{#35bf28}+0.33\%$
test_view_td 0.1142ms 30.2639μs 33.0427 KOps/s 30.8833 KOps/s $\textbf{\color{#35bf28}+6.99\%}$
test_unbind_pytree 0.2152ms 28.0461μs 35.6556 KOps/s 36.6051 KOps/s $\color{#d91a1a}-2.59\%$
test_unbind_td 0.7939ms 36.7345μs 27.2224 KOps/s 27.2111 KOps/s $\color{#35bf28}+0.04\%$
test_split_pytree 75.0700μs 29.5898μs 33.7955 KOps/s 33.2829 KOps/s $\color{#35bf28}+1.54\%$
test_split_td 0.9702ms 38.0147μs 26.3056 KOps/s 26.1614 KOps/s $\color{#35bf28}+0.55\%$
test_add_pytree 0.1633ms 35.1468μs 28.4521 KOps/s 29.2949 KOps/s $\color{#d91a1a}-2.88\%$
test_add_td 0.1691ms 45.3709μs 22.0406 KOps/s 19.8222 KOps/s $\textbf{\color{#35bf28}+11.19\%}$
test_compile_add_one_nested[tensordict-compile] 0.2896ms 0.1309ms 7.6379 KOps/s 8.3429 KOps/s $\textbf{\color{#d91a1a}-8.45\%}$
test_compile_add_one_nested[tensordict-eager] 0.2832ms 0.1326ms 7.5418 KOps/s 7.5421 KOps/s $-0.00\%$
test_compile_add_one_nested[pytree-compile] 0.2415ms 96.4893μs 10.3638 KOps/s 10.7454 KOps/s $\color{#d91a1a}-3.55\%$
test_compile_add_one_nested[pytree-eager] 0.3026ms 0.1527ms 6.5493 KOps/s 6.5789 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_copy_nested[tensordict-compile] 0.1640ms 25.4172μs 39.3434 KOps/s 34.2847 KOps/s $\textbf{\color{#35bf28}+14.75\%}$
test_compile_copy_nested[tensordict-eager] 0.1529ms 29.1389μs 34.3184 KOps/s 33.8727 KOps/s $\color{#35bf28}+1.32\%$
test_compile_copy_nested[pytree-compile] 0.4031ms 63.8509μs 15.6615 KOps/s 15.7461 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_copy_nested[pytree-eager] 84.7810μs 48.4970μs 20.6199 KOps/s 20.4205 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_flat[tensordict-compile] 0.2919ms 0.1439ms 6.9487 KOps/s 7.0819 KOps/s $\color{#d91a1a}-1.88\%$
test_compile_add_one_flat[tensordict-eager] 0.3624ms 0.2149ms 4.6535 KOps/s 4.6476 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_one_flat[tensorclass-compile] 0.2754ms 0.1030ms 9.7116 KOps/s 10.3337 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_compile_add_one_flat[tensorclass-eager] 0.2304ms 56.9985μs 17.5443 KOps/s 17.3946 KOps/s $\color{#35bf28}+0.86\%$
test_compile_add_one_flat[pytree-compile] 0.2887ms 0.1375ms 7.2704 KOps/s 7.3160 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_add_one_flat[pytree-eager] 0.6628ms 0.4949ms 2.0207 KOps/s 1.9121 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_compile_add_self_flat[tensordict-eager] 0.4307ms 0.2600ms 3.8458 KOps/s 3.8317 KOps/s $\color{#35bf28}+0.37\%$
test_compile_add_self_flat[tensordict-compile] 0.3369ms 0.1527ms 6.5485 KOps/s 6.9860 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2477ms 69.4274μs 14.4035 KOps/s 14.8308 KOps/s $\color{#d91a1a}-2.88\%$
test_compile_add_self_flat[tensorclass-compile] 0.2789ms 0.1050ms 9.5268 KOps/s 10.2698 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_compile_add_self_flat[pytree-eager] 0.5811ms 0.4201ms 2.3805 KOps/s 2.3910 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_add_self_flat[pytree-compile] 0.2819ms 0.1376ms 7.2676 KOps/s 7.3322 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_copy_flat[tensordict-compile] 0.1584ms 19.3750μs 51.6129 KOps/s 52.4188 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_copy_flat[tensordict-eager] 66.4410μs 30.7901μs 32.4779 KOps/s 31.8604 KOps/s $\color{#35bf28}+1.94\%$
test_compile_copy_flat[pytree-compile] 0.1976ms 68.4821μs 14.6024 KOps/s 14.6817 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_copy_flat[pytree-eager] 0.1100ms 51.6069μs 19.3772 KOps/s 19.4563 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_assign_and_add[tensordict-compile] 1.6301ms 0.3932ms 2.5431 KOps/s 2.2621 KOps/s $\textbf{\color{#35bf28}+12.42\%}$
test_compile_assign_and_add[tensordict-eager] 2.8526ms 2.6615ms 375.7261 Ops/s 363.7962 Ops/s $\color{#35bf28}+3.28\%$
test_compile_assign_and_add[pytree-compile] 1.5690ms 0.4283ms 2.3346 KOps/s 2.2622 KOps/s $\color{#35bf28}+3.20\%$
test_compile_assign_and_add[pytree-eager] 2.9028ms 2.7098ms 369.0272 Ops/s 371.0054 Ops/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[tensor-tensordict-compile] 0.7273ms 0.1132ms 8.8332 KOps/s 8.5014 KOps/s $\color{#35bf28}+3.90\%$
test_compile_indexing[tensor-tensordict-eager] 0.5510ms 79.2465μs 12.6189 KOps/s 12.3356 KOps/s $\color{#35bf28}+2.30\%$
test_compile_indexing[tensor-tensorclass-compile] 0.6015ms 0.1062ms 9.4162 KOps/s 9.1588 KOps/s $\color{#35bf28}+2.81\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2641ms 72.0049μs 13.8879 KOps/s 14.7226 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_compile_indexing[tensor-pytree-compile] 0.3148ms 0.1130ms 8.8493 KOps/s 9.1461 KOps/s $\color{#d91a1a}-3.25\%$
test_compile_indexing[tensor-pytree-eager] 0.2609ms 72.0439μs 13.8804 KOps/s 14.1736 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_indexing[slice-tensordict-compile] 0.2470ms 98.1812μs 10.1852 KOps/s 10.1710 KOps/s $\color{#35bf28}+0.14\%$
test_compile_indexing[slice-tensordict-eager] 0.1420ms 16.7165μs 59.8213 KOps/s 57.7945 KOps/s $\color{#35bf28}+3.51\%$
test_compile_indexing[slice-tensorclass-compile] 0.2446ms 94.4868μs 10.5835 KOps/s 10.5128 KOps/s $\color{#35bf28}+0.67\%$
test_compile_indexing[slice-tensorclass-eager] 0.1541ms 15.4919μs 64.5499 KOps/s 65.6308 KOps/s $\color{#d91a1a}-1.65\%$
test_compile_indexing[slice-pytree-compile] 0.2482ms 93.7632μs 10.6652 KOps/s 10.4241 KOps/s $\color{#35bf28}+2.31\%$
test_compile_indexing[slice-pytree-eager] 0.1343ms 15.4481μs 64.7327 KOps/s 65.1323 KOps/s $\color{#d91a1a}-0.61\%$
test_compile_indexing[int-tensordict-compile] 0.2511ms 99.5742μs 10.0428 KOps/s 10.0269 KOps/s $\color{#35bf28}+0.16\%$
test_compile_indexing[int-tensordict-eager] 0.6686ms 16.8487μs 59.3518 KOps/s 59.2940 KOps/s $\color{#35bf28}+0.10\%$
test_compile_indexing[int-tensorclass-compile] 0.2734ms 94.9024μs 10.5371 KOps/s 10.4256 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[int-tensorclass-eager] 0.1623ms 15.5942μs 64.1264 KOps/s 65.1281 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_indexing[int-pytree-compile] 0.2487ms 94.3372μs 10.6003 KOps/s 10.4368 KOps/s $\color{#35bf28}+1.57\%$
test_compile_indexing[int-pytree-eager] 0.1342ms 16.1829μs 61.7937 KOps/s 65.4396 KOps/s $\textbf{\color{#d91a1a}-5.57\%}$
test_mod_add[eager] 0.2176ms 37.8430μs 26.4250 KOps/s 25.2875 KOps/s $\color{#35bf28}+4.50\%$
test_mod_add[compile] 0.4412ms 80.6421μs 12.4005 KOps/s 12.4110 KOps/s $\color{#d91a1a}-0.09\%$
test_mod_add[compile-overhead] 0.3383ms 0.1676ms 5.9649 KOps/s 5.7151 KOps/s $\color{#35bf28}+4.37\%$
test_mod_wrap[eager] 0.3896ms 0.2441ms 4.0971 KOps/s 4.0203 KOps/s $\color{#35bf28}+1.91\%$
test_mod_wrap[compile] 0.4781ms 0.2874ms 3.4800 KOps/s 3.4919 KOps/s $\color{#d91a1a}-0.34\%$
test_mod_wrap[compile-overhead] 7.2698ms 3.8926ms 256.8989 Ops/s 266.8847 Ops/s $\color{#d91a1a}-3.74\%$
test_mod_wrap_and_backward[eager] 1.5856ms 1.3872ms 720.9016 Ops/s 678.9995 Ops/s $\textbf{\color{#35bf28}+6.17\%}$
test_mod_wrap_and_backward[compile] 1.5865ms 1.3542ms 738.4252 Ops/s 722.6338 Ops/s $\color{#35bf28}+2.19\%$
test_mod_wrap_and_backward[compile-overhead] 1.3823ms 0.9355ms 1.0690 KOps/s 972.2913 Ops/s $\textbf{\color{#35bf28}+9.94\%}$
test_seq_add[eager] 0.2572ms 0.1115ms 8.9656 KOps/s 8.4933 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_seq_add[compile] 0.3301ms 88.0409μs 11.3584 KOps/s 10.9557 KOps/s $\color{#35bf28}+3.68\%$
test_seq_add[compile-overhead] 0.2998ms 0.1287ms 7.7680 KOps/s 7.7726 KOps/s $\color{#d91a1a}-0.06\%$
test_seq_wrap[eager] 0.5657ms 0.4082ms 2.4501 KOps/s 2.3104 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_seq_wrap[compile] 0.4411ms 0.3016ms 3.3156 KOps/s 3.2891 KOps/s $\color{#35bf28}+0.80\%$
test_seq_wrap[compile-overhead] 0.4091ms 0.2230ms 4.4843 KOps/s 4.4536 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_runtime[False-eager] 0.8836ms 0.7347ms 1.3611 KOps/s 1.3362 KOps/s $\color{#35bf28}+1.87\%$
test_func_call_runtime[False-compile] 0.9332ms 0.7376ms 1.3557 KOps/s 1.3357 KOps/s $\color{#35bf28}+1.49\%$
test_func_call_runtime[False-compile-overhead] 0.5093ms 0.3605ms 2.7736 KOps/s 2.7795 KOps/s $\color{#d91a1a}-0.21\%$
test_func_call_runtime[True-eager] 1.0972ms 0.8974ms 1.1144 KOps/s 1.1147 KOps/s $\color{#d91a1a}-0.03\%$
test_func_call_runtime[True-compile] 1.0652ms 0.7634ms 1.3100 KOps/s 1.2972 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_runtime[True-compile-overhead] 0.5272ms 0.3811ms 2.6243 KOps/s 2.6459 KOps/s $\color{#d91a1a}-0.82\%$
test_func_call_cm_runtime[False-eager] 0.9033ms 0.7329ms 1.3644 KOps/s 1.3680 KOps/s $\color{#d91a1a}-0.26\%$
test_func_call_cm_runtime[False-compile] 0.9846ms 0.7430ms 1.3458 KOps/s 1.3390 KOps/s $\color{#35bf28}+0.51\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5068ms 0.3598ms 2.7792 KOps/s 2.7834 KOps/s $\color{#d91a1a}-0.15\%$
test_func_call_cm_runtime[True-eager] 1.1708ms 1.0079ms 992.1498 Ops/s 994.5044 Ops/s $\color{#d91a1a}-0.24\%$
test_func_call_cm_runtime[True-compile] 1.1466ms 0.9875ms 1.0127 KOps/s 984.0238 Ops/s $\color{#35bf28}+2.91\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1438ms 1.0014ms 998.6396 Ops/s 1.0040 KOps/s $\color{#d91a1a}-0.54\%$
test_vmap_func_call_cm_runtime[eager] 2.5051ms 2.1056ms 474.9266 Ops/s 469.7736 Ops/s $\color{#35bf28}+1.10\%$
test_vmap_func_call_cm_runtime[compile] 0.9536ms 0.8007ms 1.2489 KOps/s 1.2252 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5532ms 0.4114ms 2.4309 KOps/s 2.4212 KOps/s $\color{#35bf28}+0.40\%$
test_distributed 11.3036ms 0.2017ms 4.9588 KOps/s 8.6627 KOps/s $\textbf{\color{#d91a1a}-42.76\%}$
test_tdmodule 0.1589ms 19.2132μs 52.0476 KOps/s 47.8017 KOps/s $\textbf{\color{#35bf28}+8.88\%}$
test_tdmodule_dispatch 0.1122ms 33.7653μs 29.6162 KOps/s 26.6344 KOps/s $\textbf{\color{#35bf28}+11.20\%}$
test_tdseq 0.2267ms 19.0126μs 52.5966 KOps/s 47.1684 KOps/s $\textbf{\color{#35bf28}+11.51\%}$
test_tdseq_dispatch 55.6710μs 35.7803μs 27.9483 KOps/s 25.1111 KOps/s $\textbf{\color{#35bf28}+11.30\%}$
test_instantiation_functorch 1.7454ms 1.5074ms 663.3987 Ops/s 668.2704 Ops/s $\color{#d91a1a}-0.73\%$
test_exec_functorch 0.3462ms 0.1423ms 7.0259 KOps/s 7.1512 KOps/s $\color{#d91a1a}-1.75\%$
test_exec_functional_call 0.3317ms 0.1379ms 7.2491 KOps/s 7.6406 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_exec_td_decorator 0.3877ms 0.1881ms 5.3165 KOps/s 5.4925 KOps/s $\color{#d91a1a}-3.20\%$
test_vmap_mlp_speed_decorator[True-True] 0.8871ms 0.6919ms 1.4453 KOps/s 1.4452 KOps/s $+0.01\%$
test_vmap_mlp_speed_decorator[True-False] 0.8492ms 0.6885ms 1.4524 KOps/s 1.4529 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[False-True] 0.8328ms 0.6285ms 1.5912 KOps/s 1.6762 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_vmap_mlp_speed_decorator[False-False] 0.8325ms 0.5981ms 1.6720 KOps/s 1.6743 KOps/s $\color{#d91a1a}-0.14\%$
test_vmap_transformer_speed_decorator[True-True] 19.8080ms 19.5417ms 51.1726 Ops/s 51.7207 Ops/s $\color{#d91a1a}-1.06\%$
test_vmap_transformer_speed_decorator[True-False] 19.8366ms 19.4821ms 51.3291 Ops/s 51.7110 Ops/s $\color{#d91a1a}-0.74\%$
test_vmap_transformer_speed_decorator[False-True] 19.6289ms 19.3804ms 51.5985 Ops/s 52.2714 Ops/s $\color{#d91a1a}-1.29\%$
test_vmap_transformer_speed_decorator[False-False] 19.6401ms 19.3562ms 51.6630 Ops/s 52.1624 Ops/s $\color{#d91a1a}-0.96\%$
test_to_module_speed[True] 1.4080ms 0.9560ms 1.0461 KOps/s 1.0317 KOps/s $\color{#35bf28}+1.39\%$
test_to_module_speed[False] 1.1092ms 0.9365ms 1.0678 KOps/s 1.0593 KOps/s $\color{#35bf28}+0.81\%$
test_tc_init 0.1561ms 31.4766μs 31.7696 KOps/s 26.7534 KOps/s $\textbf{\color{#35bf28}+18.75\%}$
test_tc_init_nested 0.1002ms 64.0815μs 15.6051 KOps/s 13.2112 KOps/s $\textbf{\color{#35bf28}+18.12\%}$
test_tc_first_layer_tensor 24.8310μs 0.7847μs 1.2744 MOps/s 1.2655 MOps/s $\color{#35bf28}+0.70\%$
test_tc_first_layer_nontensor 22.8310μs 2.1608μs 462.7901 KOps/s 459.5857 KOps/s $\color{#35bf28}+0.70\%$
test_tc_second_layer_tensor 11.4727μs 1.3835μs 722.8282 KOps/s 721.5766 KOps/s $\color{#35bf28}+0.17\%$
test_tc_second_layer_nontensor 23.0100μs 2.9056μs 344.1582 KOps/s 342.4606 KOps/s $\color{#35bf28}+0.50\%$
test_unbind 0.2220s 12.0883ms 82.7246 Ops/s 145.4389 Ops/s $\textbf{\color{#d91a1a}-43.12\%}$
test_full_like 9.4045ms 9.1229ms 109.6142 Ops/s 107.8749 Ops/s $\color{#35bf28}+1.61\%$
test_zeros_like 9.4749ms 7.2704ms 137.5439 Ops/s 137.5267 Ops/s $\color{#35bf28}+0.01\%$
test_ones_like 5.1427ms 4.3400ms 230.4170 Ops/s 230.4779 Ops/s $\color{#d91a1a}-0.03\%$
test_clone 11.5100ms 9.1545ms 109.2358 Ops/s 155.6360 Ops/s $\textbf{\color{#d91a1a}-29.81\%}$
test_squeeze 0.1120ms 9.6595μs 103.5255 KOps/s 105.2746 KOps/s $\color{#d91a1a}-1.66\%$
test_unsqueeze 0.2196ms 71.1541μs 14.0540 KOps/s 13.4974 KOps/s $\color{#35bf28}+4.12\%$
test_split 0.4278ms 0.1636ms 6.1118 KOps/s 6.2788 KOps/s $\color{#d91a1a}-2.66\%$
test_permute 0.3311ms 0.1755ms 5.6970 KOps/s 5.8506 KOps/s $\color{#d91a1a}-2.63\%$
test_stack 50.8114ms 50.5802ms 19.7706 Ops/s 19.7878 Ops/s $\color{#d91a1a}-0.09\%$
test_cat 50.8351ms 50.4086ms 19.8379 Ops/s 19.8632 Ops/s $\color{#d91a1a}-0.13\%$
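
The Change column above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. For example, test_setitem goes from 64.2192 KOps/s to 71.4572 KOps/s, which is +11.27%. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code: the helper names `to_ops_per_second` and `relative_change` are hypothetical, and the unit handling is an assumption based on the suffixes seen in the table (Ops/s, KOps/s, MOps/s).

```python
# Hypothetical helpers that reproduce the "Change" column from the two Ops columns.
# Unit suffixes are assumed from the table: Ops/s, KOps/s, MOps/s.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(cell: str) -> float:
    """Convert a table cell such as '71.4572 KOps/s' to plain operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to the repo HEAD throughput."""
    new = to_ops_per_second(ops)
    base = to_ops_per_second(ops_head)
    return (new - base) / base * 100.0


# Values copied from the test_setitem and test_distributed rows above.
print(f"{relative_change('71.4572 KOps/s', '64.2192 KOps/s'):+.2f}%")
print(f"{relative_change('4.9588 KOps/s', '8.6627 KOps/s'):+.2f}%")
```

Because the table shows rounded throughput values, recomputing a row this way can differ from the reported figure in the last digit, particularly when the two columns use different units.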

Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
2 participants