[BugFix] Fix serialization of stacks of Tensorclasses #1236

Merged (3 commits) on Feb 26, 2025

@vmoens (Contributor) commented Feb 25, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 25, 2025
ghstack-source-id: 0f479c80655d1e663ce67a16031556dbe70937f9
Pull Request resolved: #1236
@facebook-github-bot added the CLA Signed label Feb 25, 2025

github-actions bot commented Feb 25, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}20$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 68.6270μs 21.1877μs 47.1972 KOps/s 47.4125 KOps/s $\color{#d91a1a}-0.45\%$
test_plain_set_stack_nested 66.6740μs 21.3683μs 46.7982 KOps/s 47.9190 KOps/s $\color{#d91a1a}-2.34\%$
test_plain_set_nested_inplace 65.3320μs 23.0791μs 43.3293 KOps/s 44.8219 KOps/s $\color{#d91a1a}-3.33\%$
test_plain_set_stack_nested_inplace 63.5280μs 22.7969μs 43.8656 KOps/s 44.9595 KOps/s $\color{#d91a1a}-2.43\%$
test_items 42.8000μs 4.2595μs 234.7670 KOps/s 232.0791 KOps/s $\color{#35bf28}+1.16\%$
test_items_nested 0.7136ms 0.4069ms 2.4577 KOps/s 2.5198 KOps/s $\color{#d91a1a}-2.46\%$
test_items_nested_locked 0.7271ms 0.4101ms 2.4385 KOps/s 2.4881 KOps/s $\color{#d91a1a}-2.00\%$
test_items_nested_leaf 0.1947ms 77.0058μs 12.9860 KOps/s 13.1824 KOps/s $\color{#d91a1a}-1.49\%$
test_items_stack_nested 0.7172ms 0.4048ms 2.4701 KOps/s 2.4753 KOps/s $\color{#d91a1a}-0.21\%$
test_items_stack_nested_leaf 0.1581ms 78.0396μs 12.8140 KOps/s 13.1799 KOps/s $\color{#d91a1a}-2.78\%$
test_items_stack_nested_locked 0.8176ms 0.4054ms 2.4667 KOps/s 2.4856 KOps/s $\color{#d91a1a}-0.76\%$
test_keys 18.1540μs 3.4630μs 288.7648 KOps/s 285.4286 KOps/s $\color{#35bf28}+1.17\%$
test_keys_nested 0.2689ms 0.1631ms 6.1312 KOps/s 6.0615 KOps/s $\color{#35bf28}+1.15\%$
test_keys_nested_locked 0.6743ms 0.1678ms 5.9583 KOps/s 5.8341 KOps/s $\color{#35bf28}+2.13\%$
test_keys_nested_leaf 0.2444ms 0.1422ms 7.0346 KOps/s 6.9259 KOps/s $\color{#35bf28}+1.57\%$
test_keys_stack_nested 0.2838ms 0.1632ms 6.1271 KOps/s 6.0592 KOps/s $\color{#35bf28}+1.12\%$
test_keys_stack_nested_leaf 0.2303ms 0.1417ms 7.0579 KOps/s 6.9596 KOps/s $\color{#35bf28}+1.41\%$
test_keys_stack_nested_locked 0.2980ms 0.1694ms 5.9048 KOps/s 5.8540 KOps/s $\color{#35bf28}+0.87\%$
test_values 10.4132μs 1.0525μs 950.1025 KOps/s 945.3966 KOps/s $\color{#35bf28}+0.50\%$
test_values_nested 0.1123ms 62.4829μs 16.0044 KOps/s 16.1875 KOps/s $\color{#d91a1a}-1.13\%$
test_values_nested_locked 0.1176ms 62.7452μs 15.9375 KOps/s 16.1262 KOps/s $\color{#d91a1a}-1.17\%$
test_values_nested_leaf 0.1366ms 71.8473μs 13.9184 KOps/s 14.0695 KOps/s $\color{#d91a1a}-1.07\%$
test_values_stack_nested 0.1170ms 63.2148μs 15.8191 KOps/s 16.1530 KOps/s $\color{#d91a1a}-2.07\%$
test_values_stack_nested_leaf 0.1280ms 71.4293μs 13.9998 KOps/s 14.0453 KOps/s $\color{#d91a1a}-0.32\%$
test_values_stack_nested_locked 0.1155ms 62.6686μs 15.9570 KOps/s 16.2534 KOps/s $\color{#d91a1a}-1.82\%$
test_membership 16.0700μs 0.8583μs 1.1651 MOps/s 1.1494 MOps/s $\color{#35bf28}+1.37\%$
test_membership_nested 43.3510μs 2.8789μs 347.3493 KOps/s 345.9851 KOps/s $\color{#35bf28}+0.39\%$
test_membership_nested_leaf 45.4440μs 2.9265μs 341.7081 KOps/s 343.8202 KOps/s $\color{#d91a1a}-0.61\%$
test_membership_stacked_nested 26.9300μs 2.8712μs 348.2806 KOps/s 346.5005 KOps/s $\color{#35bf28}+0.51\%$
test_membership_stacked_nested_leaf 18.2440μs 2.8896μs 346.0641 KOps/s 350.8263 KOps/s $\color{#d91a1a}-1.36\%$
test_membership_nested_last 46.4070μs 4.2842μs 233.4166 KOps/s 234.2487 KOps/s $\color{#d91a1a}-0.36\%$
test_membership_nested_leaf_last 27.2610μs 4.2949μs 232.8370 KOps/s 232.3410 KOps/s $\color{#35bf28}+0.21\%$
test_membership_stacked_nested_last 51.0650μs 4.2841μs 233.4221 KOps/s 234.6809 KOps/s $\color{#d91a1a}-0.54\%$
test_membership_stacked_nested_leaf_last 20.6880μs 4.2940μs 232.8839 KOps/s 230.2948 KOps/s $\color{#35bf28}+1.12\%$
test_nested_getleaf 54.5210μs 10.5303μs 94.9644 KOps/s 94.8979 KOps/s $\color{#35bf28}+0.07\%$
test_nested_get 57.4370μs 9.9926μs 100.0745 KOps/s 98.4952 KOps/s $\color{#35bf28}+1.60\%$
test_stacked_getleaf 31.8190μs 10.6265μs 94.1039 KOps/s 96.3279 KOps/s $\color{#d91a1a}-2.31\%$
test_stacked_get 52.7790μs 10.0184μs 99.8165 KOps/s 100.4058 KOps/s $\color{#d91a1a}-0.59\%$
test_nested_getitemleaf 32.6210μs 11.1541μs 89.6531 KOps/s 89.4379 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getitem 55.7940μs 10.6071μs 94.2762 KOps/s 94.8007 KOps/s $\color{#d91a1a}-0.55\%$
test_stacked_getitemleaf 52.8780μs 11.1544μs 89.6511 KOps/s 90.1325 KOps/s $\color{#d91a1a}-0.53\%$
test_stacked_getitem 35.0650μs 10.6137μs 94.2181 KOps/s 95.2833 KOps/s $\color{#d91a1a}-1.12\%$
test_lock_nested 0.6004ms 0.4101ms 2.4384 KOps/s 2.4428 KOps/s $\color{#d91a1a}-0.18\%$
test_lock_stack_nested 0.5271ms 0.4217ms 2.3712 KOps/s 2.3532 KOps/s $\color{#35bf28}+0.76\%$
test_unlock_nested 0.4221ms 0.3313ms 3.0181 KOps/s 2.9852 KOps/s $\color{#35bf28}+1.10\%$
test_unlock_stack_nested 0.7135ms 0.3393ms 2.9469 KOps/s 2.9249 KOps/s $\color{#35bf28}+0.75\%$
test_flatten_speed 0.1899ms 0.1009ms 9.9077 KOps/s 9.9739 KOps/s $\color{#d91a1a}-0.66\%$
test_unflatten_speed 1.2497ms 0.5147ms 1.9429 KOps/s 1.9319 KOps/s $\color{#35bf28}+0.57\%$
test_common_ops 4.3503ms 0.8467ms 1.1810 KOps/s 1.2574 KOps/s $\textbf{\color{#d91a1a}-6.07\%}$
test_creation 48.2100μs 2.5248μs 396.0717 KOps/s 402.1641 KOps/s $\color{#d91a1a}-1.51\%$
test_creation_empty 72.4150μs 13.0993μs 76.3397 KOps/s 86.2605 KOps/s $\textbf{\color{#d91a1a}-11.50\%}$
test_creation_nested_1 50.2740μs 15.8611μs 63.0475 KOps/s 68.6295 KOps/s $\textbf{\color{#d91a1a}-8.13\%}$
test_creation_nested_2 43.9210μs 20.5275μs 48.7153 KOps/s 53.1394 KOps/s $\textbf{\color{#d91a1a}-8.33\%}$
test_clone 85.5790μs 13.3859μs 74.7055 KOps/s 71.6282 KOps/s $\color{#35bf28}+4.30\%$
test_getitem[int] 0.8662ms 12.9973μs 76.9391 KOps/s 79.4520 KOps/s $\color{#d91a1a}-3.16\%$
test_getitem[slice_int] 0.1276ms 24.8290μs 40.2754 KOps/s 41.8876 KOps/s $\color{#d91a1a}-3.85\%$
test_getitem[range] 0.1583ms 50.3413μs 19.8644 KOps/s 18.2768 KOps/s $\textbf{\color{#35bf28}+8.69\%}$
test_getitem[tuple] 0.1485ms 20.5637μs 48.6294 KOps/s 47.0832 KOps/s $\color{#35bf28}+3.28\%$
test_getitem[list] 0.1594ms 45.9831μs 21.7471 KOps/s 21.8449 KOps/s $\color{#d91a1a}-0.45\%$
test_setitem_dim[int] 51.0550μs 26.4152μs 37.8570 KOps/s 39.4709 KOps/s $\color{#d91a1a}-4.09\%$
test_setitem_dim[slice_int] 0.1003ms 51.5858μs 19.3852 KOps/s 19.7043 KOps/s $\color{#d91a1a}-1.62\%$
test_setitem_dim[range] 0.1287ms 76.6056μs 13.0539 KOps/s 13.1431 KOps/s $\color{#d91a1a}-0.68\%$
test_setitem_dim[tuple] 81.8520μs 41.9900μs 23.8152 KOps/s 24.9752 KOps/s $\color{#d91a1a}-4.64\%$
test_setitem 0.1034ms 21.5737μs 46.3527 KOps/s 49.4738 KOps/s $\textbf{\color{#d91a1a}-6.31\%}$
test_set 0.1013ms 21.0418μs 47.5245 KOps/s 48.9950 KOps/s $\color{#d91a1a}-3.00\%$
test_set_shared 4.2452ms 0.1868ms 5.3536 KOps/s 5.4822 KOps/s $\color{#d91a1a}-2.35\%$
test_update 0.2872ms 25.9915μs 38.4742 KOps/s 45.1076 KOps/s $\textbf{\color{#d91a1a}-14.71\%}$
test_update_nested 75.4400μs 36.1942μs 27.6288 KOps/s 30.2695 KOps/s $\textbf{\color{#d91a1a}-8.72\%}$
test_update__nested 0.4213ms 33.9868μs 29.4232 KOps/s 30.2636 KOps/s $\color{#d91a1a}-2.78\%$
test_set_nested 0.1216ms 23.3195μs 42.8826 KOps/s 45.8023 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_set_nested_new 71.3920μs 27.3353μs 36.5828 KOps/s 37.9313 KOps/s $\color{#d91a1a}-3.56\%$
test_select 0.1100ms 43.3238μs 23.0820 KOps/s 23.4942 KOps/s $\color{#d91a1a}-1.75\%$
test_select_nested 0.1175ms 61.8263μs 16.1743 KOps/s 16.0707 KOps/s $\color{#35bf28}+0.65\%$
test_exclude_nested 0.1671ms 79.4677μs 12.5837 KOps/s 12.3459 KOps/s $\color{#35bf28}+1.93\%$
test_empty[True] 0.5682ms 0.4023ms 2.4859 KOps/s 2.4468 KOps/s $\color{#35bf28}+1.60\%$
test_empty[False] 35.6437μs 1.3771μs 726.1894 KOps/s 728.3850 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_speed 0.5399ms 0.2673ms 3.7406 KOps/s 3.6634 KOps/s $\color{#35bf28}+2.11\%$
test_unbind_speed_stack0 0.4083ms 0.2658ms 3.7623 KOps/s 3.7130 KOps/s $\color{#35bf28}+1.33\%$
test_unbind_speed_stack1 0.1001s 0.7214ms 1.3861 KOps/s 1.1924 KOps/s $\textbf{\color{#35bf28}+16.25\%}$
test_split 0.1055s 1.7776ms 562.5594 Ops/s 559.5183 Ops/s $\color{#35bf28}+0.54\%$
test_chunk 0.1047s 1.7724ms 564.2086 Ops/s 629.9324 Ops/s $\textbf{\color{#d91a1a}-10.43\%}$
test_consolidate_njt[False-None] 8.8017ms 8.4427ms 118.4449 Ops/s 108.8788 Ops/s $\textbf{\color{#35bf28}+8.79\%}$
test_creation[device0] 0.2216ms 91.6641μs 10.9094 KOps/s 10.7933 KOps/s $\color{#35bf28}+1.08\%$
test_creation_from_tensor 4.2526ms 95.4257μs 10.4794 KOps/s 10.7543 KOps/s $\color{#d91a1a}-2.56\%$
test_add_one[memmap_tensor0] 0.1097ms 4.8984μs 204.1478 KOps/s 203.9570 KOps/s $\color{#35bf28}+0.09\%$
test_contiguous[memmap_tensor0] 15.1680μs 0.5147μs 1.9429 MOps/s 1.9556 MOps/s $\color{#d91a1a}-0.65\%$
test_stack[memmap_tensor0] 26.9500μs 3.3687μs 296.8499 KOps/s 289.6538 KOps/s $\color{#35bf28}+2.48\%$
test_memmaptd_index 1.2587ms 0.2280ms 4.3861 KOps/s 4.3516 KOps/s $\color{#35bf28}+0.79\%$
test_memmaptd_index_astensor 0.4926ms 0.3140ms 3.1851 KOps/s 3.1626 KOps/s $\color{#35bf28}+0.71\%$
test_memmaptd_index_op 0.8673ms 0.6060ms 1.6502 KOps/s 1.7293 KOps/s $\color{#d91a1a}-4.57\%$
test_serialize_model 0.2180s 0.1332s 7.5060 Ops/s 8.7201 Ops/s $\textbf{\color{#d91a1a}-13.92\%}$
test_serialize_model_pickle 0.4917s 0.4002s 2.4988 Ops/s 2.5143 Ops/s $\color{#d91a1a}-0.62\%$
test_serialize_weights 0.1284s 0.1175s 8.5088 Ops/s 8.6989 Ops/s $\color{#d91a1a}-2.19\%$
test_serialize_weights_returnearly 0.1980s 0.1637s 6.1084 Ops/s 5.5944 Ops/s $\textbf{\color{#35bf28}+9.19\%}$
test_serialize_weights_pickle 0.5401s 0.4429s 2.2577 Ops/s 2.4510 Ops/s $\textbf{\color{#d91a1a}-7.89\%}$
test_serialize_weights_filesystem 0.1635s 0.1458s 6.8577 Ops/s 6.7876 Ops/s $\color{#35bf28}+1.03\%$
test_serialize_model_filesystem 0.1561s 0.1494s 6.6932 Ops/s 6.5300 Ops/s $\color{#35bf28}+2.50\%$
test_reshape_pytree 61.5240μs 26.0554μs 38.3798 KOps/s 37.9137 KOps/s $\color{#35bf28}+1.23\%$
test_reshape_td 73.3570μs 33.2443μs 30.0803 KOps/s 30.3319 KOps/s $\color{#d91a1a}-0.83\%$
test_view_pytree 81.6720μs 26.2261μs 38.1300 KOps/s 36.5874 KOps/s $\color{#35bf28}+4.22\%$
test_view_td 0.1047ms 41.2923μs 24.2176 KOps/s 22.2917 KOps/s $\textbf{\color{#35bf28}+8.64\%}$
test_unbind_pytree 68.1660μs 29.3181μs 34.1086 KOps/s 33.8922 KOps/s $\color{#35bf28}+0.64\%$
test_unbind_td 0.2990ms 39.5918μs 25.2578 KOps/s 25.3619 KOps/s $\color{#d91a1a}-0.41\%$
test_split_pytree 0.1033ms 29.3451μs 34.0772 KOps/s 34.3473 KOps/s $\color{#d91a1a}-0.79\%$
test_split_td 0.5066ms 45.1749μs 22.1362 KOps/s 21.9190 KOps/s $\color{#35bf28}+0.99\%$
test_add_pytree 0.2763ms 35.3067μs 28.3233 KOps/s 28.0579 KOps/s $\color{#35bf28}+0.95\%$
test_add_td 0.1692ms 59.2697μs 16.8720 KOps/s 18.0738 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_compile_add_one_nested[tensordict-compile] 0.1939ms 67.6286μs 14.7866 KOps/s 15.2231 KOps/s $\color{#d91a1a}-2.87\%$
test_compile_add_one_nested[tensordict-eager] 1.2952ms 0.1723ms 5.8028 KOps/s 5.9028 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_add_one_nested[pytree-compile] 0.2030ms 47.0969μs 21.2328 KOps/s 22.1178 KOps/s $\color{#d91a1a}-4.00\%$
test_compile_add_one_nested[pytree-eager] 0.2433ms 0.1169ms 8.5562 KOps/s 8.3979 KOps/s $\color{#35bf28}+1.89\%$
test_compile_copy_nested[tensordict-compile] 0.1042ms 28.8206μs 34.6974 KOps/s 36.3074 KOps/s $\color{#d91a1a}-4.43\%$
test_compile_copy_nested[tensordict-eager] 0.1217ms 58.2831μs 17.1576 KOps/s 16.5841 KOps/s $\color{#35bf28}+3.46\%$
test_compile_copy_nested[pytree-compile] 0.1652ms 77.5121μs 12.9012 KOps/s 12.5391 KOps/s $\color{#35bf28}+2.89\%$
test_compile_copy_nested[pytree-eager] 0.1226ms 65.3761μs 15.2961 KOps/s 15.0415 KOps/s $\color{#35bf28}+1.69\%$
test_compile_add_one_flat[tensordict-compile] 0.2032ms 0.1069ms 9.3565 KOps/s 9.5089 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_one_flat[tensordict-eager] 0.4030ms 0.2160ms 4.6296 KOps/s 4.6363 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensorclass-compile] 0.1963ms 48.0967μs 20.7915 KOps/s 21.5145 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_add_one_flat[tensorclass-eager] 0.1424ms 66.4191μs 15.0559 KOps/s 14.9921 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_one_flat[pytree-compile] 0.2164ms 99.7040μs 10.0297 KOps/s 10.0162 KOps/s $\color{#35bf28}+0.14\%$
test_compile_add_one_flat[pytree-eager] 0.4672ms 0.2024ms 4.9397 KOps/s 4.9553 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_add_self_flat[tensordict-eager] 0.4952ms 0.2314ms 4.3210 KOps/s 4.3425 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[tensordict-compile] 0.2334ms 0.1112ms 8.9902 KOps/s 9.5904 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2609ms 63.9969μs 15.6257 KOps/s 16.2594 KOps/s $\color{#d91a1a}-3.90\%$
test_compile_add_self_flat[tensorclass-compile] 0.3170ms 48.7509μs 20.5125 KOps/s 21.3754 KOps/s $\color{#d91a1a}-4.04\%$
test_compile_add_self_flat[pytree-eager] 0.2507ms 0.1573ms 6.3566 KOps/s 6.3883 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[pytree-compile] 0.2461ms 0.1014ms 9.8609 KOps/s 10.1082 KOps/s $\color{#d91a1a}-2.45\%$
test_compile_copy_flat[tensordict-compile] 0.1021ms 21.0833μs 47.4309 KOps/s 48.0320 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_flat[tensordict-eager] 0.1577ms 66.8219μs 14.9652 KOps/s 15.2502 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_copy_flat[pytree-compile] 0.1593ms 81.2883μs 12.3019 KOps/s 12.4960 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_copy_flat[pytree-eager] 0.1540ms 67.1775μs 14.8859 KOps/s 15.0698 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_assign_and_add[tensordict-compile] 0.2867ms 0.2151ms 4.6491 KOps/s 4.7377 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_assign_and_add[tensordict-eager] 1.7632ms 1.3818ms 723.6811 Ops/s 722.0411 Ops/s $\color{#35bf28}+0.23\%$
test_compile_assign_and_add[pytree-compile] 0.3277ms 0.2108ms 4.7445 KOps/s 4.9025 KOps/s $\color{#d91a1a}-3.22\%$
test_compile_assign_and_add[pytree-eager] 0.9170ms 0.8261ms 1.2104 KOps/s 1.2112 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_assign_and_add_stack[compile] 0.5842ms 0.4586ms 2.1805 KOps/s 2.2314 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_assign_and_add_stack[eager] 3.5769ms 2.8235ms 354.1700 Ops/s 370.8073 Ops/s $\color{#d91a1a}-4.49\%$
test_compile_indexing[tensor-tensordict-compile] 0.1152ms 40.3059μs 24.8103 KOps/s 26.3024 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5659ms 33.7511μs 29.6286 KOps/s 30.1424 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_indexing[tensor-tensorclass-compile] 76.8130μs 31.5524μs 31.6933 KOps/s 32.7168 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1181ms 22.5860μs 44.2752 KOps/s 43.2656 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-pytree-compile] 0.1154ms 32.6349μs 30.6421 KOps/s 31.7593 KOps/s $\color{#d91a1a}-3.52\%$
test_compile_indexing[tensor-pytree-eager] 63.2280μs 22.1554μs 45.1356 KOps/s 43.1624 KOps/s $\color{#35bf28}+4.57\%$
test_compile_indexing[slice-tensordict-compile] 0.1114ms 53.0138μs 18.8630 KOps/s 18.6741 KOps/s $\color{#35bf28}+1.01\%$
test_compile_indexing[slice-tensordict-eager] 0.4613ms 20.0634μs 49.8420 KOps/s 49.2065 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[slice-tensorclass-compile] 95.1270μs 45.7950μs 21.8365 KOps/s 22.5420 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_indexing[slice-tensorclass-eager] 94.5490μs 18.9913μs 52.6558 KOps/s 53.4527 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_indexing[slice-pytree-compile] 0.1076ms 46.7625μs 21.3847 KOps/s 22.0683 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_indexing[slice-pytree-eager] 83.8960μs 18.2671μs 54.7433 KOps/s 53.6742 KOps/s $\color{#35bf28}+1.99\%$
test_compile_indexing[int-tensordict-compile] 0.1324ms 54.6673μs 18.2925 KOps/s 18.8128 KOps/s $\color{#d91a1a}-2.77\%$
test_compile_indexing[int-tensordict-eager] 0.8534ms 19.8560μs 50.3626 KOps/s 50.6036 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_indexing[int-tensorclass-compile] 0.1192ms 47.2124μs 21.1809 KOps/s 22.0102 KOps/s $\color{#d91a1a}-3.77\%$
test_compile_indexing[int-tensorclass-eager] 73.8170μs 18.5568μs 53.8886 KOps/s 53.4697 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[int-pytree-compile] 0.1169ms 47.2682μs 21.1559 KOps/s 21.9257 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_indexing[int-pytree-eager] 79.0770μs 18.1244μs 55.1743 KOps/s 53.7966 KOps/s $\color{#35bf28}+2.56\%$
test_mod_add[eager] 90.1880μs 37.4838μs 26.6782 KOps/s 28.2482 KOps/s $\textbf{\color{#d91a1a}-5.56\%}$
test_mod_add[compile] 0.1343ms 67.2020μs 14.8805 KOps/s 15.6907 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_mod_add[compile-overhead] 0.1225ms 63.2403μs 15.8127 KOps/s 15.8378 KOps/s $\color{#d91a1a}-0.16\%$
test_mod_wrap[eager] 0.4618ms 0.2248ms 4.4481 KOps/s 4.5261 KOps/s $\color{#d91a1a}-1.72\%$
test_mod_wrap[compile] 2.0559ms 0.2319ms 4.3124 KOps/s 4.4440 KOps/s $\color{#d91a1a}-2.96\%$
test_mod_wrap[compile-overhead] 0.3531ms 0.2264ms 4.4174 KOps/s 4.5451 KOps/s $\color{#d91a1a}-2.81\%$
test_mod_wrap_and_backward[eager] 16.7989ms 13.1256ms 76.1872 Ops/s 77.5695 Ops/s $\color{#d91a1a}-1.78\%$
test_mod_wrap_and_backward[compile] 13.4150ms 11.3727ms 87.9298 Ops/s 88.5616 Ops/s $\color{#d91a1a}-0.71\%$
test_mod_wrap_and_backward[compile-overhead] 23.5749ms 11.5792ms 86.3618 Ops/s 87.2851 Ops/s $\color{#d91a1a}-1.06\%$
test_seq_add[eager] 0.2161ms 0.1220ms 8.1979 KOps/s 8.2828 KOps/s $\color{#d91a1a}-1.02\%$
test_seq_add[compile] 0.1578ms 79.5699μs 12.5676 KOps/s 13.2035 KOps/s $\color{#d91a1a}-4.82\%$
test_seq_add[compile-overhead] 0.1375ms 78.0119μs 12.8186 KOps/s 13.4170 KOps/s $\color{#d91a1a}-4.46\%$
test_seq_wrap[eager] 0.7208ms 0.4484ms 2.2301 KOps/s 2.2284 KOps/s $\color{#35bf28}+0.08\%$
test_seq_wrap[compile] 0.8672ms 0.2471ms 4.0476 KOps/s 4.2309 KOps/s $\color{#d91a1a}-4.33\%$
test_seq_wrap[compile-overhead] 0.3958ms 0.2437ms 4.1038 KOps/s 4.2260 KOps/s $\color{#d91a1a}-2.89\%$
test_func_call_runtime[False-eager] 0.9624ms 0.5333ms 1.8750 KOps/s 1.8190 KOps/s $\color{#35bf28}+3.07\%$
test_func_call_runtime[False-compile] 0.8576ms 0.4461ms 2.2415 KOps/s 2.2973 KOps/s $\color{#d91a1a}-2.43\%$
test_func_call_runtime[False-compile-overhead] 0.8431ms 0.4454ms 2.2451 KOps/s 2.3153 KOps/s $\color{#d91a1a}-3.03\%$
test_func_call_runtime[True-eager] 0.8583ms 0.7457ms 1.3410 KOps/s 1.3364 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[True-compile] 0.7884ms 0.4666ms 2.1433 KOps/s 2.2272 KOps/s $\color{#d91a1a}-3.77\%$
test_func_call_runtime[True-compile-overhead] 0.7212ms 0.4666ms 2.1432 KOps/s 2.2284 KOps/s $\color{#d91a1a}-3.83\%$
test_func_call_cm_runtime[False-eager] 0.8586ms 0.5275ms 1.8958 KOps/s 1.8600 KOps/s $\color{#35bf28}+1.93\%$
test_func_call_cm_runtime[False-compile] 0.5655ms 0.4417ms 2.2639 KOps/s 2.3147 KOps/s $\color{#d91a1a}-2.19\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7470ms 0.4448ms 2.2482 KOps/s 2.3240 KOps/s $\color{#d91a1a}-3.26\%$
test_func_call_cm_runtime[True-eager] 1.4369ms 0.8966ms 1.1154 KOps/s 1.1146 KOps/s $\color{#35bf28}+0.07\%$
test_func_call_cm_runtime[True-compile] 1.2765ms 0.7944ms 1.2587 KOps/s 1.2538 KOps/s $\color{#35bf28}+0.40\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1562ms 0.7937ms 1.2600 KOps/s 1.2425 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_func_call_cm_runtime[eager] 2.6785ms 1.8930ms 528.2569 Ops/s 525.0149 Ops/s $\color{#35bf28}+0.62\%$
test_vmap_func_call_cm_runtime[compile] 0.6714ms 0.5446ms 1.8363 KOps/s 1.8703 KOps/s $\color{#d91a1a}-1.82\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.0250ms 0.5424ms 1.8437 KOps/s 1.8497 KOps/s $\color{#d91a1a}-0.33\%$
test_distributed 0.2519ms 0.1237ms 8.0861 KOps/s 7.8273 KOps/s $\color{#35bf28}+3.31\%$
test_tdmodule 65.8830μs 27.9351μs 35.7973 KOps/s 36.5724 KOps/s $\color{#d91a1a}-2.12\%$
test_tdmodule_dispatch 98.2130μs 51.0001μs 19.6078 KOps/s 20.3048 KOps/s $\color{#d91a1a}-3.43\%$
test_tdseq 46.8980μs 28.8126μs 34.7071 KOps/s 34.9633 KOps/s $\color{#d91a1a}-0.73\%$
test_tdseq_dispatch 90.4090μs 55.3411μs 18.0697 KOps/s 18.3816 KOps/s $\color{#d91a1a}-1.70\%$
test_instantiation_functorch 1.7533ms 1.5169ms 659.2431 Ops/s 660.2769 Ops/s $\color{#d91a1a}-0.16\%$
test_exec_functorch 0.3873ms 0.1775ms 5.6337 KOps/s 5.6425 KOps/s $\color{#d91a1a}-0.16\%$
test_exec_functional_call 0.4193ms 0.1701ms 5.8774 KOps/s 5.9593 KOps/s $\color{#d91a1a}-1.38\%$
test_exec_td_decorator 0.5216ms 0.2319ms 4.3117 KOps/s 4.1910 KOps/s $\color{#35bf28}+2.88\%$
test_vmap_mlp_speed_decorator[True-True] 0.9217ms 0.6575ms 1.5209 KOps/s 1.5271 KOps/s $\color{#d91a1a}-0.40\%$
test_vmap_mlp_speed_decorator[True-False] 1.0037ms 0.6666ms 1.5001 KOps/s 1.5321 KOps/s $\color{#d91a1a}-2.09\%$
test_vmap_mlp_speed_decorator[False-True] 0.7525ms 0.5265ms 1.8992 KOps/s 1.8471 KOps/s $\color{#35bf28}+2.82\%$
test_vmap_mlp_speed_decorator[False-False] 0.7769ms 0.5293ms 1.8892 KOps/s 1.9017 KOps/s $\color{#d91a1a}-0.65\%$
test_to_module_speed[True] 2.1472ms 1.3232ms 755.7208 Ops/s 757.5219 Ops/s $\color{#d91a1a}-0.24\%$
test_to_module_speed[False] 1.4540ms 1.2847ms 778.4065 Ops/s 769.2371 Ops/s $\color{#35bf28}+1.19\%$
test_tc_init 0.1013ms 48.4820μs 20.6262 KOps/s 22.1269 KOps/s $\textbf{\color{#d91a1a}-6.78\%}$
test_tc_init_nested 0.2215ms 96.3698μs 10.3767 KOps/s 11.0897 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_tc_first_layer_tensor 18.8250μs 1.5457μs 646.9390 KOps/s 632.3299 KOps/s $\color{#35bf28}+2.31\%$
test_tc_first_layer_nontensor 26.2980μs 4.6956μs 212.9650 KOps/s 215.6281 KOps/s $\color{#d91a1a}-1.24\%$
test_tc_second_layer_tensor 28.1920μs 2.9301μs 341.2892 KOps/s 343.7539 KOps/s $\color{#d91a1a}-0.72\%$
test_tc_second_layer_nontensor 27.1900μs 6.0048μs 166.5328 KOps/s 166.2796 KOps/s $\color{#35bf28}+0.15\%$
test_unbind 0.2415s 13.3193ms 75.0792 Ops/s 69.6191 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_full_like 10.0623ms 9.1869ms 108.8508 Ops/s 130.0277 Ops/s $\textbf{\color{#d91a1a}-16.29\%}$
test_zeros_like 5.4620ms 2.8719ms 348.1979 Ops/s 328.6395 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_ones_like 6.3716ms 3.5266ms 283.5588 Ops/s 277.0852 Ops/s $\color{#35bf28}+2.34\%$
test_clone 8.7912ms 7.0195ms 142.4597 Ops/s 178.4710 Ops/s $\textbf{\color{#d91a1a}-20.18\%}$
test_squeeze 60.3630μs 12.7242μs 78.5902 KOps/s 77.1991 KOps/s $\color{#35bf28}+1.80\%$
test_unsqueeze 0.3020ms 96.2662μs 10.3879 KOps/s 10.6901 KOps/s $\color{#d91a1a}-2.83\%$
test_split 0.3536ms 0.1984ms 5.0412 KOps/s 5.1385 KOps/s $\color{#d91a1a}-1.89\%$
test_permute 0.3256ms 0.2030ms 4.9271 KOps/s 4.9457 KOps/s $\color{#d91a1a}-0.37\%$
test_stack 33.5030ms 25.1353ms 39.7847 Ops/s 38.3561 Ops/s $\color{#35bf28}+3.72\%$
test_cat 25.6903ms 25.1244ms 39.8019 Ops/s 38.7463 Ops/s $\color{#35bf28}+2.72\%$
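
For readers checking the table above by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, and Ops appears to be the reciprocal of Mean. The sketch below reproduces two rows under that assumption. The helper names `relative_change` and `ops_from_mean` are hypothetical and are not taken from the benchmark bot, whose actual implementation is not shown in this thread.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD.

    Assumed formula, inferred from the table values rather than from
    the bot's source code.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def ops_from_mean(mean_seconds: float) -> float:
    """Operations per second implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


# test_plain_set_nested: 47.1972 KOps/s on the PR vs 47.4125 KOps/s on HEAD
print(f"{relative_change(47.1972, 47.4125):+.2f}%")

# test_update: 38.4742 KOps/s on the PR vs 45.1076 KOps/s on HEAD
print(f"{relative_change(38.4742, 45.1076):+.2f}%")

# Mean of 21.1877 microseconds expressed as KOps/s
print(f"{ops_from_mean(21.1877e-6) / 1e3:.4f} KOps/s")
```

A negative value means the PR run was slower than HEAD for that benchmark. The reported values are rounded, so recomputed percentages can differ in the last digit.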

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 25, 2025
ghstack-source-id: ea71af4f2eb5813bc5b25ed595edda0cf4fa1438
Pull Request resolved: #1236

github-actions bot commented Feb 25, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}42$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.3310μs 13.0562μs 76.5917 KOps/s 82.2422 KOps/s $\textbf{\color{#d91a1a}-6.87\%}$
test_plain_set_stack_nested 46.9210μs 13.2046μs 75.7311 KOps/s 81.8239 KOps/s $\textbf{\color{#d91a1a}-7.45\%}$
test_plain_set_nested_inplace 45.5710μs 14.2935μs 69.9618 KOps/s 75.1244 KOps/s $\textbf{\color{#d91a1a}-6.87\%}$
test_plain_set_stack_nested_inplace 43.0810μs 14.1508μs 70.6675 KOps/s 76.2230 KOps/s $\textbf{\color{#d91a1a}-7.29\%}$
test_items 29.7310μs 2.8711μs 348.3036 KOps/s 341.3312 KOps/s $\color{#35bf28}+2.04\%$
test_items_nested 0.4136ms 0.3640ms 2.7471 KOps/s 2.7415 KOps/s $\color{#35bf28}+0.20\%$
test_items_nested_locked 0.4106ms 0.3641ms 2.7465 KOps/s 2.7342 KOps/s $\color{#35bf28}+0.45\%$
test_items_nested_leaf 82.0420μs 60.1865μs 16.6150 KOps/s 16.5860 KOps/s $\color{#35bf28}+0.18\%$
test_items_stack_nested 0.4175ms 0.3608ms 2.7713 KOps/s 2.8062 KOps/s $\color{#d91a1a}-1.24\%$
test_items_stack_nested_leaf 83.7420μs 59.9731μs 16.6742 KOps/s 16.5464 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested_locked 0.4009ms 0.3630ms 2.7550 KOps/s 2.7567 KOps/s $\color{#d91a1a}-0.06\%$
test_keys 28.1500μs 3.3980μs 294.2924 KOps/s 292.9632 KOps/s $\color{#35bf28}+0.45\%$
test_keys_nested 0.1145ms 88.5165μs 11.2973 KOps/s 11.2489 KOps/s $\color{#35bf28}+0.43\%$
test_keys_nested_locked 0.7802ms 93.9334μs 10.6458 KOps/s 10.6213 KOps/s $\color{#35bf28}+0.23\%$
test_keys_nested_leaf 0.1114ms 79.5337μs 12.5733 KOps/s 12.5669 KOps/s $\color{#35bf28}+0.05\%$
test_keys_stack_nested 0.1222ms 87.9784μs 11.3664 KOps/s 11.3551 KOps/s $\color{#35bf28}+0.10\%$
test_keys_stack_nested_leaf 0.1664ms 78.9582μs 12.6649 KOps/s 12.7020 KOps/s $\color{#d91a1a}-0.29\%$
test_keys_stack_nested_locked 0.1201ms 93.3872μs 10.7081 KOps/s 10.5933 KOps/s $\color{#35bf28}+1.08\%$
test_values 5.3550μs 0.8519μs 1.1738 MOps/s 1.1696 MOps/s $\color{#35bf28}+0.36\%$
test_values_nested 62.8810μs 37.0518μs 26.9892 KOps/s 27.0121 KOps/s $\color{#d91a1a}-0.08\%$
test_values_nested_locked 66.3120μs 39.0041μs 25.6383 KOps/s 25.4905 KOps/s $\color{#35bf28}+0.58\%$
test_values_nested_leaf 66.8210μs 42.1176μs 23.7430 KOps/s 23.4027 KOps/s $\color{#35bf28}+1.45\%$
test_values_stack_nested 0.1312ms 37.0032μs 27.0247 KOps/s 26.7640 KOps/s $\color{#35bf28}+0.97\%$
test_values_stack_nested_leaf 86.0720μs 41.9533μs 23.8360 KOps/s 23.4783 KOps/s $\color{#35bf28}+1.52\%$
test_values_stack_nested_locked 63.8810μs 38.7440μs 25.8105 KOps/s 25.6085 KOps/s $\color{#35bf28}+0.79\%$
test_membership 4.3111μs 0.4986μs 2.0058 MOps/s 1.9871 MOps/s $\color{#35bf28}+0.94\%$
test_membership_nested 15.8405μs 1.9572μs 510.9226 KOps/s 515.5777 KOps/s $\color{#d91a1a}-0.90\%$
test_membership_nested_leaf 19.9405μs 1.9644μs 509.0641 KOps/s 514.0159 KOps/s $\color{#d91a1a}-0.96\%$
test_membership_stacked_nested 28.2510μs 2.0366μs 491.0203 KOps/s 489.1062 KOps/s $\color{#35bf28}+0.39\%$
test_membership_stacked_nested_leaf 26.6510μs 2.0277μs 493.1680 KOps/s 498.3859 KOps/s $\color{#d91a1a}-1.05\%$
test_membership_nested_last 30.9210μs 3.0908μs 323.5423 KOps/s 334.1193 KOps/s $\color{#d91a1a}-3.17\%$
test_membership_nested_leaf_last 29.4810μs 3.0513μs 327.7325 KOps/s 336.2933 KOps/s $\color{#d91a1a}-2.55\%$
test_membership_stacked_nested_last 35.2410μs 2.9951μs 333.8763 KOps/s 330.6845 KOps/s $\color{#35bf28}+0.97\%$
test_membership_stacked_nested_leaf_last 24.0400μs 2.9711μs 336.5728 KOps/s 338.8160 KOps/s $\color{#d91a1a}-0.66\%$
test_nested_getleaf 30.8500μs 6.2098μs 161.0347 KOps/s 160.8394 KOps/s $\color{#35bf28}+0.12\%$
test_nested_get 26.4700μs 5.9407μs 168.3293 KOps/s 169.3171 KOps/s $\color{#d91a1a}-0.58\%$
test_stacked_getleaf 34.9110μs 6.1109μs 163.6422 KOps/s 163.7771 KOps/s $\color{#d91a1a}-0.08\%$
test_stacked_get 31.9110μs 5.7964μs 172.5195 KOps/s 173.7930 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_getitemleaf 43.9010μs 6.4195μs 155.7764 KOps/s 154.3585 KOps/s $\color{#35bf28}+0.92\%$
test_nested_getitem 34.9900μs 6.0807μs 164.4549 KOps/s 163.6712 KOps/s $\color{#35bf28}+0.48\%$
test_stacked_getitemleaf 55.0010μs 6.2938μs 158.8874 KOps/s 156.7052 KOps/s $\color{#35bf28}+1.39\%$
test_stacked_getitem 35.2610μs 5.9920μs 166.8903 KOps/s 166.4132 KOps/s $\color{#35bf28}+0.29\%$
test_lock_nested 2.0289ms 0.3381ms 2.9576 KOps/s 3.0154 KOps/s $\color{#d91a1a}-1.92\%$
test_lock_stack_nested 0.3820ms 0.3407ms 2.9352 KOps/s 2.9437 KOps/s $\color{#d91a1a}-0.29\%$
test_unlock_nested 0.3499ms 0.2814ms 3.5535 KOps/s 3.6552 KOps/s $\color{#d91a1a}-2.78\%$
test_unlock_stack_nested 0.3415ms 0.2795ms 3.5782 KOps/s 3.6326 KOps/s $\color{#d91a1a}-1.50\%$
test_flatten_speed 0.1160ms 76.9065μs 13.0028 KOps/s 13.0102 KOps/s $\color{#d91a1a}-0.06\%$
test_unflatten_speed 0.3847ms 0.3191ms 3.1335 KOps/s 3.1521 KOps/s $\color{#d91a1a}-0.59\%$
test_common_ops 0.8062ms 0.6349ms 1.5751 KOps/s 1.7143 KOps/s $\textbf{\color{#d91a1a}-8.12\%}$
test_creation 0.1391ms 1.6975μs 589.0920 KOps/s 586.7819 KOps/s $\color{#35bf28}+0.39\%$
test_creation_empty 32.1200μs 9.6446μs 103.6852 KOps/s 132.8862 KOps/s $\textbf{\color{#d91a1a}-21.97\%}$
test_creation_nested_1 41.9610μs 11.3829μs 87.8513 KOps/s 110.5945 KOps/s $\textbf{\color{#d91a1a}-20.56\%}$
test_creation_nested_2 40.2010μs 13.9880μs 71.4896 KOps/s 84.2459 KOps/s $\textbf{\color{#d91a1a}-15.14\%}$
test_clone 56.7310μs 10.9061μs 91.6918 KOps/s 93.3755 KOps/s $\color{#d91a1a}-1.80\%$
test_getitem[int] 1.2683ms 10.5391μs 94.8848 KOps/s 94.9325 KOps/s $\color{#d91a1a}-0.05\%$
test_getitem[slice_int] 0.1194ms 20.6078μs 48.5253 KOps/s 49.6242 KOps/s $\color{#d91a1a}-2.21\%$
test_getitem[range] 0.1281ms 35.9271μs 27.8341 KOps/s 28.1202 KOps/s $\color{#d91a1a}-1.02\%$
test_getitem[tuple] 0.1113ms 17.7557μs 56.3199 KOps/s 56.3065 KOps/s $\color{#35bf28}+0.02\%$
test_getitem[list] 0.1256ms 32.0520μs 31.1993 KOps/s 31.5324 KOps/s $\color{#d91a1a}-1.06\%$
test_setitem_dim[int] 38.7900μs 19.1385μs 52.2506 KOps/s 53.0483 KOps/s $\color{#d91a1a}-1.50\%$
test_setitem_dim[slice_int] 57.8910μs 36.6035μs 27.3198 KOps/s 26.8280 KOps/s $\color{#35bf28}+1.83\%$
test_setitem_dim[range] 73.6110μs 50.7867μs 19.6902 KOps/s 19.4834 KOps/s $\color{#35bf28}+1.06\%$
test_setitem_dim[tuple] 53.9710μs 31.9656μs 31.2837 KOps/s 31.7877 KOps/s $\color{#d91a1a}-1.59\%$
test_setitem 69.8120μs 16.1835μs 61.7915 KOps/s 68.5281 KOps/s $\textbf{\color{#d91a1a}-9.83\%}$
test_set 70.6110μs 15.4804μs 64.5976 KOps/s 71.5481 KOps/s $\textbf{\color{#d91a1a}-9.71\%}$
test_set_shared 0.5099ms 0.1570ms 6.3695 KOps/s 6.4408 KOps/s $\color{#d91a1a}-1.11\%$
test_update 0.2337ms 19.5079μs 51.2612 KOps/s 59.5093 KOps/s $\textbf{\color{#d91a1a}-13.86\%}$
test_update_nested 79.9120μs 25.6475μs 38.9901 KOps/s 45.2673 KOps/s $\textbf{\color{#d91a1a}-13.87\%}$
test_update__nested 0.4973ms 24.9689μs 40.0498 KOps/s 39.6712 KOps/s $\color{#35bf28}+0.95\%$
test_set_nested 71.6810μs 17.1776μs 58.2154 KOps/s 64.2209 KOps/s $\textbf{\color{#d91a1a}-9.35\%}$
test_set_nested_new 78.8920μs 19.3755μs 51.6117 KOps/s 55.7376 KOps/s $\textbf{\color{#d91a1a}-7.40\%}$
test_select 90.7120μs 31.5851μs 31.6605 KOps/s 34.7791 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_select_nested 72.1510μs 43.4839μs 22.9970 KOps/s 22.8528 KOps/s $\color{#35bf28}+0.63\%$
test_exclude_nested 92.0320μs 61.5167μs 16.2558 KOps/s 16.4224 KOps/s $\color{#d91a1a}-1.01\%$
test_empty[True] 0.4003ms 0.2915ms 3.4309 KOps/s 3.4357 KOps/s $\color{#d91a1a}-0.14\%$
test_empty[False] 4.1220μs 0.8192μs 1.2207 MOps/s 1.2053 MOps/s $\color{#35bf28}+1.27\%$
test_to 88.0510μs 55.7634μs 17.9329 KOps/s 17.7252 KOps/s $\color{#35bf28}+1.17\%$
test_to_nonblocking 95.1720μs 47.0498μs 21.2541 KOps/s 21.1172 KOps/s $\color{#35bf28}+0.65\%$
test_unbind_speed 0.2745ms 0.2394ms 4.1770 KOps/s 4.2636 KOps/s $\color{#d91a1a}-2.03\%$
test_unbind_speed_stack0 0.2832ms 0.2341ms 4.2711 KOps/s 4.3028 KOps/s $\color{#d91a1a}-0.74\%$
test_unbind_speed_stack1 92.3942ms 0.7341ms 1.3622 KOps/s 1.3652 KOps/s $\color{#d91a1a}-0.22\%$
test_split 94.8370ms 1.5754ms 634.7607 Ops/s 642.3276 Ops/s $\color{#d91a1a}-1.18\%$
test_chunk 94.2265ms 1.5734ms 635.5625 Ops/s 638.8252 Ops/s $\color{#d91a1a}-0.51\%$
test_consolidate[False-None] 96.3946ms 2.9234ms 342.0616 Ops/s 344.7759 Ops/s $\color{#d91a1a}-0.79\%$
test_consolidate[default-None] 1.7658ms 1.6923ms 590.9120 Ops/s 607.8961 Ops/s $\color{#d91a1a}-2.79\%$
test_consolidate[reduce-overhead-None] 1.7539ms 1.6963ms 589.5296 Ops/s 589.6191 Ops/s $\color{#d91a1a}-0.02\%$
test_consolidate_njt[False-None] 6.7002ms 6.3927ms 156.4295 Ops/s 156.2226 Ops/s $\color{#35bf28}+0.13\%$
test_to[False-False-None] 1.8348ms 1.7459ms 572.7753 Ops/s 582.5018 Ops/s $\color{#d91a1a}-1.67\%$
test_to[True-False-None] 1.5569ms 1.3294ms 752.1968 Ops/s 774.7796 Ops/s $\color{#d91a1a}-2.91\%$
test_to[within-False-None] 4.3408ms 4.1261ms 242.3574 Ops/s 247.5707 Ops/s $\color{#d91a1a}-2.11\%$
test_to[True-default-None] 5.7207ms 5.2966ms 188.7991 Ops/s 193.2427 Ops/s $\color{#d91a1a}-2.30\%$
test_to_njt[False-False-None] 6.9073ms 6.7709ms 147.6906 Ops/s 142.1001 Ops/s $\color{#35bf28}+3.93\%$
test_to_njt[True-False-None] 5.5416ms 5.3551ms 186.7367 Ops/s 177.3835 Ops/s $\textbf{\color{#35bf28}+5.27\%}$
test_to_njt[within-False-None] 12.0294ms 11.5749ms 86.3936 Ops/s 83.9690 Ops/s $\color{#35bf28}+2.89\%$
test_creation[device0] 0.5483ms 82.6365μs 12.1012 KOps/s 12.1267 KOps/s $\color{#d91a1a}-0.21\%$
test_creation_from_tensor 0.5075ms 84.7327μs 11.8018 KOps/s 11.5977 KOps/s $\color{#35bf28}+1.76\%$
test_add_one[memmap_tensor0] 0.4423ms 6.7959μs 147.1475 KOps/s 150.1190 KOps/s $\color{#d91a1a}-1.98\%$
test_contiguous[memmap_tensor0] 2.2291μs 0.4634μs 2.1578 MOps/s 2.4155 MOps/s $\textbf{\color{#d91a1a}-10.67\%}$
test_stack[memmap_tensor0] 43.3610μs 4.2608μs 234.6952 KOps/s 236.5954 KOps/s $\color{#d91a1a}-0.80\%$
test_memmaptd_index 1.5224ms 0.2402ms 4.1636 KOps/s 4.2226 KOps/s $\color{#d91a1a}-1.40\%$
test_memmaptd_index_astensor 0.4317ms 0.3027ms 3.3032 KOps/s 3.3801 KOps/s $\color{#d91a1a}-2.27\%$
test_memmaptd_index_op 0.7168ms 0.5865ms 1.7050 KOps/s 1.8139 KOps/s $\textbf{\color{#d91a1a}-6.00\%}$
test_serialize_model 0.1315s 0.1307s 7.6489 Ops/s 7.6291 Ops/s $\color{#35bf28}+0.26\%$
test_serialize_model_pickle 1.3508s 1.2113s 0.8256 Ops/s 0.8441 Ops/s $\color{#d91a1a}-2.19\%$
test_serialize_weights 0.2818s 0.1516s 6.5953 Ops/s 7.6659 Ops/s $\textbf{\color{#d91a1a}-13.97\%}$
test_serialize_weights_returnearly 0.3313s 53.4598ms 18.7056 Ops/s 11.7420 Ops/s $\textbf{\color{#35bf28}+59.30\%}$
test_serialize_weights_pickle 1.3834s 1.1916s 0.8392 Ops/s 0.8219 Ops/s $\color{#35bf28}+2.11\%$
test_reshape_pytree 68.9110μs 23.3210μs 42.8797 KOps/s 45.8770 KOps/s $\textbf{\color{#d91a1a}-6.53\%}$
test_reshape_td 62.3310μs 29.5097μs 33.8872 KOps/s 38.1605 KOps/s $\textbf{\color{#d91a1a}-11.20\%}$
test_view_pytree 57.2510μs 22.9984μs 43.4813 KOps/s 46.9702 KOps/s $\textbf{\color{#d91a1a}-7.43\%}$
test_view_td 69.9210μs 34.8409μs 28.7019 KOps/s 30.5299 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_unbind_pytree 60.4510μs 29.3571μs 34.0634 KOps/s 36.1095 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_unbind_td 0.6892ms 37.2687μs 26.8321 KOps/s 27.1219 KOps/s $\color{#d91a1a}-1.07\%$
test_split_pytree 64.9910μs 32.1113μs 31.1416 KOps/s 33.6919 KOps/s $\textbf{\color{#d91a1a}-7.57\%}$
test_split_td 0.8228ms 37.2050μs 26.8781 KOps/s 25.7887 KOps/s $\color{#35bf28}+4.22\%$
test_add_pytree 73.0420μs 34.7075μs 28.8122 KOps/s 29.6139 KOps/s $\color{#d91a1a}-2.71\%$
test_add_td 0.1918ms 53.6110μs 18.6529 KOps/s 21.3055 KOps/s $\textbf{\color{#d91a1a}-12.45\%}$
test_compile_add_one_nested[tensordict-compile] 0.1703ms 0.1187ms 8.4254 KOps/s 8.5080 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_add_one_nested[tensordict-eager] 0.2225ms 0.1320ms 7.5754 KOps/s 7.5613 KOps/s $\color{#35bf28}+0.19\%$
test_compile_add_one_nested[pytree-compile] 0.1520ms 92.2934μs 10.8350 KOps/s 10.8608 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_add_one_nested[pytree-eager] 0.3263ms 0.1460ms 6.8486 KOps/s 6.7671 KOps/s $\color{#35bf28}+1.20\%$
test_compile_copy_nested[tensordict-compile] 69.3610μs 31.9321μs 31.3165 KOps/s 34.7168 KOps/s $\textbf{\color{#d91a1a}-9.79\%}$
test_compile_copy_nested[tensordict-eager] 69.8320μs 29.2330μs 34.2079 KOps/s 34.1776 KOps/s $\color{#35bf28}+0.09\%$
test_compile_copy_nested[pytree-compile] 0.4484ms 62.6460μs 15.9627 KOps/s 15.7978 KOps/s $\color{#35bf28}+1.04\%$
test_compile_copy_nested[pytree-eager] 94.5720μs 48.2943μs 20.7064 KOps/s 20.4458 KOps/s $\color{#35bf28}+1.27\%$
test_compile_add_one_flat[tensordict-compile] 0.1757ms 0.1369ms 7.3069 KOps/s 7.2967 KOps/s $\color{#35bf28}+0.14\%$
test_compile_add_one_flat[tensordict-eager] 0.3071ms 0.2108ms 4.7427 KOps/s 4.6746 KOps/s $\color{#35bf28}+1.46\%$
test_compile_add_one_flat[tensorclass-compile] 0.1442ms 98.2315μs 10.1800 KOps/s 10.6656 KOps/s $\color{#d91a1a}-4.55\%$
test_compile_add_one_flat[tensorclass-eager] 0.1185ms 55.5520μs 18.0011 KOps/s 18.5298 KOps/s $\color{#d91a1a}-2.85\%$
test_compile_add_one_flat[pytree-compile] 0.1737ms 0.1313ms 7.6171 KOps/s 7.5662 KOps/s $\color{#35bf28}+0.67\%$
test_compile_add_one_flat[pytree-eager] 0.5273ms 0.4745ms 2.1077 KOps/s 2.1031 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_self_flat[tensordict-eager] 0.3987ms 0.2582ms 3.8732 KOps/s 3.8748 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_add_self_flat[tensordict-compile] 0.1855ms 0.1419ms 7.0494 KOps/s 7.2448 KOps/s $\color{#d91a1a}-2.70\%$
test_compile_add_self_flat[tensorclass-eager] 0.1652ms 68.9438μs 14.5046 KOps/s 15.1002 KOps/s $\color{#d91a1a}-3.94\%$
test_compile_add_self_flat[tensorclass-compile] 0.1402ms 99.7239μs 10.0277 KOps/s 10.5395 KOps/s $\color{#d91a1a}-4.86\%$
test_compile_add_self_flat[pytree-eager] 0.4503ms 0.4014ms 2.4912 KOps/s 2.4656 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_self_flat[pytree-compile] 0.1683ms 0.1313ms 7.6133 KOps/s 7.5945 KOps/s $\color{#35bf28}+0.25\%$
test_compile_copy_flat[tensordict-compile] 47.2510μs 18.5092μs 54.0273 KOps/s 39.4582 KOps/s $\textbf{\color{#35bf28}+36.92\%}$
test_compile_copy_flat[tensordict-eager] 59.5110μs 30.9793μs 32.2796 KOps/s 31.4080 KOps/s $\color{#35bf28}+2.78\%$
test_compile_copy_flat[pytree-compile] 0.1038ms 68.5864μs 14.5801 KOps/s 14.7650 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_flat[pytree-eager] 85.8710μs 51.7739μs 19.3148 KOps/s 19.4673 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_assign_and_add[tensordict-compile] 1.5674ms 0.3816ms 2.6202 KOps/s 2.2026 KOps/s $\textbf{\color{#35bf28}+18.96\%}$
test_compile_assign_and_add[tensordict-eager] 2.7875ms 2.6284ms 380.4652 Ops/s 369.4972 Ops/s $\color{#35bf28}+2.97\%$
test_compile_assign_and_add[pytree-compile] 1.5609ms 0.3753ms 2.6646 KOps/s 2.3458 KOps/s $\textbf{\color{#35bf28}+13.59\%}$
test_compile_assign_and_add[pytree-eager] 2.9763ms 2.7269ms 366.7214 Ops/s 383.4566 Ops/s $\color{#d91a1a}-4.36\%$
test_compile_indexing[tensor-tensordict-compile] 0.1895ms 0.1183ms 8.4547 KOps/s 9.0064 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5606ms 83.8706μs 11.9231 KOps/s 12.0134 KOps/s $\color{#d91a1a}-0.75\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1814ms 0.1156ms 8.6529 KOps/s 9.6852 KOps/s $\textbf{\color{#d91a1a}-10.66\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1247ms 72.0712μs 13.8752 KOps/s 14.8284 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_compile_indexing[tensor-pytree-compile] 0.1632ms 0.1141ms 8.7671 KOps/s 9.6096 KOps/s $\textbf{\color{#d91a1a}-8.77\%}$
test_compile_indexing[tensor-pytree-eager] 0.1160ms 71.6943μs 13.9481 KOps/s 14.9899 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_compile_indexing[slice-tensordict-compile] 0.1427ms 97.7247μs 10.2328 KOps/s 10.2430 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[slice-tensordict-eager] 0.1581ms 17.0170μs 58.7647 KOps/s 58.5538 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[slice-tensorclass-compile] 0.1397ms 93.9723μs 10.6414 KOps/s 10.6980 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[slice-tensorclass-eager] 0.1174ms 19.7042μs 50.7507 KOps/s 65.2124 KOps/s $\textbf{\color{#d91a1a}-22.18\%}$
test_compile_indexing[slice-pytree-compile] 0.1559ms 97.4108μs 10.2658 KOps/s 10.6556 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_indexing[slice-pytree-eager] 53.2710μs 16.3605μs 61.1230 KOps/s 65.0199 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_compile_indexing[int-tensordict-compile] 0.1533ms 0.1052ms 9.5043 KOps/s 9.8452 KOps/s $\color{#d91a1a}-3.46\%$
test_compile_indexing[int-tensordict-eager] 0.5854ms 17.9651μs 55.6635 KOps/s 60.1399 KOps/s $\textbf{\color{#d91a1a}-7.44\%}$
test_compile_indexing[int-tensorclass-compile] 0.1519ms 97.8329μs 10.2215 KOps/s 10.6591 KOps/s $\color{#d91a1a}-4.11\%$
test_compile_indexing[int-tensorclass-eager] 52.4810μs 16.4781μs 60.6868 KOps/s 64.7656 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_compile_indexing[int-pytree-compile] 0.1574ms 99.1330μs 10.0875 KOps/s 10.6717 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_indexing[int-pytree-eager] 50.7510μs 16.0335μs 62.3696 KOps/s 64.9530 KOps/s $\color{#d91a1a}-3.98\%$
test_mod_add[eager] 80.3410μs 38.8662μs 25.7293 KOps/s 26.6322 KOps/s $\color{#d91a1a}-3.39\%$
test_mod_add[compile] 0.1240ms 78.8060μs 12.6894 KOps/s 12.5547 KOps/s $\color{#35bf28}+1.07\%$
test_mod_add[compile-overhead] 0.3139ms 0.1635ms 6.1168 KOps/s 5.7059 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_mod_wrap[eager] 0.3205ms 0.2430ms 4.1158 KOps/s 3.8509 KOps/s $\textbf{\color{#35bf28}+6.88\%}$
test_mod_wrap[compile] 0.3263ms 0.2785ms 3.5908 KOps/s 3.3647 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_mod_wrap[compile-overhead] 7.4311ms 3.8903ms 257.0501 Ops/s 262.1870 Ops/s $\color{#d91a1a}-1.96\%$
test_mod_wrap_and_backward[eager] 1.5473ms 1.4289ms 699.8575 Ops/s 684.8577 Ops/s $\color{#35bf28}+2.19\%$
test_mod_wrap_and_backward[compile] 1.4425ms 1.3386ms 747.0351 Ops/s 737.9138 Ops/s $\color{#35bf28}+1.24\%$
test_mod_wrap_and_backward[compile-overhead] 1.4893ms 1.0128ms 987.3971 Ops/s 964.9581 Ops/s $\color{#35bf28}+2.33\%$
test_seq_add[eager] 0.1752ms 0.1165ms 8.5872 KOps/s 8.4315 KOps/s $\color{#35bf28}+1.85\%$
test_seq_add[compile] 0.1460ms 87.5439μs 11.4228 KOps/s 10.7457 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_seq_add[compile-overhead] 0.1838ms 0.1278ms 7.8274 KOps/s 7.6610 KOps/s $\color{#35bf28}+2.17\%$
test_seq_wrap[eager] 0.4839ms 0.4203ms 2.3794 KOps/s 2.2776 KOps/s $\color{#35bf28}+4.47\%$
test_seq_wrap[compile] 0.3611ms 0.2967ms 3.3701 KOps/s 3.1571 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_seq_wrap[compile-overhead] 0.2807ms 0.2281ms 4.3846 KOps/s 4.3722 KOps/s $\color{#35bf28}+0.28\%$
test_func_call_runtime[False-eager] 0.8448ms 0.7550ms 1.3245 KOps/s 1.2584 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_func_call_runtime[False-compile] 0.9176ms 0.7278ms 1.3740 KOps/s 1.3593 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_runtime[False-compile-overhead] 0.4090ms 0.3560ms 2.8093 KOps/s 2.8177 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_runtime[True-eager] 0.9931ms 0.8850ms 1.1300 KOps/s 1.1005 KOps/s $\color{#35bf28}+2.68\%$
test_func_call_runtime[True-compile] 0.8713ms 0.7868ms 1.2710 KOps/s 1.3216 KOps/s $\color{#d91a1a}-3.82\%$
test_func_call_runtime[True-compile-overhead] 0.4424ms 0.3745ms 2.6703 KOps/s 2.6910 KOps/s $\color{#d91a1a}-0.77\%$
test_func_call_cm_runtime[False-eager] 0.7654ms 0.7193ms 1.3902 KOps/s 1.3650 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[False-compile] 0.8468ms 0.7411ms 1.3493 KOps/s 1.3580 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4407ms 0.3562ms 2.8072 KOps/s 2.8273 KOps/s $\color{#d91a1a}-0.71\%$
test_func_call_cm_runtime[True-eager] 1.0604ms 0.9852ms 1.0150 KOps/s 993.9234 Ops/s $\color{#35bf28}+2.12\%$
test_func_call_cm_runtime[True-compile] 1.0960ms 0.9755ms 1.0251 KOps/s 1.0114 KOps/s $\color{#35bf28}+1.36\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0956ms 0.9716ms 1.0292 KOps/s 1.0045 KOps/s $\color{#35bf28}+2.46\%$
test_vmap_func_call_cm_runtime[eager] 2.4944ms 2.0726ms 482.4938 Ops/s 470.8302 Ops/s $\color{#35bf28}+2.48\%$
test_vmap_func_call_cm_runtime[compile] 0.8980ms 0.7983ms 1.2527 KOps/s 1.2624 KOps/s $\color{#d91a1a}-0.77\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5548ms 0.4085ms 2.4480 KOps/s 2.4542 KOps/s $\color{#d91a1a}-0.26\%$
test_distributed 2.8786ms 0.1872ms 5.3427 KOps/s 8.2695 KOps/s $\textbf{\color{#d91a1a}-35.39\%}$
test_tdmodule 30.9610μs 21.3521μs 46.8338 KOps/s 49.5152 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_tdmodule_dispatch 59.9610μs 37.5030μs 26.6645 KOps/s 28.1643 KOps/s $\textbf{\color{#d91a1a}-5.33\%}$
test_tdseq 41.0510μs 21.1444μs 47.2939 KOps/s 49.2647 KOps/s $\color{#d91a1a}-4.00\%$
test_tdseq_dispatch 70.9610μs 40.2655μs 24.8352 KOps/s 26.3837 KOps/s $\textbf{\color{#d91a1a}-5.87\%}$
test_instantiation_functorch 1.6541ms 1.5304ms 653.4300 Ops/s 658.6646 Ops/s $\color{#d91a1a}-0.79\%$
test_exec_functorch 0.1923ms 0.1426ms 7.0103 KOps/s 7.0900 KOps/s $\color{#d91a1a}-1.12\%$
test_exec_functional_call 0.2137ms 0.1358ms 7.3625 KOps/s 7.4880 KOps/s $\color{#d91a1a}-1.68\%$
test_exec_td_decorator 0.3752ms 0.1867ms 5.3553 KOps/s 5.4679 KOps/s $\color{#d91a1a}-2.06\%$
test_vmap_mlp_speed_decorator[True-True] 0.7476ms 0.6818ms 1.4667 KOps/s 1.4677 KOps/s $\color{#d91a1a}-0.07\%$
test_vmap_mlp_speed_decorator[True-False] 0.8162ms 0.6811ms 1.4683 KOps/s 1.4726 KOps/s $\color{#d91a1a}-0.29\%$
test_vmap_mlp_speed_decorator[False-True] 0.7322ms 0.5883ms 1.6998 KOps/s 1.6920 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[False-False] 0.7053ms 0.5906ms 1.6932 KOps/s 1.6894 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_transformer_speed_decorator[True-True] 19.1390ms 19.0740ms 52.4273 Ops/s 52.1690 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed_decorator[True-False] 19.9766ms 19.1738ms 52.1544 Ops/s 52.1292 Ops/s $\color{#35bf28}+0.05\%$
test_vmap_transformer_speed_decorator[False-True] 19.0351ms 18.9221ms 52.8483 Ops/s 52.4749 Ops/s $\color{#35bf28}+0.71\%$
test_vmap_transformer_speed_decorator[False-False] 19.7953ms 18.9652ms 52.7281 Ops/s 52.5432 Ops/s $\color{#35bf28}+0.35\%$
test_to_module_speed[True] 1.0605ms 0.9646ms 1.0367 KOps/s 1.0457 KOps/s $\color{#d91a1a}-0.86\%$
test_to_module_speed[False] 1.2834ms 0.9551ms 1.0470 KOps/s 1.0591 KOps/s $\color{#d91a1a}-1.15\%$
test_tc_init 62.1910μs 35.7775μs 27.9505 KOps/s 28.1726 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_init_nested 0.1563ms 72.2190μs 13.8468 KOps/s 14.3370 KOps/s $\color{#d91a1a}-3.42\%$
test_tc_first_layer_tensor 3.8643μs 0.7071μs 1.4142 MOps/s 1.2606 MOps/s $\textbf{\color{#35bf28}+12.19\%}$
test_tc_first_layer_nontensor 39.5700μs 2.2217μs 450.1067 KOps/s 450.6666 KOps/s $\color{#d91a1a}-0.12\%$
test_tc_second_layer_tensor 18.8753μs 1.4151μs 706.6870 KOps/s 706.4613 KOps/s $\color{#35bf28}+0.03\%$
test_tc_second_layer_nontensor 0.2034ms 2.9569μs 338.1888 KOps/s 339.4571 KOps/s $\color{#d91a1a}-0.37\%$
test_unbind 0.2180s 10.0211ms 99.7894 Ops/s 145.6810 Ops/s $\textbf{\color{#d91a1a}-31.50\%}$
test_full_like 10.1893ms 9.0884ms 110.0308 Ops/s 108.7642 Ops/s $\color{#35bf28}+1.16\%$
test_zeros_like 11.5338ms 8.5979ms 116.3079 Ops/s 234.6017 Ops/s $\textbf{\color{#d91a1a}-50.42\%}$
test_ones_like 5.1121ms 4.3269ms 231.1121 Ops/s 235.2688 Ops/s $\color{#d91a1a}-1.77\%$
test_clone 6.6580ms 6.3468ms 157.5607 Ops/s 110.0591 Ops/s $\textbf{\color{#35bf28}+43.16\%}$
test_squeeze 58.0920μs 9.9025μs 100.9844 KOps/s 105.6169 KOps/s $\color{#d91a1a}-4.39\%$
test_unsqueeze 0.1691ms 72.2314μs 13.8444 KOps/s 13.5176 KOps/s $\color{#35bf28}+2.42\%$
test_split 0.3738ms 0.1573ms 6.3584 KOps/s 6.2375 KOps/s $\color{#35bf28}+1.94\%$
test_permute 0.2324ms 0.1832ms 5.4596 KOps/s 5.4333 KOps/s $\color{#35bf28}+0.48\%$
test_stack 50.2610ms 50.0133ms 19.9947 Ops/s 19.9679 Ops/s $\color{#35bf28}+0.13\%$
test_cat 50.4678ms 49.9448ms 20.0221 Ops/s 20.0147 Ops/s $\color{#35bf28}+0.04\%$
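
For readers checking the table, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for three rows copied from the table. The function name `relative_change` is hypothetical and is not part of the benchmark tooling, and the formula is an assumption inferred from the numbers, not taken from the bot's source. Both values must be in the same unit before comparing, since some rows mix KOps/s and Ops/s.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD (assumed formula)."""
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table; units match within each row.
rows = {
    "test_setitem": (61.7915, 68.5281),       # KOps/s
    "test_zeros_like": (116.3079, 234.6017),  # Ops/s
    "test_clone": (157.5607, 110.0591),       # Ops/s
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

Under that assumption, the computed values agree with the reported -9.83%, -50.42%, and +43.16% for these rows.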

[ghstack-poisoned]
@vmoens vmoens merged commit 5091e16 into gh/vmoens/48/base Feb 26, 2025
44 of 53 checks passed
vmoens added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 8e47f46e83982d554237604f6ef7c845eeed1b50
Pull Request resolved: #1236
@vmoens vmoens deleted the gh/vmoens/48/head branch February 26, 2025 11:03
vmoens added a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: 8e47f46e83982d554237604f6ef7c845eeed1b50
Pull Request resolved: #1236

(cherry picked from commit 635c9c0)