[BugFix] Fix non-deterministic key order in stack #1230

Merged
merged 2 commits into from
Feb 24, 2025

@vmoens vmoens commented Feb 22, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 22, 2025
ghstack-source-id: 7f394789b783d6359a78a300aaf449eb25adb5e3
Pull Request resolved: #1230
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 22, 2025

github-actions bot commented Feb 22, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}3$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 43.2110μs 20.4436μs 48.9150 KOps/s 48.7299 KOps/s $\color{#35bf28}+0.38\%$
test_plain_set_stack_nested 50.6950μs 20.8489μs 47.9642 KOps/s 47.9784 KOps/s $\color{#d91a1a}-0.03\%$
test_plain_set_nested_inplace 0.1050ms 22.2425μs 44.9590 KOps/s 44.6018 KOps/s $\color{#35bf28}+0.80\%$
test_plain_set_stack_nested_inplace 0.1024ms 22.1717μs 45.1026 KOps/s 44.0466 KOps/s $\color{#35bf28}+2.40\%$
test_items 18.6050μs 4.0844μs 244.8367 KOps/s 243.1780 KOps/s $\color{#35bf28}+0.68\%$
test_items_nested 0.6702ms 0.4055ms 2.4660 KOps/s 2.4693 KOps/s $\color{#d91a1a}-0.13\%$
test_items_nested_locked 0.6818ms 0.4027ms 2.4832 KOps/s 2.4654 KOps/s $\color{#35bf28}+0.72\%$
test_items_nested_leaf 0.1840ms 75.1847μs 13.3006 KOps/s 13.0728 KOps/s $\color{#35bf28}+1.74\%$
test_items_stack_nested 0.7141ms 0.4055ms 2.4662 KOps/s 2.4575 KOps/s $\color{#35bf28}+0.35\%$
test_items_stack_nested_leaf 0.1354ms 76.3399μs 13.0993 KOps/s 12.7170 KOps/s $\color{#35bf28}+3.01\%$
test_items_stack_nested_locked 0.5948ms 0.4030ms 2.4812 KOps/s 2.4415 KOps/s $\color{#35bf28}+1.62\%$
test_keys 40.4360μs 3.4818μs 287.2101 KOps/s 290.9562 KOps/s $\color{#d91a1a}-1.29\%$
test_keys_nested 0.3000ms 0.1635ms 6.1147 KOps/s 5.9699 KOps/s $\color{#35bf28}+2.42\%$
test_keys_nested_locked 1.8654ms 0.1711ms 5.8455 KOps/s 5.8387 KOps/s $\color{#35bf28}+0.12\%$
test_keys_nested_leaf 0.2666ms 0.1426ms 7.0137 KOps/s 6.7610 KOps/s $\color{#35bf28}+3.74\%$
test_keys_stack_nested 0.2633ms 0.1620ms 6.1741 KOps/s 6.1400 KOps/s $\color{#35bf28}+0.56\%$
test_keys_stack_nested_leaf 0.2572ms 0.1429ms 6.9999 KOps/s 7.0285 KOps/s $\color{#d91a1a}-0.41\%$
test_keys_stack_nested_locked 0.2537ms 0.1693ms 5.9066 KOps/s 5.8518 KOps/s $\color{#35bf28}+0.94\%$
test_values 8.7042μs 1.0465μs 955.5870 KOps/s 980.5448 KOps/s $\color{#d91a1a}-2.55\%$
test_values_nested 0.1368ms 62.2929μs 16.0532 KOps/s 15.8145 KOps/s $\color{#35bf28}+1.51\%$
test_values_nested_locked 0.1396ms 64.0850μs 15.6043 KOps/s 15.9154 KOps/s $\color{#d91a1a}-1.95\%$
test_values_nested_leaf 0.1084ms 70.8036μs 14.1236 KOps/s 13.9612 KOps/s $\color{#35bf28}+1.16\%$
test_values_stack_nested 0.1075ms 61.8358μs 16.1719 KOps/s 15.8101 KOps/s $\color{#35bf28}+2.29\%$
test_values_stack_nested_leaf 0.1343ms 71.7092μs 13.9452 KOps/s 13.6844 KOps/s $\color{#35bf28}+1.91\%$
test_values_stack_nested_locked 0.1175ms 61.8034μs 16.1803 KOps/s 14.2689 KOps/s $\textbf{\color{#35bf28}+13.40\%}$
test_membership 13.4860μs 0.8556μs 1.1688 MOps/s 1.1658 MOps/s $\color{#35bf28}+0.25\%$
test_membership_nested 0.1267ms 2.9733μs 336.3279 KOps/s 347.6711 KOps/s $\color{#d91a1a}-3.26\%$
test_membership_nested_leaf 43.0100μs 2.8839μs 346.7516 KOps/s 345.1824 KOps/s $\color{#35bf28}+0.45\%$
test_membership_stacked_nested 20.2370μs 2.8350μs 352.7368 KOps/s 350.6887 KOps/s $\color{#35bf28}+0.58\%$
test_membership_stacked_nested_leaf 23.7350μs 2.8508μs 350.7807 KOps/s 350.4104 KOps/s $\color{#35bf28}+0.11\%$
test_membership_nested_last 40.3360μs 4.2915μs 233.0208 KOps/s 230.2049 KOps/s $\color{#35bf28}+1.22\%$
test_membership_nested_leaf_last 53.6390μs 4.3530μs 229.7269 KOps/s 227.0836 KOps/s $\color{#35bf28}+1.16\%$
test_membership_stacked_nested_last 65.3580μs 4.2370μs 236.0181 KOps/s 163.0227 KOps/s $\textbf{\color{#35bf28}+44.78\%}$
test_membership_stacked_nested_leaf_last 26.0480μs 4.2760μs 233.8611 KOps/s 163.6846 KOps/s $\textbf{\color{#35bf28}+42.87\%}$
test_nested_getleaf 0.1174ms 10.6863μs 93.5776 KOps/s 96.1400 KOps/s $\color{#d91a1a}-2.67\%$
test_nested_get 67.1050μs 10.0320μs 99.6808 KOps/s 100.1385 KOps/s $\color{#d91a1a}-0.46\%$
test_stacked_getleaf 55.9140μs 10.4238μs 95.9342 KOps/s 95.0175 KOps/s $\color{#35bf28}+0.96\%$
test_stacked_get 57.4580μs 9.9430μs 100.5733 KOps/s 101.4031 KOps/s $\color{#d91a1a}-0.82\%$
test_nested_getitemleaf 49.0510μs 11.0856μs 90.2073 KOps/s 89.4288 KOps/s $\color{#35bf28}+0.87\%$
test_nested_getitem 53.0480μs 10.7146μs 93.3306 KOps/s 93.2182 KOps/s $\color{#35bf28}+0.12\%$
test_stacked_getitemleaf 76.4720μs 11.0340μs 90.6287 KOps/s 89.6607 KOps/s $\color{#35bf28}+1.08\%$
test_stacked_getitem 48.7910μs 10.5963μs 94.3725 KOps/s 93.2167 KOps/s $\color{#35bf28}+1.24\%$
test_lock_nested 0.6604ms 0.4121ms 2.4268 KOps/s 2.3660 KOps/s $\color{#35bf28}+2.57\%$
test_lock_stack_nested 0.8698ms 0.4277ms 2.3382 KOps/s 2.3118 KOps/s $\color{#35bf28}+1.14\%$
test_unlock_nested 0.5307ms 0.3404ms 2.9375 KOps/s 2.8297 KOps/s $\color{#35bf28}+3.81\%$
test_unlock_stack_nested 0.6680ms 0.3457ms 2.8929 KOps/s 2.8521 KOps/s $\color{#35bf28}+1.43\%$
test_flatten_speed 0.2052ms 99.1298μs 10.0878 KOps/s 10.0768 KOps/s $\color{#35bf28}+0.11\%$
test_unflatten_speed 0.8703ms 0.5147ms 1.9427 KOps/s 1.9216 KOps/s $\color{#35bf28}+1.10\%$
test_common_ops 1.0171ms 0.8069ms 1.2393 KOps/s 1.2062 KOps/s $\color{#35bf28}+2.75\%$
test_creation 24.5360μs 2.4649μs 405.6976 KOps/s 403.1829 KOps/s $\color{#35bf28}+0.62\%$
test_creation_empty 55.2330μs 12.0572μs 82.9379 KOps/s 80.7407 KOps/s $\color{#35bf28}+2.72\%$
test_creation_nested_1 48.6010μs 14.9599μs 66.8453 KOps/s 65.8944 KOps/s $\color{#35bf28}+1.44\%$
test_creation_nested_2 71.2430μs 19.3299μs 51.7334 KOps/s 49.1155 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_clone 0.1324ms 13.8149μs 72.3856 KOps/s 74.0819 KOps/s $\color{#d91a1a}-2.29\%$
test_getitem[int] 0.9111ms 12.6159μs 79.2649 KOps/s 75.1468 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_getitem[slice_int] 0.1437ms 24.2870μs 41.1743 KOps/s 39.6051 KOps/s $\color{#35bf28}+3.96\%$
test_getitem[range] 0.2091ms 49.6223μs 20.1522 KOps/s 19.0930 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_getitem[tuple] 0.1613ms 21.1857μs 47.2016 KOps/s 48.0793 KOps/s $\color{#d91a1a}-1.83\%$
test_getitem[list] 0.1951ms 45.0990μs 22.1734 KOps/s 21.3076 KOps/s $\color{#35bf28}+4.06\%$
test_setitem_dim[int] 59.0400μs 24.4546μs 40.8922 KOps/s 37.8227 KOps/s $\textbf{\color{#35bf28}+8.12\%}$
test_setitem_dim[slice_int] 0.1004ms 50.3718μs 19.8524 KOps/s 19.4817 KOps/s $\color{#35bf28}+1.90\%$
test_setitem_dim[range] 0.1297ms 75.7385μs 13.2033 KOps/s 12.8687 KOps/s $\color{#35bf28}+2.60\%$
test_setitem_dim[tuple] 76.7430μs 40.0203μs 24.9873 KOps/s 23.9678 KOps/s $\color{#35bf28}+4.25\%$
test_setitem 95.4470μs 20.9946μs 47.6312 KOps/s 46.7676 KOps/s $\color{#35bf28}+1.85\%$
test_set 93.0330μs 20.3484μs 49.1440 KOps/s 48.1604 KOps/s $\color{#35bf28}+2.04\%$
test_set_shared 4.7566ms 0.1841ms 5.4322 KOps/s 5.3877 KOps/s $\color{#35bf28}+0.83\%$
test_update 0.1518ms 23.3543μs 42.8187 KOps/s 41.2737 KOps/s $\color{#35bf28}+3.74\%$
test_update_nested 0.1111ms 34.2653μs 29.1841 KOps/s 28.7382 KOps/s $\color{#35bf28}+1.55\%$
test_update__nested 0.5138ms 33.6621μs 29.7070 KOps/s 29.7508 KOps/s $\color{#d91a1a}-0.15\%$
test_set_nested 0.1439ms 22.5571μs 44.3319 KOps/s 44.1027 KOps/s $\color{#35bf28}+0.52\%$
test_set_nested_new 0.1023ms 26.6810μs 37.4799 KOps/s 35.9929 KOps/s $\color{#35bf28}+4.13\%$
test_select 0.1101ms 42.1049μs 23.7502 KOps/s 22.3412 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_select_nested 0.1192ms 62.5018μs 15.9995 KOps/s 15.9605 KOps/s $\color{#35bf28}+0.24\%$
test_exclude_nested 0.1451ms 79.7787μs 12.5347 KOps/s 12.3764 KOps/s $\color{#35bf28}+1.28\%$
test_empty[True] 0.7430ms 0.3999ms 2.5004 KOps/s 2.4816 KOps/s $\color{#35bf28}+0.76\%$
test_empty[False] 22.2140μs 1.3567μs 737.1051 KOps/s 719.9135 KOps/s $\color{#35bf28}+2.39\%$
test_unbind_speed 0.3812ms 0.2715ms 3.6833 KOps/s 3.5002 KOps/s $\textbf{\color{#35bf28}+5.23\%}$
test_unbind_speed_stack0 0.4478ms 0.2648ms 3.7765 KOps/s 3.6231 KOps/s $\color{#35bf28}+4.23\%$
test_unbind_speed_stack1 0.1177s 0.7460ms 1.3405 KOps/s 1.3271 KOps/s $\color{#35bf28}+1.00\%$
test_split 0.1182s 1.7991ms 555.8222 Ops/s 551.8549 Ops/s $\color{#35bf28}+0.72\%$
test_chunk 0.1167s 1.7662ms 566.1730 Ops/s 547.5572 Ops/s $\color{#35bf28}+3.40\%$
test_consolidate_njt[False-None] 8.5246ms 7.9534ms 125.7317 Ops/s 119.6020 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_creation[device0] 0.2937ms 92.0728μs 10.8610 KOps/s 10.7947 KOps/s $\color{#35bf28}+0.61\%$
test_creation_from_tensor 3.8100ms 95.9512μs 10.4220 KOps/s 10.2688 KOps/s $\color{#35bf28}+1.49\%$
test_add_one[memmap_tensor0] 0.1087ms 5.2188μs 191.6149 KOps/s 194.1821 KOps/s $\color{#d91a1a}-1.32\%$
test_contiguous[memmap_tensor0] 12.5630μs 0.5042μs 1.9834 MOps/s 2.0139 MOps/s $\color{#d91a1a}-1.51\%$
test_stack[memmap_tensor0] 19.2460μs 3.4329μs 291.3028 KOps/s 284.3938 KOps/s $\color{#35bf28}+2.43\%$
test_memmaptd_index 1.4763ms 0.2296ms 4.3558 KOps/s 4.2514 KOps/s $\color{#35bf28}+2.46\%$
test_memmaptd_index_astensor 0.6724ms 0.3134ms 3.1912 KOps/s 3.1079 KOps/s $\color{#35bf28}+2.68\%$
test_memmaptd_index_op 1.0234ms 0.5981ms 1.6719 KOps/s 1.6108 KOps/s $\color{#35bf28}+3.79\%$
test_serialize_model 0.2338s 0.1374s 7.2762 Ops/s 7.6736 Ops/s $\textbf{\color{#d91a1a}-5.18\%}$
test_serialize_model_pickle 0.4495s 0.3961s 2.5249 Ops/s 2.5356 Ops/s $\color{#d91a1a}-0.42\%$
test_serialize_weights 0.1212s 0.1146s 8.7250 Ops/s 8.7044 Ops/s $\color{#35bf28}+0.24\%$
test_serialize_weights_returnearly 0.1714s 0.1589s 6.2919 Ops/s 6.2357 Ops/s $\color{#35bf28}+0.90\%$
test_serialize_weights_pickle 1.2488s 0.7093s 1.4098 Ops/s 2.5169 Ops/s $\textbf{\color{#d91a1a}-43.99\%}$
test_serialize_weights_filesystem 0.1554s 0.1414s 7.0744 Ops/s 6.1594 Ops/s $\textbf{\color{#35bf28}+14.86\%}$
test_serialize_model_filesystem 0.1477s 0.1425s 7.0179 Ops/s 6.5015 Ops/s $\textbf{\color{#35bf28}+7.94\%}$
test_reshape_pytree 57.1660μs 26.1035μs 38.3091 KOps/s 37.8446 KOps/s $\color{#35bf28}+1.23\%$
test_reshape_td 66.4440μs 32.4738μs 30.7940 KOps/s 30.4783 KOps/s $\color{#35bf28}+1.04\%$
test_view_pytree 79.4180μs 26.1308μs 38.2690 KOps/s 38.3334 KOps/s $\color{#d91a1a}-0.17\%$
test_view_td 86.0710μs 40.6592μs 24.5947 KOps/s 24.1626 KOps/s $\color{#35bf28}+1.79\%$
test_unbind_pytree 68.3480μs 28.8901μs 34.6139 KOps/s 33.6925 KOps/s $\color{#35bf28}+2.73\%$
test_unbind_td 0.3108ms 39.1480μs 25.5441 KOps/s 23.9361 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_split_pytree 66.6740μs 28.5756μs 34.9949 KOps/s 34.4071 KOps/s $\color{#35bf28}+1.71\%$
test_split_td 0.5000ms 44.8492μs 22.2969 KOps/s 21.5978 KOps/s $\color{#35bf28}+3.24\%$
test_add_pytree 76.2620μs 35.5719μs 28.1121 KOps/s 27.0514 KOps/s $\color{#35bf28}+3.92\%$
test_add_td 0.1266ms 57.6482μs 17.3466 KOps/s 16.7729 KOps/s $\color{#35bf28}+3.42\%$
test_compile_add_one_nested[tensordict-compile] 0.1146ms 66.3548μs 15.0705 KOps/s 15.0427 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_one_nested[tensordict-eager] 1.5399ms 0.1729ms 5.7822 KOps/s 5.8524 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_add_one_nested[pytree-compile] 0.1212ms 45.8812μs 21.7954 KOps/s 22.0147 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_one_nested[pytree-eager] 0.1830ms 0.1199ms 8.3396 KOps/s 8.3201 KOps/s $\color{#35bf28}+0.23\%$
test_compile_copy_nested[tensordict-compile] 69.0990μs 29.1794μs 34.2707 KOps/s 35.7930 KOps/s $\color{#d91a1a}-4.25\%$
test_compile_copy_nested[tensordict-eager] 0.1134ms 57.6474μs 17.3468 KOps/s 16.9820 KOps/s $\color{#35bf28}+2.15\%$
test_compile_copy_nested[pytree-compile] 0.1360ms 78.3757μs 12.7591 KOps/s 12.2604 KOps/s $\color{#35bf28}+4.07\%$
test_compile_copy_nested[pytree-eager] 0.1514ms 65.6367μs 15.2354 KOps/s 14.9302 KOps/s $\color{#35bf28}+2.04\%$
test_compile_add_one_flat[tensordict-compile] 0.1830ms 0.1081ms 9.2508 KOps/s 9.3163 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_add_one_flat[tensordict-eager] 0.2827ms 0.2159ms 4.6321 KOps/s 4.6412 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.1148ms 47.9894μs 20.8379 KOps/s 21.5197 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_add_one_flat[tensorclass-eager] 0.1379ms 68.8268μs 14.5292 KOps/s 15.0282 KOps/s $\color{#d91a1a}-3.32\%$
test_compile_add_one_flat[pytree-compile] 0.1741ms 0.1021ms 9.7960 KOps/s 10.0180 KOps/s $\color{#d91a1a}-2.22\%$
test_compile_add_one_flat[pytree-eager] 0.3641ms 0.2035ms 4.9138 KOps/s 4.8948 KOps/s $\color{#35bf28}+0.39\%$
test_compile_add_self_flat[tensordict-eager] 0.4243ms 0.2312ms 4.3244 KOps/s 4.3348 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_add_self_flat[tensordict-compile] 0.1904ms 0.1085ms 9.2141 KOps/s 9.2127 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_self_flat[tensorclass-eager] 0.1700ms 65.7076μs 15.2190 KOps/s 15.7934 KOps/s $\color{#d91a1a}-3.64\%$
test_compile_add_self_flat[tensorclass-compile] 0.1009ms 48.7201μs 20.5254 KOps/s 20.6954 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_add_self_flat[pytree-eager] 0.2948ms 0.1580ms 6.3286 KOps/s 6.2925 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_self_flat[pytree-compile] 0.1847ms 0.1032ms 9.6942 KOps/s 9.9714 KOps/s $\color{#d91a1a}-2.78\%$
test_compile_copy_flat[tensordict-compile] 52.4880μs 22.3563μs 44.7301 KOps/s 46.9171 KOps/s $\color{#d91a1a}-4.66\%$
test_compile_copy_flat[tensordict-eager] 0.1292ms 66.6908μs 14.9946 KOps/s 14.9486 KOps/s $\color{#35bf28}+0.31\%$
test_compile_copy_flat[pytree-compile] 0.1621ms 82.4780μs 12.1244 KOps/s 12.2005 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_copy_flat[pytree-eager] 0.1424ms 68.1572μs 14.6720 KOps/s 14.9660 KOps/s $\color{#d91a1a}-1.96\%$
test_compile_assign_and_add[tensordict-compile] 0.3389ms 0.2173ms 4.6024 KOps/s 4.6935 KOps/s $\color{#d91a1a}-1.94\%$
test_compile_assign_and_add[tensordict-eager] 1.6796ms 1.3869ms 721.0552 Ops/s 715.6496 Ops/s $\color{#35bf28}+0.76\%$
test_compile_assign_and_add[pytree-compile] 0.3320ms 0.2097ms 4.7687 KOps/s 4.7844 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_assign_and_add[pytree-eager] 0.8944ms 0.8278ms 1.2080 KOps/s 1.1866 KOps/s $\color{#35bf28}+1.80\%$
test_compile_assign_and_add_stack[compile] 0.5659ms 0.4601ms 2.1737 KOps/s 2.1908 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_assign_and_add_stack[eager] 5.9013ms 2.7844ms 359.1436 Ops/s 356.1348 Ops/s $\color{#35bf28}+0.84\%$
test_compile_indexing[tensor-tensordict-compile] 0.1029ms 39.5588μs 25.2788 KOps/s 26.1761 KOps/s $\color{#d91a1a}-3.43\%$
test_compile_indexing[tensor-tensordict-eager] 0.5805ms 33.4850μs 29.8641 KOps/s 28.9679 KOps/s $\color{#35bf28}+3.09\%$
test_compile_indexing[tensor-tensorclass-compile] 70.3810μs 31.2231μs 32.0275 KOps/s 32.1699 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[tensor-tensorclass-eager] 81.4120μs 22.9394μs 43.5931 KOps/s 43.6558 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_indexing[tensor-pytree-compile] 87.5940μs 32.6653μs 30.6135 KOps/s 31.7440 KOps/s $\color{#d91a1a}-3.56\%$
test_compile_indexing[tensor-pytree-eager] 60.4530μs 22.8923μs 43.6828 KOps/s 43.5594 KOps/s $\color{#35bf28}+0.28\%$
test_compile_indexing[slice-tensordict-compile] 0.1143ms 54.4876μs 18.3528 KOps/s 18.8602 KOps/s $\color{#d91a1a}-2.69\%$
test_compile_indexing[slice-tensordict-eager] 0.4794ms 19.8321μs 50.4233 KOps/s 47.4330 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_compile_indexing[slice-tensorclass-compile] 0.3071ms 46.7843μs 21.3747 KOps/s 21.6566 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_indexing[slice-tensorclass-eager] 86.0000μs 18.4432μs 54.2206 KOps/s 52.9776 KOps/s $\color{#35bf28}+2.35\%$
test_compile_indexing[slice-pytree-compile] 0.1178ms 47.3489μs 21.1198 KOps/s 21.0983 KOps/s $\color{#35bf28}+0.10\%$
test_compile_indexing[slice-pytree-eager] 81.2710μs 18.5560μs 53.8908 KOps/s 53.0825 KOps/s $\color{#35bf28}+1.52\%$
test_compile_indexing[int-tensordict-compile] 0.1234ms 55.0071μs 18.1795 KOps/s 18.2946 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_indexing[int-tensordict-eager] 0.8346ms 19.5004μs 51.2810 KOps/s 47.6374 KOps/s $\textbf{\color{#35bf28}+7.65\%}$
test_compile_indexing[int-tensorclass-compile] 91.4210μs 46.3826μs 21.5598 KOps/s 21.3290 KOps/s $\color{#35bf28}+1.08\%$
test_compile_indexing[int-tensorclass-eager] 0.2501ms 18.5885μs 53.7968 KOps/s 53.4589 KOps/s $\color{#35bf28}+0.63\%$
test_compile_indexing[int-pytree-compile] 0.1130ms 47.0299μs 21.2631 KOps/s 21.3329 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_indexing[int-pytree-eager] 52.2570μs 18.8541μs 53.0389 KOps/s 52.3639 KOps/s $\color{#35bf28}+1.29\%$
test_mod_add[eager] 0.2291ms 36.6435μs 27.2900 KOps/s 27.2470 KOps/s $\color{#35bf28}+0.16\%$
test_mod_add[compile] 0.1775ms 66.4681μs 15.0448 KOps/s 15.1585 KOps/s $\color{#d91a1a}-0.75\%$
test_mod_add[compile-overhead] 0.1194ms 65.1540μs 15.3483 KOps/s 15.4324 KOps/s $\color{#d91a1a}-0.55\%$
test_mod_wrap[eager] 0.3798ms 0.2248ms 4.4474 KOps/s 4.2733 KOps/s $\color{#35bf28}+4.08\%$
test_mod_wrap[compile] 1.9177ms 0.2318ms 4.3138 KOps/s 4.3156 KOps/s $\color{#d91a1a}-0.04\%$
test_mod_wrap[compile-overhead] 0.4157ms 0.2292ms 4.3628 KOps/s 4.3763 KOps/s $\color{#d91a1a}-0.31\%$
test_mod_wrap_and_backward[eager] 19.2336ms 14.0778ms 71.0340 Ops/s 75.9768 Ops/s $\textbf{\color{#d91a1a}-6.51\%}$
test_mod_wrap_and_backward[compile] 15.2489ms 11.8385ms 84.4699 Ops/s 85.7320 Ops/s $\color{#d91a1a}-1.47\%$
test_mod_wrap_and_backward[compile-overhead] 15.5252ms 11.9681ms 83.5553 Ops/s 86.7800 Ops/s $\color{#d91a1a}-3.72\%$
test_seq_add[eager] 0.2171ms 0.1168ms 8.5615 KOps/s 8.0925 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_seq_add[compile] 0.1418ms 79.4022μs 12.5941 KOps/s 12.6755 KOps/s $\color{#d91a1a}-0.64\%$
test_seq_add[compile-overhead] 0.1416ms 77.0098μs 12.9854 KOps/s 13.2641 KOps/s $\color{#d91a1a}-2.10\%$
test_seq_wrap[eager] 0.6130ms 0.4506ms 2.2192 KOps/s 2.1705 KOps/s $\color{#35bf28}+2.24\%$
test_seq_wrap[compile] 0.4527ms 0.2436ms 4.1056 KOps/s 4.0610 KOps/s $\color{#35bf28}+1.10\%$
test_seq_wrap[compile-overhead] 0.4525ms 0.2426ms 4.1224 KOps/s 4.0874 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_runtime[False-eager] 1.0044ms 0.5307ms 1.8845 KOps/s 1.7915 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_func_call_runtime[False-compile] 0.9337ms 0.4602ms 2.1728 KOps/s 2.2192 KOps/s $\color{#d91a1a}-2.09\%$
test_func_call_runtime[False-compile-overhead] 0.8251ms 0.4598ms 2.1751 KOps/s 2.2292 KOps/s $\color{#d91a1a}-2.43\%$
test_func_call_runtime[True-eager] 0.9423ms 0.7446ms 1.3431 KOps/s 1.3019 KOps/s $\color{#35bf28}+3.16\%$
test_func_call_runtime[True-compile] 0.6717ms 0.4783ms 2.0907 KOps/s 2.1087 KOps/s $\color{#d91a1a}-0.85\%$
test_func_call_runtime[True-compile-overhead] 0.8781ms 0.4824ms 2.0729 KOps/s 2.1001 KOps/s $\color{#d91a1a}-1.29\%$
test_func_call_cm_runtime[False-eager] 0.8341ms 0.5267ms 1.8986 KOps/s 1.8372 KOps/s $\color{#35bf28}+3.34\%$
test_func_call_cm_runtime[False-compile] 0.5903ms 0.4594ms 2.1768 KOps/s 2.2154 KOps/s $\color{#d91a1a}-1.74\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6116ms 0.4542ms 2.2016 KOps/s 2.2146 KOps/s $\color{#d91a1a}-0.59\%$
test_func_call_cm_runtime[True-eager] 1.0943ms 0.8919ms 1.1213 KOps/s 1.0957 KOps/s $\color{#35bf28}+2.33\%$
test_func_call_cm_runtime[True-compile] 1.2934ms 0.7978ms 1.2535 KOps/s 1.2238 KOps/s $\color{#35bf28}+2.42\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1482ms 0.8087ms 1.2366 KOps/s 1.2195 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_func_call_cm_runtime[eager] 2.7302ms 1.9641ms 509.1433 Ops/s 508.6258 Ops/s $\color{#35bf28}+0.10\%$
test_vmap_func_call_cm_runtime[compile] 0.8817ms 0.5459ms 1.8317 KOps/s 1.8303 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.0548ms 0.5337ms 1.8736 KOps/s 1.8312 KOps/s $\color{#35bf28}+2.32\%$
test_distributed 1.8376ms 0.1275ms 7.8425 KOps/s 7.6887 KOps/s $\color{#35bf28}+2.00\%$
test_tdmodule 48.7410μs 27.2435μs 36.7060 KOps/s 34.6641 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_tdmodule_dispatch 76.1920μs 50.6658μs 19.7372 KOps/s 19.4587 KOps/s $\color{#35bf28}+1.43\%$
test_tdseq 68.8480μs 28.7305μs 34.8063 KOps/s 31.1163 KOps/s $\textbf{\color{#35bf28}+11.86\%}$
test_tdseq_dispatch 0.1009ms 55.6459μs 17.9708 KOps/s 17.2826 KOps/s $\color{#35bf28}+3.98\%$
test_instantiation_functorch 1.8158ms 1.5294ms 653.8487 Ops/s 639.7495 Ops/s $\color{#35bf28}+2.20\%$
test_exec_functorch 0.3219ms 0.1785ms 5.6007 KOps/s 5.5059 KOps/s $\color{#35bf28}+1.72\%$
test_exec_functional_call 0.4284ms 0.1758ms 5.6899 KOps/s 5.5560 KOps/s $\color{#35bf28}+2.41\%$
test_exec_td_decorator 0.5529ms 0.2358ms 4.2410 KOps/s 4.2077 KOps/s $\color{#35bf28}+0.79\%$
test_vmap_mlp_speed_decorator[True-True] 0.8864ms 0.6622ms 1.5101 KOps/s 1.4624 KOps/s $\color{#35bf28}+3.26\%$
test_vmap_mlp_speed_decorator[True-False] 0.9552ms 0.6588ms 1.5179 KOps/s 1.4817 KOps/s $\color{#35bf28}+2.44\%$
test_vmap_mlp_speed_decorator[False-True] 0.7756ms 0.5341ms 1.8724 KOps/s 1.8520 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed_decorator[False-False] 0.8406ms 0.5331ms 1.8758 KOps/s 1.8448 KOps/s $\color{#35bf28}+1.68\%$
test_to_module_speed[True] 1.7920ms 1.3460ms 742.9190 Ops/s 748.9401 Ops/s $\color{#d91a1a}-0.80\%$
test_to_module_speed[False] 1.8161ms 1.3093ms 763.7872 Ops/s 762.5334 Ops/s $\color{#35bf28}+0.16\%$
test_tc_init 88.1240μs 47.0230μs 21.2662 KOps/s 21.5673 KOps/s $\color{#d91a1a}-1.40\%$
test_tc_init_nested 0.1776ms 92.7126μs 10.7860 KOps/s 10.8421 KOps/s $\color{#d91a1a}-0.52\%$
test_tc_first_layer_tensor 24.2360μs 1.5244μs 656.0125 KOps/s 657.4953 KOps/s $\color{#d91a1a}-0.23\%$
test_tc_first_layer_nontensor 37.2490μs 4.6631μs 214.4512 KOps/s 212.2381 KOps/s $\color{#35bf28}+1.04\%$
test_tc_second_layer_tensor 41.4980μs 2.8347μs 352.7658 KOps/s 351.7544 KOps/s $\color{#35bf28}+0.29\%$
test_tc_second_layer_nontensor 36.1280μs 5.9795μs 167.2385 KOps/s 167.5207 KOps/s $\color{#d91a1a}-0.17\%$
test_unbind 0.2496s 13.7236ms 72.8674 Ops/s 61.0796 Ops/s $\textbf{\color{#35bf28}+19.30\%}$
test_full_like 9.8533ms 8.9375ms 111.8884 Ops/s 110.4715 Ops/s $\color{#35bf28}+1.28\%$
test_zeros_like 4.5841ms 3.1326ms 319.2235 Ops/s 291.7101 Ops/s $\textbf{\color{#35bf28}+9.43\%}$
test_ones_like 4.7212ms 3.4424ms 290.4918 Ops/s 266.1826 Ops/s $\textbf{\color{#35bf28}+9.13\%}$
test_clone 6.7760ms 5.4144ms 184.6943 Ops/s 170.7657 Ops/s $\textbf{\color{#35bf28}+8.16\%}$
test_squeeze 70.6020μs 12.6189μs 79.2464 KOps/s 77.8174 KOps/s $\color{#35bf28}+1.84\%$
test_unsqueeze 0.1688ms 93.3263μs 10.7151 KOps/s 10.4098 KOps/s $\color{#35bf28}+2.93\%$
test_split 0.4892ms 0.1929ms 5.1832 KOps/s 5.0210 KOps/s $\color{#35bf28}+3.23\%$
test_permute 0.3776ms 0.2024ms 4.9417 KOps/s 4.8355 KOps/s $\color{#35bf28}+2.20\%$
test_stack 34.2332ms 25.8088ms 38.7465 Ops/s 37.7087 Ops/s $\color{#35bf28}+2.75\%$
test_cat 33.7938ms 25.8652ms 38.6620 Ops/s 38.1060 Ops/s $\color{#35bf28}+1.46\%$

            raise KeyError(
                f"got keys {keys} and {set(td.keys())} which are incompatible"
            )
    return keys
    if strict:
        return keys
Contributor Author

We should actually make it a list
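
A minimal sketch of why a list helps here, assuming the keys are plain strings. Python randomizes string hashes per process unless `PYTHONHASHSEED` is fixed, so iterating a set of string keys can give a different order from one run to the next, while a list keeps insertion order. The key names and the helper `key_order_in_subprocess` are hypothetical and not part of tensordict.

```python
import json
import os
import subprocess
import sys

KEYS = ["obs", "action", "reward", "done", "next"]

# Child program: emit the keys once through a set and once through a list.
CHILD = (
    "import json, sys\n"
    "keys = json.loads(sys.argv[1])\n"
    "print(json.dumps({'set': list(set(keys)), 'list': list(keys)}))\n"
)


def key_order_in_subprocess(seed: int) -> dict:
    """Run the child with a fixed hash seed and return both orderings."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", CHILD, json.dumps(KEYS)],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(out.stdout)


results = [key_order_in_subprocess(seed) for seed in range(1, 6)]
set_orders = {tuple(r["set"]) for r in results}
list_orders = {tuple(r["list"]) for r in results}

# The list order is identical in every run; the set order may vary with the seed.
print(f"distinct set orders:  {len(set_orders)}")
print(f"distinct list orders: {len(list_orders)}")
```

Each seed stands in for a separate interpreter launch, which is where a difference in key order between runs would show up.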

    return keys
    if strict:
        return keys
    return keys_set
Contributor Author

If keys can be exclusive, their order becomes arbitrary

Contributor

Out of curiosity, which downstream functions would be impacted by this? In other words, in which context is _check_keys(strict=False) used?

Contributor Author

Yes, when using lazy stacks, IIRC.
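
For readers following this thread, here is a rough plain-Python sketch of the strict / non-strict distinction discussed above. It is not tensordict's `_check_keys`. The function `check_keys_sketch` and the sample dicts are hypothetical, and it works on ordinary mappings rather than tensordicts. In the diff quoted in this thread, the non-strict branch returns `keys_set`. The sketch instead filters the first mapping's ordered keys by set membership, which is one possible way to keep a stable order when the key sets differ.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def check_keys_sketch(
    dicts: Sequence[Mapping[str, object]], strict: bool = False
) -> list[str]:
    """Return the keys shared by all mappings, in the first mapping's order.

    Hypothetical stand-in for illustration; not tensordict's _check_keys.
    """
    if not dicts:
        return []
    first_keys = list(dicts[0].keys())
    common = set(first_keys)
    for d in dicts[1:]:
        other = set(d.keys())
        if strict:
            # Strict mode: every mapping must expose exactly the same keys.
            if other != common:
                raise KeyError(
                    f"got keys {first_keys} and {sorted(other)} which are incompatible"
                )
        else:
            # Non-strict mode: keep only the keys present everywhere.
            common &= other
    # Filter the ordered list by membership instead of iterating the set,
    # so the result does not depend on hash ordering.
    return [k for k in first_keys if k in common]


a = {"obs": 1, "action": 2, "reward": 3}
b = {"reward": 0, "obs": 0, "extra": 0}

print(check_keys_sketch([a, b]))               # ['obs', 'reward']
print(check_keys_sketch([a, a], strict=True))  # ['obs', 'action', 'reward']
```

The set is still used for membership tests, so lookups stay cheap; only the iteration order comes from the list.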


github-actions bot commented Feb 22, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 28.2000μs 13.0047μs 76.8955 KOps/s 79.2964 KOps/s $\color{#d91a1a}-3.03\%$
test_plain_set_stack_nested 43.7810μs 13.1528μs 76.0296 KOps/s 78.1436 KOps/s $\color{#d91a1a}-2.71\%$
test_plain_set_nested_inplace 0.4074ms 14.2059μs 70.3932 KOps/s 72.7399 KOps/s $\color{#d91a1a}-3.23\%$
test_plain_set_stack_nested_inplace 0.3997ms 14.0612μs 71.1177 KOps/s 72.8358 KOps/s $\color{#d91a1a}-2.36\%$
test_items 28.7600μs 2.9000μs 344.8305 KOps/s 340.4051 KOps/s $\color{#35bf28}+1.30\%$
test_items_nested 0.7489ms 0.3676ms 2.7201 KOps/s 2.7614 KOps/s $\color{#d91a1a}-1.50\%$
test_items_nested_locked 0.7446ms 0.3634ms 2.7518 KOps/s 2.7564 KOps/s $\color{#d91a1a}-0.17\%$
test_items_nested_leaf 0.4461ms 60.4382μs 16.5458 KOps/s 16.5079 KOps/s $\color{#35bf28}+0.23\%$
test_items_stack_nested 0.7494ms 0.3619ms 2.7635 KOps/s 2.7834 KOps/s $\color{#d91a1a}-0.71\%$
test_items_stack_nested_leaf 94.4420μs 60.5620μs 16.5120 KOps/s 16.4225 KOps/s $\color{#35bf28}+0.54\%$
test_items_stack_nested_locked 0.7610ms 0.3676ms 2.7205 KOps/s 2.7492 KOps/s $\color{#d91a1a}-1.04\%$
test_keys 0.3879ms 3.4944μs 286.1681 KOps/s 291.5206 KOps/s $\color{#d91a1a}-1.84\%$
test_keys_nested 0.4714ms 88.6287μs 11.2830 KOps/s 11.4948 KOps/s $\color{#d91a1a}-1.84\%$
test_keys_nested_locked 0.8432ms 94.2892μs 10.6057 KOps/s 10.8126 KOps/s $\color{#d91a1a}-1.91\%$
test_keys_nested_leaf 0.1046ms 79.4801μs 12.5818 KOps/s 12.8115 KOps/s $\color{#d91a1a}-1.79\%$
test_keys_stack_nested 0.4741ms 87.7444μs 11.3967 KOps/s 11.5249 KOps/s $\color{#d91a1a}-1.11\%$
test_keys_stack_nested_leaf 0.4623ms 79.2761μs 12.6141 KOps/s 12.8244 KOps/s $\color{#d91a1a}-1.64\%$
test_keys_stack_nested_locked 0.4885ms 93.5064μs 10.6945 KOps/s 10.8617 KOps/s $\color{#d91a1a}-1.54\%$
test_values 64.1110μs 0.8549μs 1.1698 MOps/s 1.1566 MOps/s $\color{#35bf28}+1.14\%$
test_values_nested 0.4281ms 37.1446μs 26.9218 KOps/s 27.3140 KOps/s $\color{#d91a1a}-1.44\%$
test_values_nested_locked 0.4279ms 39.1748μs 25.5266 KOps/s 25.9346 KOps/s $\color{#d91a1a}-1.57\%$
test_values_nested_leaf 0.4304ms 42.4965μs 23.5313 KOps/s 23.7715 KOps/s $\color{#d91a1a}-1.01\%$
test_values_stack_nested 0.4228ms 37.6970μs 26.5273 KOps/s 27.1806 KOps/s $\color{#d91a1a}-2.40\%$
test_values_stack_nested_leaf 71.4210μs 42.4535μs 23.5552 KOps/s 23.6614 KOps/s $\color{#d91a1a}-0.45\%$
test_values_stack_nested_locked 0.4422ms 39.1398μs 25.5494 KOps/s 25.7724 KOps/s $\color{#d91a1a}-0.87\%$
test_membership 19.9839μs 0.5013μs 1.9949 MOps/s 2.0157 MOps/s $\color{#d91a1a}-1.03\%$
test_membership_nested 0.2004ms 2.0075μs 498.1239 KOps/s 496.5029 KOps/s $\color{#35bf28}+0.33\%$
test_membership_nested_leaf 0.1977ms 2.0518μs 487.3661 KOps/s 498.6331 KOps/s $\color{#d91a1a}-2.26\%$
test_membership_stacked_nested 26.4400μs 2.1237μs 470.8866 KOps/s 482.1684 KOps/s $\color{#d91a1a}-2.34\%$
test_membership_stacked_nested_leaf 0.3928ms 2.1009μs 475.9909 KOps/s 481.1493 KOps/s $\color{#d91a1a}-1.07\%$
test_membership_nested_last 32.9910μs 3.1255μs 319.9494 KOps/s 325.0588 KOps/s $\color{#d91a1a}-1.57\%$
test_membership_nested_leaf_last 0.4245ms 3.1198μs 320.5371 KOps/s 320.2674 KOps/s $\color{#35bf28}+0.08\%$
test_membership_stacked_nested_last 30.0410μs 3.0979μs 322.8004 KOps/s 328.1469 KOps/s $\color{#d91a1a}-1.63\%$
test_membership_stacked_nested_leaf_last 27.2400μs 3.0960μs 322.9954 KOps/s 327.4948 KOps/s $\color{#d91a1a}-1.37\%$
test_nested_getleaf 0.4161ms 6.2310μs 160.4870 KOps/s 161.6806 KOps/s $\color{#d91a1a}-0.74\%$
test_nested_get 0.3996ms 5.9221μs 168.8582 KOps/s 166.0395 KOps/s $\color{#35bf28}+1.70\%$
test_stacked_getleaf 28.4000μs 6.1399μs 162.8691 KOps/s 162.6176 KOps/s $\color{#35bf28}+0.15\%$
test_stacked_get 0.4027ms 5.8258μs 171.6497 KOps/s 173.0998 KOps/s $\color{#d91a1a}-0.84\%$
test_nested_getitemleaf 35.7600μs 6.3632μs 157.1541 KOps/s 154.5819 KOps/s $\color{#35bf28}+1.66\%$
test_nested_getitem 0.3916ms 6.0739μs 164.6386 KOps/s 164.2798 KOps/s $\color{#35bf28}+0.22\%$
test_stacked_getitemleaf 43.1610μs 6.3967μs 156.3312 KOps/s 155.9057 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_getitem 0.3957ms 5.9702μs 167.4975 KOps/s 166.2375 KOps/s $\color{#35bf28}+0.76\%$
test_lock_nested 9.6300ms 0.3446ms 2.9020 KOps/s 2.8738 KOps/s $\color{#35bf28}+0.98\%$
test_lock_stack_nested 0.4041ms 0.3382ms 2.9566 KOps/s 2.9005 KOps/s $\color{#35bf28}+1.93\%$
test_unlock_nested 0.3892ms 0.2815ms 3.5525 KOps/s 3.5642 KOps/s $\color{#d91a1a}-0.33\%$
test_unlock_stack_nested 0.3187ms 0.2791ms 3.5824 KOps/s 3.5353 KOps/s $\color{#35bf28}+1.33\%$
test_flatten_speed 0.1080ms 77.9501μs 12.8287 KOps/s 12.9273 KOps/s $\color{#d91a1a}-0.76\%$
test_unflatten_speed 0.7034ms 0.3199ms 3.1264 KOps/s 3.1381 KOps/s $\color{#d91a1a}-0.37\%$
test_common_ops 0.7644ms 0.6266ms 1.5960 KOps/s 1.6326 KOps/s $\color{#d91a1a}-2.24\%$
test_creation 0.1402ms 1.7515μs 570.9505 KOps/s 576.4118 KOps/s $\color{#d91a1a}-0.95\%$
test_creation_empty 0.3981ms 9.5693μs 104.5013 KOps/s 112.0848 KOps/s $\textbf{\color{#d91a1a}-6.77\%}$
test_creation_nested_1 37.8410μs 11.3535μs 88.0789 KOps/s 94.6292 KOps/s $\textbf{\color{#d91a1a}-6.92\%}$
test_creation_nested_2 47.9000μs 13.9463μs 71.7034 KOps/s 74.6022 KOps/s $\color{#d91a1a}-3.89\%$
test_clone 0.4034ms 10.5967μs 94.3693 KOps/s 90.8953 KOps/s $\color{#35bf28}+3.82\%$
test_getitem[int] 1.3544ms 10.5696μs 94.6105 KOps/s 93.3829 KOps/s $\color{#35bf28}+1.31\%$
test_getitem[slice_int] 0.1112ms 20.2142μs 49.4701 KOps/s 47.6388 KOps/s $\color{#35bf28}+3.84\%$
test_getitem[range] 0.1522ms 36.6548μs 27.2816 KOps/s 26.7366 KOps/s $\color{#35bf28}+2.04\%$
test_getitem[tuple] 0.4192ms 17.8721μs 55.9531 KOps/s 54.8944 KOps/s $\color{#35bf28}+1.93\%$
test_getitem[list] 0.1354ms 31.9175μs 31.3307 KOps/s 29.6886 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_setitem_dim[int] 39.3300μs 18.5717μs 53.8454 KOps/s 50.9990 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_setitem_dim[slice_int] 70.9410μs 37.9418μs 26.3561 KOps/s 25.8934 KOps/s $\color{#35bf28}+1.79\%$
test_setitem_dim[range] 78.2010μs 51.7807μs 19.3122 KOps/s 18.8563 KOps/s $\color{#35bf28}+2.42\%$
test_setitem_dim[tuple] 52.4510μs 31.9635μs 31.2857 KOps/s 31.4109 KOps/s $\color{#d91a1a}-0.40\%$
test_setitem 45.9210μs 15.6339μs 63.9634 KOps/s 63.2270 KOps/s $\color{#35bf28}+1.16\%$
test_set 61.8210μs 15.3076μs 65.3271 KOps/s 66.6374 KOps/s $\color{#d91a1a}-1.97\%$
test_set_shared 0.5855ms 0.1573ms 6.3571 KOps/s 6.3795 KOps/s $\color{#d91a1a}-0.35\%$
test_update 0.4163ms 19.1607μs 52.1901 KOps/s 55.1358 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_update_nested 60.1110μs 24.3641μs 41.0440 KOps/s 41.5892 KOps/s $\color{#d91a1a}-1.31\%$
test_update__nested 0.5553ms 25.1373μs 39.7815 KOps/s 39.3940 KOps/s $\color{#35bf28}+0.98\%$
test_set_nested 58.0210μs 16.9137μs 59.1237 KOps/s 59.6240 KOps/s $\color{#d91a1a}-0.84\%$
test_set_nested_new 55.2410μs 19.0586μs 52.4699 KOps/s 53.8588 KOps/s $\color{#d91a1a}-2.58\%$
test_select 0.4499ms 30.8582μs 32.4063 KOps/s 33.4355 KOps/s $\color{#d91a1a}-3.08\%$
test_select_nested 0.4604ms 43.6235μs 22.9234 KOps/s 22.8678 KOps/s $\color{#35bf28}+0.24\%$
test_exclude_nested 0.1034ms 63.0002μs 15.8730 KOps/s 15.8760 KOps/s $\color{#d91a1a}-0.02\%$
test_empty[True] 0.6913ms 0.2974ms 3.3623 KOps/s 3.4229 KOps/s $\color{#d91a1a}-1.77\%$
test_empty[False] 39.2237μs 0.8240μs 1.2135 MOps/s 1.2004 MOps/s $\color{#35bf28}+1.10\%$
test_to 92.7610μs 55.0285μs 18.1724 KOps/s 18.2858 KOps/s $\color{#d91a1a}-0.62\%$
test_to_nonblocking 0.4475ms 46.7800μs 21.3767 KOps/s 20.8156 KOps/s $\color{#35bf28}+2.70\%$
test_unbind_speed 0.2737ms 0.2395ms 4.1752 KOps/s 4.2048 KOps/s $\color{#d91a1a}-0.70\%$
test_unbind_speed_stack0 0.6293ms 0.2342ms 4.2701 KOps/s 4.1388 KOps/s $\color{#35bf28}+3.17\%$
test_unbind_speed_stack1 95.4707ms 0.7374ms 1.3561 KOps/s 1.3403 KOps/s $\color{#35bf28}+1.18\%$
test_split 96.7774ms 1.5768ms 634.1976 Ops/s 627.9906 Ops/s $\color{#35bf28}+0.99\%$
test_chunk 0.1017s 1.5819ms 632.1510 Ops/s 622.4277 Ops/s $\color{#35bf28}+1.56\%$
test_consolidate[False-None] 0.1013s 2.9856ms 334.9453 Ops/s 337.4001 Ops/s $\color{#d91a1a}-0.73\%$
test_consolidate[default-None] 2.1002ms 1.6850ms 593.4672 Ops/s 599.3801 Ops/s $\color{#d91a1a}-0.99\%$
test_consolidate[reduce-overhead-None] 1.9010ms 1.7115ms 584.2972 Ops/s 581.4107 Ops/s $\color{#35bf28}+0.50\%$
test_consolidate_njt[False-None] 6.9941ms 6.6016ms 151.4778 Ops/s 155.8323 Ops/s $\color{#d91a1a}-2.79\%$
test_to[False-False-None] 1.8178ms 1.6766ms 596.4608 Ops/s 592.2260 Ops/s $\color{#35bf28}+0.72\%$
test_to[True-False-None] 1.6539ms 1.3043ms 766.6979 Ops/s 770.2144 Ops/s $\color{#d91a1a}-0.46\%$
test_to[within-False-None] 4.2283ms 4.0796ms 245.1236 Ops/s 244.9105 Ops/s $\color{#35bf28}+0.09\%$
test_to[True-default-None] 5.4602ms 5.0966ms 196.2086 Ops/s 196.9032 Ops/s $\color{#d91a1a}-0.35\%$
test_to_njt[False-False-None] 7.1291ms 6.8079ms 146.8889 Ops/s 144.9245 Ops/s $\color{#35bf28}+1.36\%$
test_to_njt[True-False-None] 5.8936ms 5.5129ms 181.3927 Ops/s 185.2851 Ops/s $\color{#d91a1a}-2.10\%$
test_to_njt[within-False-None] 12.8468ms 12.2341ms 81.7390 Ops/s 84.5975 Ops/s $\color{#d91a1a}-3.38\%$
test_creation[device0] 0.6443ms 78.8273μs 12.6860 KOps/s 12.3755 KOps/s $\color{#35bf28}+2.51\%$
test_creation_from_tensor 0.4458ms 85.5727μs 11.6860 KOps/s 11.7580 KOps/s $\color{#d91a1a}-0.61\%$
test_add_one[memmap_tensor0] 0.5303ms 6.6008μs 151.4957 KOps/s 147.6743 KOps/s $\color{#35bf28}+2.59\%$
test_contiguous[memmap_tensor0] 20.2833μs 0.4036μs 2.4777 MOps/s 2.4653 MOps/s $\color{#35bf28}+0.50\%$
test_stack[memmap_tensor0] 27.7510μs 4.2606μs 234.7080 KOps/s 229.3484 KOps/s $\color{#35bf28}+2.34\%$
test_memmaptd_index 1.8043ms 0.2380ms 4.2015 KOps/s 3.9429 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_memmaptd_index_astensor 0.6987ms 0.2985ms 3.3500 KOps/s 3.2723 KOps/s $\color{#35bf28}+2.37\%$
test_memmaptd_index_op 0.9629ms 0.5896ms 1.6959 KOps/s 1.7074 KOps/s $\color{#d91a1a}-0.67\%$
test_serialize_model 0.1309s 0.1296s 7.7143 Ops/s 7.7120 Ops/s $\color{#35bf28}+0.03\%$
test_serialize_model_pickle 1.3483s 1.1896s 0.8406 Ops/s 0.8245 Ops/s $\color{#35bf28}+1.96\%$
test_serialize_weights 0.1303s 0.1291s 7.7444 Ops/s 7.7026 Ops/s $\color{#35bf28}+0.54\%$
test_serialize_weights_returnearly 0.3447s 61.6574ms 16.2186 Ops/s 11.4226 Ops/s $\textbf{\color{#35bf28}+41.99\%}$
test_serialize_weights_pickle 1.3779s 1.1903s 0.8401 Ops/s 0.8235 Ops/s $\color{#35bf28}+2.02\%$
test_reshape_pytree 49.3510μs 21.8449μs 45.7773 KOps/s 44.9844 KOps/s $\color{#35bf28}+1.76\%$
test_reshape_td 60.4410μs 25.7586μs 38.8220 KOps/s 36.4430 KOps/s $\textbf{\color{#35bf28}+6.53\%}$
test_view_pytree 52.0210μs 21.6275μs 46.2374 KOps/s 46.3249 KOps/s $\color{#d91a1a}-0.19\%$
test_view_td 70.5410μs 31.1466μs 32.1062 KOps/s 30.0345 KOps/s $\textbf{\color{#35bf28}+6.90\%}$
test_unbind_pytree 0.1261ms 27.7633μs 36.0188 KOps/s 35.3364 KOps/s $\color{#35bf28}+1.93\%$
test_unbind_td 0.6684ms 35.5819μs 28.1042 KOps/s 27.2754 KOps/s $\color{#35bf28}+3.04\%$
test_split_pytree 57.3010μs 28.9664μs 34.5228 KOps/s 33.9438 KOps/s $\color{#35bf28}+1.71\%$
test_split_td 0.8013ms 37.7831μs 26.4668 KOps/s 25.4574 KOps/s $\color{#35bf28}+3.97\%$
test_add_pytree 68.5210μs 33.1755μs 30.1428 KOps/s 28.1791 KOps/s $\textbf{\color{#35bf28}+6.97\%}$
test_add_td 95.0720μs 50.4831μs 19.8086 KOps/s 19.1080 KOps/s $\color{#35bf28}+3.67\%$
test_compile_add_one_nested[tensordict-compile] 0.1878ms 0.1201ms 8.3263 KOps/s 7.8939 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_compile_add_one_nested[tensordict-eager] 0.2248ms 0.1298ms 7.7021 KOps/s 7.5327 KOps/s $\color{#35bf28}+2.25\%$
test_compile_add_one_nested[pytree-compile] 0.1406ms 93.3130μs 10.7166 KOps/s 10.5601 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_nested[pytree-eager] 1.5993ms 0.1475ms 6.7804 KOps/s 6.7753 KOps/s $\color{#35bf28}+0.08\%$
test_compile_copy_nested[tensordict-compile] 62.6210μs 29.2442μs 34.1948 KOps/s 43.4980 KOps/s $\textbf{\color{#d91a1a}-21.39\%}$
test_compile_copy_nested[tensordict-eager] 61.8810μs 29.5111μs 33.8856 KOps/s 34.4072 KOps/s $\color{#d91a1a}-1.52\%$
test_compile_copy_nested[pytree-compile] 0.4782ms 63.6880μs 15.7016 KOps/s 15.5587 KOps/s $\color{#35bf28}+0.92\%$
test_compile_copy_nested[pytree-eager] 78.0620μs 48.6781μs 20.5431 KOps/s 20.4686 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_flat[tensordict-compile] 0.1836ms 0.1416ms 7.0620 KOps/s 7.1414 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_add_one_flat[tensordict-eager] 0.6036ms 0.2168ms 4.6125 KOps/s 4.6956 KOps/s $\color{#d91a1a}-1.77\%$
test_compile_add_one_flat[tensorclass-compile] 0.1491ms 96.7590μs 10.3350 KOps/s 10.3862 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_one_flat[tensorclass-eager] 0.4452ms 56.2862μs 17.7664 KOps/s 18.2957 KOps/s $\color{#d91a1a}-2.89\%$
test_compile_add_one_flat[pytree-compile] 0.2548ms 0.1370ms 7.3000 KOps/s 7.3100 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[pytree-eager] 0.8617ms 0.4643ms 2.1537 KOps/s 2.1618 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_add_self_flat[tensordict-eager] 0.6602ms 0.2624ms 3.8111 KOps/s 3.8917 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_add_self_flat[tensordict-compile] 0.1847ms 0.1417ms 7.0574 KOps/s 6.9922 KOps/s $\color{#35bf28}+0.93\%$
test_compile_add_self_flat[tensorclass-eager] 0.1707ms 69.1642μs 14.4583 KOps/s 14.8332 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_add_self_flat[tensorclass-compile] 0.1370ms 97.2141μs 10.2866 KOps/s 10.3624 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_add_self_flat[pytree-eager] 0.5185ms 0.3934ms 2.5417 KOps/s 2.5829 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_self_flat[pytree-compile] 0.1734ms 0.1320ms 7.5737 KOps/s 7.4496 KOps/s $\color{#35bf28}+1.66\%$
test_compile_copy_flat[tensordict-compile] 0.4021ms 18.1425μs 55.1192 KOps/s 56.0682 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_copy_flat[tensordict-eager] 0.4111ms 30.8108μs 32.4562 KOps/s 32.1784 KOps/s $\color{#35bf28}+0.86\%$
test_compile_copy_flat[pytree-compile] 0.1129ms 70.3218μs 14.2203 KOps/s 14.4159 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_copy_flat[pytree-eager] 0.4308ms 53.0964μs 18.8337 KOps/s 19.2983 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_assign_and_add[tensordict-compile] 1.6011ms 0.3858ms 2.5918 KOps/s 2.2764 KOps/s $\textbf{\color{#35bf28}+13.86\%}$
test_compile_assign_and_add[tensordict-eager] 2.5927ms 2.5087ms 398.6159 Ops/s 401.6857 Ops/s $\color{#d91a1a}-0.76\%$
test_compile_assign_and_add[pytree-compile] 1.5623ms 0.4243ms 2.3569 KOps/s 2.1822 KOps/s $\textbf{\color{#35bf28}+8.00\%}$
test_compile_assign_and_add[pytree-eager] 2.7999ms 2.5372ms 394.1366 Ops/s 385.3130 Ops/s $\color{#35bf28}+2.29\%$
test_compile_indexing[tensor-tensordict-compile] 0.6384ms 0.1162ms 8.6032 KOps/s 8.3516 KOps/s $\color{#35bf28}+3.01\%$
test_compile_indexing[tensor-tensordict-eager] 0.5793ms 77.5716μs 12.8913 KOps/s 11.8898 KOps/s $\textbf{\color{#35bf28}+8.42\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.4062ms 0.1053ms 9.4944 KOps/s 8.9732 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1605ms 69.6714μs 14.3531 KOps/s 14.4950 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[tensor-pytree-compile] 0.1683ms 0.1127ms 8.8706 KOps/s 8.9259 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_indexing[tensor-pytree-eager] 0.1461ms 70.3001μs 14.2247 KOps/s 14.1781 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[slice-tensordict-compile] 0.1436ms 0.1036ms 9.6528 KOps/s 10.1395 KOps/s $\color{#d91a1a}-4.80\%$
test_compile_indexing[slice-tensordict-eager] 0.1569ms 18.7461μs 53.3446 KOps/s 55.8395 KOps/s $\color{#d91a1a}-4.47\%$
test_compile_indexing[slice-tensorclass-compile] 0.1509ms 0.1006ms 9.9392 KOps/s 10.4851 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_compile_indexing[slice-tensorclass-eager] 0.1643ms 15.8990μs 62.8972 KOps/s 63.1123 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_indexing[slice-pytree-compile] 0.1719ms 0.1009ms 9.9093 KOps/s 10.1826 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_indexing[slice-pytree-eager] 55.1910μs 15.8387μs 63.1363 KOps/s 64.2301 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_indexing[int-tensordict-compile] 0.1418ms 99.6458μs 10.0355 KOps/s 9.9196 KOps/s $\color{#35bf28}+1.17\%$
test_compile_indexing[int-tensordict-eager] 0.5829ms 16.8287μs 59.4222 KOps/s 58.6032 KOps/s $\color{#35bf28}+1.40\%$
test_compile_indexing[int-tensorclass-compile] 0.2265ms 0.1010ms 9.9044 KOps/s 10.1046 KOps/s $\color{#d91a1a}-1.98\%$
test_compile_indexing[int-tensorclass-eager] 55.2710μs 15.7240μs 63.5971 KOps/s 63.7605 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_indexing[int-pytree-compile] 0.1565ms 0.1004ms 9.9595 KOps/s 10.3886 KOps/s $\color{#d91a1a}-4.13\%$
test_compile_indexing[int-pytree-eager] 76.4510μs 15.6748μs 63.7966 KOps/s 63.1634 KOps/s $\color{#35bf28}+1.00\%$
test_mod_add[eager] 77.2610μs 38.8933μs 25.7114 KOps/s 26.1189 KOps/s $\color{#d91a1a}-1.56\%$
test_mod_add[compile] 0.2232ms 81.1361μs 12.3250 KOps/s 12.5918 KOps/s $\color{#d91a1a}-2.12\%$
test_mod_add[compile-overhead] 0.3395ms 0.1687ms 5.9291 KOps/s 5.5642 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_mod_wrap[eager] 0.3994ms 0.2486ms 4.0220 KOps/s 3.7653 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_mod_wrap[compile] 0.3715ms 0.2894ms 3.4558 KOps/s 3.5342 KOps/s $\color{#d91a1a}-2.22\%$
test_mod_wrap[compile-overhead] 7.4233ms 3.9453ms 253.4653 Ops/s 277.5998 Ops/s $\textbf{\color{#d91a1a}-8.69\%}$
test_mod_wrap_and_backward[eager] 1.9184ms 1.3798ms 724.7348 Ops/s 702.0287 Ops/s $\color{#35bf28}+3.23\%$
test_mod_wrap_and_backward[compile] 1.4392ms 1.3414ms 745.4941 Ops/s 727.5782 Ops/s $\color{#35bf28}+2.46\%$
test_mod_wrap_and_backward[compile-overhead] 1.5124ms 1.0169ms 983.4290 Ops/s 932.2187 Ops/s $\textbf{\color{#35bf28}+5.49\%}$
test_seq_add[eager] 0.1829ms 0.1167ms 8.5662 KOps/s 8.4213 KOps/s $\color{#35bf28}+1.72\%$
test_seq_add[compile] 0.1488ms 88.9370μs 11.2439 KOps/s 11.4798 KOps/s $\color{#d91a1a}-2.05\%$
test_seq_add[compile-overhead] 0.2410ms 0.1286ms 7.7750 KOps/s 7.8013 KOps/s $\color{#d91a1a}-0.34\%$
test_seq_wrap[eager] 0.5072ms 0.4375ms 2.2856 KOps/s 2.3320 KOps/s $\color{#d91a1a}-1.99\%$
test_seq_wrap[compile] 0.3723ms 0.2965ms 3.3727 KOps/s 3.3146 KOps/s $\color{#35bf28}+1.75\%$
test_seq_wrap[compile-overhead] 0.2996ms 0.2229ms 4.4867 KOps/s 4.4612 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_runtime[False-eager] 0.7958ms 0.7303ms 1.3693 KOps/s 1.3595 KOps/s $\color{#35bf28}+0.71\%$
test_func_call_runtime[False-compile] 0.9881ms 0.7312ms 1.3676 KOps/s 1.3602 KOps/s $\color{#35bf28}+0.55\%$
test_func_call_runtime[False-compile-overhead] 0.4073ms 0.3610ms 2.7698 KOps/s 2.7772 KOps/s $\color{#d91a1a}-0.27\%$
test_func_call_runtime[True-eager] 0.9542ms 0.8886ms 1.1254 KOps/s 1.1106 KOps/s $\color{#35bf28}+1.34\%$
test_func_call_runtime[True-compile] 0.8265ms 0.7570ms 1.3210 KOps/s 1.3164 KOps/s $\color{#35bf28}+0.34\%$
test_func_call_runtime[True-compile-overhead] 0.4403ms 0.3834ms 2.6082 KOps/s 2.6369 KOps/s $\color{#d91a1a}-1.09\%$
test_func_call_cm_runtime[False-eager] 0.7741ms 0.7192ms 1.3904 KOps/s 1.3669 KOps/s $\color{#35bf28}+1.73\%$
test_func_call_cm_runtime[False-compile] 0.7928ms 0.7341ms 1.3622 KOps/s 1.3568 KOps/s $\color{#35bf28}+0.40\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4516ms 0.3618ms 2.7640 KOps/s 2.7667 KOps/s $\color{#d91a1a}-0.10\%$
test_func_call_cm_runtime[True-eager] 1.1178ms 0.9944ms 1.0056 KOps/s 993.4817 Ops/s $\color{#35bf28}+1.22\%$
test_func_call_cm_runtime[True-compile] 1.0772ms 0.9744ms 1.0263 KOps/s 973.4933 Ops/s $\textbf{\color{#35bf28}+5.42\%}$
test_func_call_cm_runtime[True-compile-overhead] 1.1730ms 0.9863ms 1.0138 KOps/s 1.0024 KOps/s $\color{#35bf28}+1.14\%$
test_vmap_func_call_cm_runtime[eager] 2.4946ms 2.0863ms 479.3280 Ops/s 481.5663 Ops/s $\color{#d91a1a}-0.46\%$
test_vmap_func_call_cm_runtime[compile] 0.9932ms 0.8356ms 1.1967 KOps/s 1.2571 KOps/s $\color{#d91a1a}-4.81\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4721ms 0.4139ms 2.4163 KOps/s 2.4258 KOps/s $\color{#d91a1a}-0.39\%$
test_distributed 2.8847ms 0.2818ms 3.5485 KOps/s 8.6967 KOps/s $\textbf{\color{#d91a1a}-59.20\%}$
test_tdmodule 37.3100μs 22.4014μs 44.6400 KOps/s 47.1617 KOps/s $\textbf{\color{#d91a1a}-5.35\%}$
test_tdmodule_dispatch 0.2322ms 39.8361μs 25.1029 KOps/s 26.7031 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_tdseq 42.4010μs 22.3333μs 44.7762 KOps/s 46.8204 KOps/s $\color{#d91a1a}-4.37\%$
test_tdseq_dispatch 66.6410μs 41.5287μs 24.0797 KOps/s 25.0002 KOps/s $\color{#d91a1a}-3.68\%$
test_instantiation_functorch 1.8169ms 1.5103ms 662.1158 Ops/s 644.0760 Ops/s $\color{#35bf28}+2.80\%$
test_exec_functorch 0.1759ms 0.1399ms 7.1496 KOps/s 6.9078 KOps/s $\color{#35bf28}+3.50\%$
test_exec_functional_call 0.1829ms 0.1315ms 7.6062 KOps/s 7.2346 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_exec_td_decorator 0.3656ms 0.1813ms 5.5171 KOps/s 5.3271 KOps/s $\color{#35bf28}+3.57\%$
test_vmap_mlp_speed_decorator[True-True] 0.8434ms 0.6936ms 1.4418 KOps/s 1.4737 KOps/s $\color{#d91a1a}-2.16\%$
test_vmap_mlp_speed_decorator[True-False] 0.8069ms 0.6862ms 1.4572 KOps/s 1.4724 KOps/s $\color{#d91a1a}-1.03\%$
test_vmap_mlp_speed_decorator[False-True] 0.7289ms 0.5953ms 1.6799 KOps/s 1.7122 KOps/s $\color{#d91a1a}-1.88\%$
test_vmap_mlp_speed_decorator[False-False] 0.7077ms 0.5922ms 1.6886 KOps/s 1.7044 KOps/s $\color{#d91a1a}-0.93\%$
test_vmap_transformer_speed_decorator[True-True] 20.0170ms 19.3777ms 51.6058 Ops/s 53.1433 Ops/s $\color{#d91a1a}-2.89\%$
test_vmap_transformer_speed_decorator[True-False] 20.1294ms 19.3573ms 51.6600 Ops/s 53.1043 Ops/s $\color{#d91a1a}-2.72\%$
test_vmap_transformer_speed_decorator[False-True] 19.9004ms 19.1882ms 52.1154 Ops/s 53.5428 Ops/s $\color{#d91a1a}-2.67\%$
test_vmap_transformer_speed_decorator[False-False] 19.6122ms 19.0202ms 52.5758 Ops/s 53.6541 Ops/s $\color{#d91a1a}-2.01\%$
test_to_module_speed[True] 1.4704ms 0.9613ms 1.0403 KOps/s 1.0363 KOps/s $\color{#35bf28}+0.39\%$
test_to_module_speed[False] 1.0314ms 0.9465ms 1.0565 KOps/s 1.0582 KOps/s $\color{#d91a1a}-0.16\%$
test_tc_init 68.7310μs 36.8840μs 27.1120 KOps/s 27.7665 KOps/s $\color{#d91a1a}-2.36\%$
test_tc_init_nested 0.1126ms 74.2653μs 13.4652 KOps/s 13.8627 KOps/s $\color{#d91a1a}-2.87\%$
test_tc_first_layer_tensor 21.1800μs 0.8028μs 1.2456 MOps/s 1.2496 MOps/s $\color{#d91a1a}-0.32\%$
test_tc_first_layer_nontensor 24.0400μs 2.2279μs 448.8475 KOps/s 448.5806 KOps/s $\color{#35bf28}+0.06\%$
test_tc_second_layer_tensor 10.6003μs 1.4171μs 705.6680 KOps/s 711.2167 KOps/s $\color{#d91a1a}-0.78\%$
test_tc_second_layer_nontensor 27.1610μs 2.9344μs 340.7850 KOps/s 339.7535 KOps/s $\color{#35bf28}+0.30\%$
test_unbind 0.2236s 12.2128ms 81.8812 Ops/s 143.6831 Ops/s $\textbf{\color{#d91a1a}-43.01\%}$
test_full_like 10.8512ms 9.6985ms 103.1082 Ops/s 101.2308 Ops/s $\color{#35bf28}+1.85\%$
test_zeros_like 9.4350ms 7.3612ms 135.8468 Ops/s 227.6491 Ops/s $\textbf{\color{#d91a1a}-40.33\%}$
test_ones_like 4.9592ms 4.4258ms 225.9467 Ops/s 226.4821 Ops/s $\color{#d91a1a}-0.24\%$
test_clone 12.3900ms 9.4592ms 105.7175 Ops/s 148.1773 Ops/s $\textbf{\color{#d91a1a}-28.65\%}$
test_squeeze 60.2210μs 9.7124μs 102.9608 KOps/s 102.7055 KOps/s $\color{#35bf28}+0.25\%$
test_unsqueeze 0.1297ms 72.8661μs 13.7238 KOps/s 13.5617 KOps/s $\color{#35bf28}+1.20\%$
test_split 0.3620ms 0.1547ms 6.4635 KOps/s 6.2375 KOps/s $\color{#35bf28}+3.62\%$
test_permute 0.2768ms 0.1789ms 5.5907 KOps/s 5.4153 KOps/s $\color{#35bf28}+3.24\%$
test_stack 52.2232ms 51.3137ms 19.4880 Ops/s 19.5148 Ops/s $\color{#d91a1a}-0.14\%$
test_cat 52.4159ms 51.1281ms 19.5587 Ops/s 19.5865 Ops/s $\color{#d91a1a}-0.14\%$

 else:
-    keys: set[str] = set(keys)
+    keys_set: set[str] = set(keys)
Contributor

Out of curiosity, is using a set much more efficient than always using the other option, or is there another reason?

Contributor Author

torch.compile used not to understand set(), that's all. I should check whether that's still the case.
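
For context on the ordering question above, here is a minimal sketch of computing the keys shared by several inputs while keeping the first input's order. It is not tensordict's actual code, and `common_keys_ordered` is a hypothetical helper name. Iterating over a Python set of strings can yield a different order from one interpreter run to the next, because string hashing is randomized per process unless PYTHONHASHSEED is fixed. Using sets only for membership tests and a list for the result avoids that.

```python
def common_keys_ordered(key_lists):
    """Return the keys present in every list, in the order of the first list.

    Hypothetical helper for illustration only (not part of tensordict).
    """
    first, *rest = key_lists
    # Sets are used purely for fast membership checks, never for iteration.
    rest_sets = [set(keys) for keys in rest]
    return [key for key in first if all(key in s for s in rest_sets)]


ordered = common_keys_ordered([["foo", "bar", "baz"], ["baz", "foo", "bar"]])
print(ordered)
```

Because the result is built by walking the first list, repeated calls (and repeated processes) see the same key order regardless of how the sets hash their elements.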

@@ -626,7 +626,6 @@ def stack_fn(key, values, is_not_init, is_tensor):
key: stack_fn(key, values, is_not_init, is_tensor)
for key, (values, is_not_init, is_tensor) in out.items()
}

Contributor

Not sure if this added space is on purpose.

         raise KeyError(
             f"got keys {keys} and {set(td.keys())} which are incompatible"
         )
-return keys
+if strict:
+    return keys
Contributor

Suggested change
-return keys
+return list(keys)

Pretty sure that's what you mean with your comment, but just to be on the safe side: right now, the return type is not consistent with the type annotation.
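
The typing concern can be illustrated with a small sketch over plain dicts. This is a stand-in built on assumptions, not the real `_check_keys`: the name `check_keys`, its signature, and the intersection behaviour for `strict=False` are hypothetical. The point is only that both branches return a `list[str]`, so the annotation holds and the order follows the first input.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def check_keys(dicts: Sequence[Mapping[str, object]], strict: bool = False) -> list[str]:
    """Hypothetical stand-in: validate keys and always return an ordered list."""
    if not dicts:
        raise ValueError("expected at least one mapping")
    keys = list(dicts[0].keys())  # order of the first input is the reference
    keys_set = set(keys)          # used for comparison and membership only
    for d in dicts[1:]:
        other = set(d.keys())
        if strict:
            if other != keys_set:
                raise KeyError(f"got keys {keys} and {other} which are incompatible")
        else:
            keys_set &= other     # assumption: non-strict mode keeps shared keys
    if strict:
        return keys
    # Filter the ordered list instead of returning the set itself.
    return [key for key in keys if key in keys_set]


shared = check_keys([{"foo": 1, "bar": 2}, {"bar": 3, "foo": 4, "baz": 5}])
print(shared)
```

Returning a filtered list in the non-strict branch keeps the declared return type honest and gives callers a stable iteration order.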

-return keys
+if strict:
+    return keys
+return keys_set
Contributor

Out of curiosity, which downstream functions would be impacted by this? In other words, in which contexts is _check_keys(strict=False) used?

tc1 = MyTensorClass(foo=torch.zeros((1,)), bar=torch.ones((1,)))

for _ in range(10000):
    assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
Contributor

Suggested change
-assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
+assert list(torch.stack([tc1, tc1], dim=0).keys()) == [

Contributor Author

@vmoens vmoens Feb 24, 2025

This was on purpose to avoid any artifacts caused by @tensorclass (if there had been any)

@vmoens vmoens added the bug Something isn't working label Feb 24, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 24, 2025
ghstack-source-id: a46518942d70508046c27351a68580e3957b0371
Pull Request resolved: #1230
@vmoens vmoens merged commit cd695ad into gh/vmoens/48/base Feb 24, 2025
22 of 35 checks passed
@vmoens vmoens deleted the gh/vmoens/48/head branch February 24, 2025 07:17
vmoens added a commit that referenced this pull request Feb 26, 2025
Labels
bug Something isn't working CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
4 participants