[BugFix] Fix non-deterministic key order in stack #1230
ghstack-source-id: 7f394789b783d6359a78a300aaf449eb25adb5e3
Pull Request resolved: #1230
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 43.2110μs | 20.4436μs | 48.9150 KOps/s | 48.7299 KOps/s | |
test_plain_set_stack_nested | 50.6950μs | 20.8489μs | 47.9642 KOps/s | 47.9784 KOps/s | |
test_plain_set_nested_inplace | 0.1050ms | 22.2425μs | 44.9590 KOps/s | 44.6018 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1024ms | 22.1717μs | 45.1026 KOps/s | 44.0466 KOps/s | |
test_items | 18.6050μs | 4.0844μs | 244.8367 KOps/s | 243.1780 KOps/s | |
test_items_nested | 0.6702ms | 0.4055ms | 2.4660 KOps/s | 2.4693 KOps/s | |
test_items_nested_locked | 0.6818ms | 0.4027ms | 2.4832 KOps/s | 2.4654 KOps/s | |
test_items_nested_leaf | 0.1840ms | 75.1847μs | 13.3006 KOps/s | 13.0728 KOps/s | |
test_items_stack_nested | 0.7141ms | 0.4055ms | 2.4662 KOps/s | 2.4575 KOps/s | |
test_items_stack_nested_leaf | 0.1354ms | 76.3399μs | 13.0993 KOps/s | 12.7170 KOps/s | |
test_items_stack_nested_locked | 0.5948ms | 0.4030ms | 2.4812 KOps/s | 2.4415 KOps/s | |
test_keys | 40.4360μs | 3.4818μs | 287.2101 KOps/s | 290.9562 KOps/s | |
test_keys_nested | 0.3000ms | 0.1635ms | 6.1147 KOps/s | 5.9699 KOps/s | |
test_keys_nested_locked | 1.8654ms | 0.1711ms | 5.8455 KOps/s | 5.8387 KOps/s | |
test_keys_nested_leaf | 0.2666ms | 0.1426ms | 7.0137 KOps/s | 6.7610 KOps/s | |
test_keys_stack_nested | 0.2633ms | 0.1620ms | 6.1741 KOps/s | 6.1400 KOps/s | |
test_keys_stack_nested_leaf | 0.2572ms | 0.1429ms | 6.9999 KOps/s | 7.0285 KOps/s | |
test_keys_stack_nested_locked | 0.2537ms | 0.1693ms | 5.9066 KOps/s | 5.8518 KOps/s | |
test_values | 8.7042μs | 1.0465μs | 955.5870 KOps/s | 980.5448 KOps/s | |
test_values_nested | 0.1368ms | 62.2929μs | 16.0532 KOps/s | 15.8145 KOps/s | |
test_values_nested_locked | 0.1396ms | 64.0850μs | 15.6043 KOps/s | 15.9154 KOps/s | |
test_values_nested_leaf | 0.1084ms | 70.8036μs | 14.1236 KOps/s | 13.9612 KOps/s | |
test_values_stack_nested | 0.1075ms | 61.8358μs | 16.1719 KOps/s | 15.8101 KOps/s | |
test_values_stack_nested_leaf | 0.1343ms | 71.7092μs | 13.9452 KOps/s | 13.6844 KOps/s | |
test_values_stack_nested_locked | 0.1175ms | 61.8034μs | 16.1803 KOps/s | 14.2689 KOps/s | |
test_membership | 13.4860μs | 0.8556μs | 1.1688 MOps/s | 1.1658 MOps/s | |
test_membership_nested | 0.1267ms | 2.9733μs | 336.3279 KOps/s | 347.6711 KOps/s | |
test_membership_nested_leaf | 43.0100μs | 2.8839μs | 346.7516 KOps/s | 345.1824 KOps/s | |
test_membership_stacked_nested | 20.2370μs | 2.8350μs | 352.7368 KOps/s | 350.6887 KOps/s | |
test_membership_stacked_nested_leaf | 23.7350μs | 2.8508μs | 350.7807 KOps/s | 350.4104 KOps/s | |
test_membership_nested_last | 40.3360μs | 4.2915μs | 233.0208 KOps/s | 230.2049 KOps/s | |
test_membership_nested_leaf_last | 53.6390μs | 4.3530μs | 229.7269 KOps/s | 227.0836 KOps/s | |
test_membership_stacked_nested_last | 65.3580μs | 4.2370μs | 236.0181 KOps/s | 163.0227 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.0480μs | 4.2760μs | 233.8611 KOps/s | 163.6846 KOps/s | |
test_nested_getleaf | 0.1174ms | 10.6863μs | 93.5776 KOps/s | 96.1400 KOps/s | |
test_nested_get | 67.1050μs | 10.0320μs | 99.6808 KOps/s | 100.1385 KOps/s | |
test_stacked_getleaf | 55.9140μs | 10.4238μs | 95.9342 KOps/s | 95.0175 KOps/s | |
test_stacked_get | 57.4580μs | 9.9430μs | 100.5733 KOps/s | 101.4031 KOps/s | |
test_nested_getitemleaf | 49.0510μs | 11.0856μs | 90.2073 KOps/s | 89.4288 KOps/s | |
test_nested_getitem | 53.0480μs | 10.7146μs | 93.3306 KOps/s | 93.2182 KOps/s | |
test_stacked_getitemleaf | 76.4720μs | 11.0340μs | 90.6287 KOps/s | 89.6607 KOps/s | |
test_stacked_getitem | 48.7910μs | 10.5963μs | 94.3725 KOps/s | 93.2167 KOps/s | |
test_lock_nested | 0.6604ms | 0.4121ms | 2.4268 KOps/s | 2.3660 KOps/s | |
test_lock_stack_nested | 0.8698ms | 0.4277ms | 2.3382 KOps/s | 2.3118 KOps/s | |
test_unlock_nested | 0.5307ms | 0.3404ms | 2.9375 KOps/s | 2.8297 KOps/s | |
test_unlock_stack_nested | 0.6680ms | 0.3457ms | 2.8929 KOps/s | 2.8521 KOps/s | |
test_flatten_speed | 0.2052ms | 99.1298μs | 10.0878 KOps/s | 10.0768 KOps/s | |
test_unflatten_speed | 0.8703ms | 0.5147ms | 1.9427 KOps/s | 1.9216 KOps/s | |
test_common_ops | 1.0171ms | 0.8069ms | 1.2393 KOps/s | 1.2062 KOps/s | |
test_creation | 24.5360μs | 2.4649μs | 405.6976 KOps/s | 403.1829 KOps/s | |
test_creation_empty | 55.2330μs | 12.0572μs | 82.9379 KOps/s | 80.7407 KOps/s | |
test_creation_nested_1 | 48.6010μs | 14.9599μs | 66.8453 KOps/s | 65.8944 KOps/s | |
test_creation_nested_2 | 71.2430μs | 19.3299μs | 51.7334 KOps/s | 49.1155 KOps/s | |
test_clone | 0.1324ms | 13.8149μs | 72.3856 KOps/s | 74.0819 KOps/s | |
test_getitem[int] | 0.9111ms | 12.6159μs | 79.2649 KOps/s | 75.1468 KOps/s | |
test_getitem[slice_int] | 0.1437ms | 24.2870μs | 41.1743 KOps/s | 39.6051 KOps/s | |
test_getitem[range] | 0.2091ms | 49.6223μs | 20.1522 KOps/s | 19.0930 KOps/s | |
test_getitem[tuple] | 0.1613ms | 21.1857μs | 47.2016 KOps/s | 48.0793 KOps/s | |
test_getitem[list] | 0.1951ms | 45.0990μs | 22.1734 KOps/s | 21.3076 KOps/s | |
test_setitem_dim[int] | 59.0400μs | 24.4546μs | 40.8922 KOps/s | 37.8227 KOps/s | |
test_setitem_dim[slice_int] | 0.1004ms | 50.3718μs | 19.8524 KOps/s | 19.4817 KOps/s | |
test_setitem_dim[range] | 0.1297ms | 75.7385μs | 13.2033 KOps/s | 12.8687 KOps/s | |
test_setitem_dim[tuple] | 76.7430μs | 40.0203μs | 24.9873 KOps/s | 23.9678 KOps/s | |
test_setitem | 95.4470μs | 20.9946μs | 47.6312 KOps/s | 46.7676 KOps/s | |
test_set | 93.0330μs | 20.3484μs | 49.1440 KOps/s | 48.1604 KOps/s | |
test_set_shared | 4.7566ms | 0.1841ms | 5.4322 KOps/s | 5.3877 KOps/s | |
test_update | 0.1518ms | 23.3543μs | 42.8187 KOps/s | 41.2737 KOps/s | |
test_update_nested | 0.1111ms | 34.2653μs | 29.1841 KOps/s | 28.7382 KOps/s | |
test_update__nested | 0.5138ms | 33.6621μs | 29.7070 KOps/s | 29.7508 KOps/s | |
test_set_nested | 0.1439ms | 22.5571μs | 44.3319 KOps/s | 44.1027 KOps/s | |
test_set_nested_new | 0.1023ms | 26.6810μs | 37.4799 KOps/s | 35.9929 KOps/s | |
test_select | 0.1101ms | 42.1049μs | 23.7502 KOps/s | 22.3412 KOps/s | |
test_select_nested | 0.1192ms | 62.5018μs | 15.9995 KOps/s | 15.9605 KOps/s | |
test_exclude_nested | 0.1451ms | 79.7787μs | 12.5347 KOps/s | 12.3764 KOps/s | |
test_empty[True] | 0.7430ms | 0.3999ms | 2.5004 KOps/s | 2.4816 KOps/s | |
test_empty[False] | 22.2140μs | 1.3567μs | 737.1051 KOps/s | 719.9135 KOps/s | |
test_unbind_speed | 0.3812ms | 0.2715ms | 3.6833 KOps/s | 3.5002 KOps/s | |
test_unbind_speed_stack0 | 0.4478ms | 0.2648ms | 3.7765 KOps/s | 3.6231 KOps/s | |
test_unbind_speed_stack1 | 0.1177s | 0.7460ms | 1.3405 KOps/s | 1.3271 KOps/s | |
test_split | 0.1182s | 1.7991ms | 555.8222 Ops/s | 551.8549 Ops/s | |
test_chunk | 0.1167s | 1.7662ms | 566.1730 Ops/s | 547.5572 Ops/s | |
test_consolidate_njt[False-None] | 8.5246ms | 7.9534ms | 125.7317 Ops/s | 119.6020 Ops/s | |
test_creation[device0] | 0.2937ms | 92.0728μs | 10.8610 KOps/s | 10.7947 KOps/s | |
test_creation_from_tensor | 3.8100ms | 95.9512μs | 10.4220 KOps/s | 10.2688 KOps/s | |
test_add_one[memmap_tensor0] | 0.1087ms | 5.2188μs | 191.6149 KOps/s | 194.1821 KOps/s | |
test_contiguous[memmap_tensor0] | 12.5630μs | 0.5042μs | 1.9834 MOps/s | 2.0139 MOps/s | |
test_stack[memmap_tensor0] | 19.2460μs | 3.4329μs | 291.3028 KOps/s | 284.3938 KOps/s | |
test_memmaptd_index | 1.4763ms | 0.2296ms | 4.3558 KOps/s | 4.2514 KOps/s | |
test_memmaptd_index_astensor | 0.6724ms | 0.3134ms | 3.1912 KOps/s | 3.1079 KOps/s | |
test_memmaptd_index_op | 1.0234ms | 0.5981ms | 1.6719 KOps/s | 1.6108 KOps/s | |
test_serialize_model | 0.2338s | 0.1374s | 7.2762 Ops/s | 7.6736 Ops/s | |
test_serialize_model_pickle | 0.4495s | 0.3961s | 2.5249 Ops/s | 2.5356 Ops/s | |
test_serialize_weights | 0.1212s | 0.1146s | 8.7250 Ops/s | 8.7044 Ops/s | |
test_serialize_weights_returnearly | 0.1714s | 0.1589s | 6.2919 Ops/s | 6.2357 Ops/s | |
test_serialize_weights_pickle | 1.2488s | 0.7093s | 1.4098 Ops/s | 2.5169 Ops/s | |
test_serialize_weights_filesystem | 0.1554s | 0.1414s | 7.0744 Ops/s | 6.1594 Ops/s | |
test_serialize_model_filesystem | 0.1477s | 0.1425s | 7.0179 Ops/s | 6.5015 Ops/s | |
test_reshape_pytree | 57.1660μs | 26.1035μs | 38.3091 KOps/s | 37.8446 KOps/s | |
test_reshape_td | 66.4440μs | 32.4738μs | 30.7940 KOps/s | 30.4783 KOps/s | |
test_view_pytree | 79.4180μs | 26.1308μs | 38.2690 KOps/s | 38.3334 KOps/s | |
test_view_td | 86.0710μs | 40.6592μs | 24.5947 KOps/s | 24.1626 KOps/s | |
test_unbind_pytree | 68.3480μs | 28.8901μs | 34.6139 KOps/s | 33.6925 KOps/s | |
test_unbind_td | 0.3108ms | 39.1480μs | 25.5441 KOps/s | 23.9361 KOps/s | |
test_split_pytree | 66.6740μs | 28.5756μs | 34.9949 KOps/s | 34.4071 KOps/s | |
test_split_td | 0.5000ms | 44.8492μs | 22.2969 KOps/s | 21.5978 KOps/s | |
test_add_pytree | 76.2620μs | 35.5719μs | 28.1121 KOps/s | 27.0514 KOps/s | |
test_add_td | 0.1266ms | 57.6482μs | 17.3466 KOps/s | 16.7729 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1146ms | 66.3548μs | 15.0705 KOps/s | 15.0427 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.5399ms | 0.1729ms | 5.7822 KOps/s | 5.8524 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1212ms | 45.8812μs | 21.7954 KOps/s | 22.0147 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.1830ms | 0.1199ms | 8.3396 KOps/s | 8.3201 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 69.0990μs | 29.1794μs | 34.2707 KOps/s | 35.7930 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1134ms | 57.6474μs | 17.3468 KOps/s | 16.9820 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1360ms | 78.3757μs | 12.7591 KOps/s | 12.2604 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1514ms | 65.6367μs | 15.2354 KOps/s | 14.9302 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1830ms | 0.1081ms | 9.2508 KOps/s | 9.3163 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2827ms | 0.2159ms | 4.6321 KOps/s | 4.6412 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1148ms | 47.9894μs | 20.8379 KOps/s | 21.5197 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1379ms | 68.8268μs | 14.5292 KOps/s | 15.0282 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1741ms | 0.1021ms | 9.7960 KOps/s | 10.0180 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3641ms | 0.2035ms | 4.9138 KOps/s | 4.8948 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4243ms | 0.2312ms | 4.3244 KOps/s | 4.3348 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1904ms | 0.1085ms | 9.2141 KOps/s | 9.2127 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1700ms | 65.7076μs | 15.2190 KOps/s | 15.7934 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1009ms | 48.7201μs | 20.5254 KOps/s | 20.6954 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2948ms | 0.1580ms | 6.3286 KOps/s | 6.2925 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1847ms | 0.1032ms | 9.6942 KOps/s | 9.9714 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 52.4880μs | 22.3563μs | 44.7301 KOps/s | 46.9171 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1292ms | 66.6908μs | 14.9946 KOps/s | 14.9486 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1621ms | 82.4780μs | 12.1244 KOps/s | 12.2005 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1424ms | 68.1572μs | 14.6720 KOps/s | 14.9660 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3389ms | 0.2173ms | 4.6024 KOps/s | 4.6935 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6796ms | 1.3869ms | 721.0552 Ops/s | 715.6496 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3320ms | 0.2097ms | 4.7687 KOps/s | 4.7844 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.8944ms | 0.8278ms | 1.2080 KOps/s | 1.1866 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5659ms | 0.4601ms | 2.1737 KOps/s | 2.1908 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.9013ms | 2.7844ms | 359.1436 Ops/s | 356.1348 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1029ms | 39.5588μs | 25.2788 KOps/s | 26.1761 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5805ms | 33.4850μs | 29.8641 KOps/s | 28.9679 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 70.3810μs | 31.2231μs | 32.0275 KOps/s | 32.1699 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 81.4120μs | 22.9394μs | 43.5931 KOps/s | 43.6558 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 87.5940μs | 32.6653μs | 30.6135 KOps/s | 31.7440 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 60.4530μs | 22.8923μs | 43.6828 KOps/s | 43.5594 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1143ms | 54.4876μs | 18.3528 KOps/s | 18.8602 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4794ms | 19.8321μs | 50.4233 KOps/s | 47.4330 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.3071ms | 46.7843μs | 21.3747 KOps/s | 21.6566 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 86.0000μs | 18.4432μs | 54.2206 KOps/s | 52.9776 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1178ms | 47.3489μs | 21.1198 KOps/s | 21.0983 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 81.2710μs | 18.5560μs | 53.8908 KOps/s | 53.0825 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1234ms | 55.0071μs | 18.1795 KOps/s | 18.2946 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8346ms | 19.5004μs | 51.2810 KOps/s | 47.6374 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 91.4210μs | 46.3826μs | 21.5598 KOps/s | 21.3290 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2501ms | 18.5885μs | 53.7968 KOps/s | 53.4589 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1130ms | 47.0299μs | 21.2631 KOps/s | 21.3329 KOps/s | |
test_compile_indexing[int-pytree-eager] | 52.2570μs | 18.8541μs | 53.0389 KOps/s | 52.3639 KOps/s | |
test_mod_add[eager] | 0.2291ms | 36.6435μs | 27.2900 KOps/s | 27.2470 KOps/s | |
test_mod_add[compile] | 0.1775ms | 66.4681μs | 15.0448 KOps/s | 15.1585 KOps/s | |
test_mod_add[compile-overhead] | 0.1194ms | 65.1540μs | 15.3483 KOps/s | 15.4324 KOps/s | |
test_mod_wrap[eager] | 0.3798ms | 0.2248ms | 4.4474 KOps/s | 4.2733 KOps/s | |
test_mod_wrap[compile] | 1.9177ms | 0.2318ms | 4.3138 KOps/s | 4.3156 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4157ms | 0.2292ms | 4.3628 KOps/s | 4.3763 KOps/s | |
test_mod_wrap_and_backward[eager] | 19.2336ms | 14.0778ms | 71.0340 Ops/s | 75.9768 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.2489ms | 11.8385ms | 84.4699 Ops/s | 85.7320 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.5252ms | 11.9681ms | 83.5553 Ops/s | 86.7800 Ops/s | |
test_seq_add[eager] | 0.2171ms | 0.1168ms | 8.5615 KOps/s | 8.0925 KOps/s | |
test_seq_add[compile] | 0.1418ms | 79.4022μs | 12.5941 KOps/s | 12.6755 KOps/s | |
test_seq_add[compile-overhead] | 0.1416ms | 77.0098μs | 12.9854 KOps/s | 13.2641 KOps/s | |
test_seq_wrap[eager] | 0.6130ms | 0.4506ms | 2.2192 KOps/s | 2.1705 KOps/s | |
test_seq_wrap[compile] | 0.4527ms | 0.2436ms | 4.1056 KOps/s | 4.0610 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4525ms | 0.2426ms | 4.1224 KOps/s | 4.0874 KOps/s | |
test_func_call_runtime[False-eager] | 1.0044ms | 0.5307ms | 1.8845 KOps/s | 1.7915 KOps/s | |
test_func_call_runtime[False-compile] | 0.9337ms | 0.4602ms | 2.1728 KOps/s | 2.2192 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8251ms | 0.4598ms | 2.1751 KOps/s | 2.2292 KOps/s | |
test_func_call_runtime[True-eager] | 0.9423ms | 0.7446ms | 1.3431 KOps/s | 1.3019 KOps/s | |
test_func_call_runtime[True-compile] | 0.6717ms | 0.4783ms | 2.0907 KOps/s | 2.1087 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8781ms | 0.4824ms | 2.0729 KOps/s | 2.1001 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8341ms | 0.5267ms | 1.8986 KOps/s | 1.8372 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5903ms | 0.4594ms | 2.1768 KOps/s | 2.2154 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6116ms | 0.4542ms | 2.2016 KOps/s | 2.2146 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0943ms | 0.8919ms | 1.1213 KOps/s | 1.0957 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2934ms | 0.7978ms | 1.2535 KOps/s | 1.2238 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1482ms | 0.8087ms | 1.2366 KOps/s | 1.2195 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7302ms | 1.9641ms | 509.1433 Ops/s | 508.6258 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8817ms | 0.5459ms | 1.8317 KOps/s | 1.8303 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0548ms | 0.5337ms | 1.8736 KOps/s | 1.8312 KOps/s | |
test_distributed | 1.8376ms | 0.1275ms | 7.8425 KOps/s | 7.6887 KOps/s | |
test_tdmodule | 48.7410μs | 27.2435μs | 36.7060 KOps/s | 34.6641 KOps/s | |
test_tdmodule_dispatch | 76.1920μs | 50.6658μs | 19.7372 KOps/s | 19.4587 KOps/s | |
test_tdseq | 68.8480μs | 28.7305μs | 34.8063 KOps/s | 31.1163 KOps/s | |
test_tdseq_dispatch | 0.1009ms | 55.6459μs | 17.9708 KOps/s | 17.2826 KOps/s | |
test_instantiation_functorch | 1.8158ms | 1.5294ms | 653.8487 Ops/s | 639.7495 Ops/s | |
test_exec_functorch | 0.3219ms | 0.1785ms | 5.6007 KOps/s | 5.5059 KOps/s | |
test_exec_functional_call | 0.4284ms | 0.1758ms | 5.6899 KOps/s | 5.5560 KOps/s | |
test_exec_td_decorator | 0.5529ms | 0.2358ms | 4.2410 KOps/s | 4.2077 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8864ms | 0.6622ms | 1.5101 KOps/s | 1.4624 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9552ms | 0.6588ms | 1.5179 KOps/s | 1.4817 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7756ms | 0.5341ms | 1.8724 KOps/s | 1.8520 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8406ms | 0.5331ms | 1.8758 KOps/s | 1.8448 KOps/s | |
test_to_module_speed[True] | 1.7920ms | 1.3460ms | 742.9190 Ops/s | 748.9401 Ops/s | |
test_to_module_speed[False] | 1.8161ms | 1.3093ms | 763.7872 Ops/s | 762.5334 Ops/s | |
test_tc_init | 88.1240μs | 47.0230μs | 21.2662 KOps/s | 21.5673 KOps/s | |
test_tc_init_nested | 0.1776ms | 92.7126μs | 10.7860 KOps/s | 10.8421 KOps/s | |
test_tc_first_layer_tensor | 24.2360μs | 1.5244μs | 656.0125 KOps/s | 657.4953 KOps/s | |
test_tc_first_layer_nontensor | 37.2490μs | 4.6631μs | 214.4512 KOps/s | 212.2381 KOps/s | |
test_tc_second_layer_tensor | 41.4980μs | 2.8347μs | 352.7658 KOps/s | 351.7544 KOps/s | |
test_tc_second_layer_nontensor | 36.1280μs | 5.9795μs | 167.2385 KOps/s | 167.5207 KOps/s | |
test_unbind | 0.2496s | 13.7236ms | 72.8674 Ops/s | 61.0796 Ops/s | |
test_full_like | 9.8533ms | 8.9375ms | 111.8884 Ops/s | 110.4715 Ops/s | |
test_zeros_like | 4.5841ms | 3.1326ms | 319.2235 Ops/s | 291.7101 Ops/s | |
test_ones_like | 4.7212ms | 3.4424ms | 290.4918 Ops/s | 266.1826 Ops/s | |
test_clone | 6.7760ms | 5.4144ms | 184.6943 Ops/s | 170.7657 Ops/s | |
test_squeeze | 70.6020μs | 12.6189μs | 79.2464 KOps/s | 77.8174 KOps/s | |
test_unsqueeze | 0.1688ms | 93.3263μs | 10.7151 KOps/s | 10.4098 KOps/s | |
test_split | 0.4892ms | 0.1929ms | 5.1832 KOps/s | 5.0210 KOps/s | |
test_permute | 0.3776ms | 0.2024ms | 4.9417 KOps/s | 4.8355 KOps/s | |
test_stack | 34.2332ms | 25.8088ms | 38.7465 Ops/s | 37.7087 Ops/s | |
test_cat | 33.7938ms | 25.8652ms | 38.6620 Ops/s | 38.1060 Ops/s |
tensordict/utils.py
Outdated
raise KeyError(
    f"got keys {keys} and {set(td.keys())} which are incompatible"
)
return keys
if strict:
    return keys
We should actually make it a list
return keys
if strict:
    return keys
return keys_set
If keys can be exclusive, their order becomes arbitrary
Out of curiosity, which downstream functions would be impacted by this? In other words, in which context is `_check_keys(strict=False)` used?
Yes, when using lazy stacks iirc
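To illustrate the ordering concern raised in this thread, here is a minimal sketch of why returning a `set` of shared keys gives an arbitrary order, and how a list-based intersection keeps the order of the first input. The helper `ordered_common_keys` and the sample key names are hypothetical and only for illustration; this is not the actual `_check_keys` implementation in tensordict.

```python
from typing import Hashable, Iterable, List, Sequence


def ordered_common_keys(
    key_collections: Sequence[Iterable[Hashable]],
) -> List[Hashable]:
    """Return the keys present in every collection, ordered as in the first one.

    Hypothetical helper for illustration; not part of tensordict.
    """
    if not key_collections:
        return []
    first, *rest = key_collections
    # Sets are used only for fast membership tests, never for ordering.
    rest_sets = [set(keys) for keys in rest]
    seen = set()
    common = []
    for key in first:
        if key in seen:
            continue  # skip duplicates in the first collection
        seen.add(key)
        if all(key in keys for keys in rest_sets):
            common.append(key)
    return common


# Made-up key names standing in for the keys of two stacked items.
keys_a = ["obs", "action", "reward", "done"]
keys_b = ["done", "obs", "action", "extra"]

# A set intersection has the right members, but its iteration order is not
# guaranteed and can differ between interpreter runs for string keys.
as_set = set(keys_a).intersection(keys_b)

# The list-based version always follows the order of keys_a.
as_list = ordered_common_keys([keys_a, keys_b])
print(as_list)  # ['obs', 'action', 'done']
```

With string keys, Python's hash randomization means iterating `as_set` may yield a different order from one process to the next, which is one plausible source of the non-determinism this PR targets. The list-based result does not depend on hashing for its order.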
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 28.2000μs | 13.0047μs | 76.8955 KOps/s | 79.2964 KOps/s | |
test_plain_set_stack_nested | 43.7810μs | 13.1528μs | 76.0296 KOps/s | 78.1436 KOps/s | |
test_plain_set_nested_inplace | 0.4074ms | 14.2059μs | 70.3932 KOps/s | 72.7399 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3997ms | 14.0612μs | 71.1177 KOps/s | 72.8358 KOps/s | |
test_items | 28.7600μs | 2.9000μs | 344.8305 KOps/s | 340.4051 KOps/s | |
test_items_nested | 0.7489ms | 0.3676ms | 2.7201 KOps/s | 2.7614 KOps/s | |
test_items_nested_locked | 0.7446ms | 0.3634ms | 2.7518 KOps/s | 2.7564 KOps/s | |
test_items_nested_leaf | 0.4461ms | 60.4382μs | 16.5458 KOps/s | 16.5079 KOps/s | |
test_items_stack_nested | 0.7494ms | 0.3619ms | 2.7635 KOps/s | 2.7834 KOps/s | |
test_items_stack_nested_leaf | 94.4420μs | 60.5620μs | 16.5120 KOps/s | 16.4225 KOps/s | |
test_items_stack_nested_locked | 0.7610ms | 0.3676ms | 2.7205 KOps/s | 2.7492 KOps/s | |
test_keys | 0.3879ms | 3.4944μs | 286.1681 KOps/s | 291.5206 KOps/s | |
test_keys_nested | 0.4714ms | 88.6287μs | 11.2830 KOps/s | 11.4948 KOps/s | |
test_keys_nested_locked | 0.8432ms | 94.2892μs | 10.6057 KOps/s | 10.8126 KOps/s | |
test_keys_nested_leaf | 0.1046ms | 79.4801μs | 12.5818 KOps/s | 12.8115 KOps/s | |
test_keys_stack_nested | 0.4741ms | 87.7444μs | 11.3967 KOps/s | 11.5249 KOps/s | |
test_keys_stack_nested_leaf | 0.4623ms | 79.2761μs | 12.6141 KOps/s | 12.8244 KOps/s | |
test_keys_stack_nested_locked | 0.4885ms | 93.5064μs | 10.6945 KOps/s | 10.8617 KOps/s | |
test_values | 64.1110μs | 0.8549μs | 1.1698 MOps/s | 1.1566 MOps/s | |
test_values_nested | 0.4281ms | 37.1446μs | 26.9218 KOps/s | 27.3140 KOps/s | |
test_values_nested_locked | 0.4279ms | 39.1748μs | 25.5266 KOps/s | 25.9346 KOps/s | |
test_values_nested_leaf | 0.4304ms | 42.4965μs | 23.5313 KOps/s | 23.7715 KOps/s | |
test_values_stack_nested | 0.4228ms | 37.6970μs | 26.5273 KOps/s | 27.1806 KOps/s | |
test_values_stack_nested_leaf | 71.4210μs | 42.4535μs | 23.5552 KOps/s | 23.6614 KOps/s | |
test_values_stack_nested_locked | 0.4422ms | 39.1398μs | 25.5494 KOps/s | 25.7724 KOps/s | |
test_membership | 19.9839μs | 0.5013μs | 1.9949 MOps/s | 2.0157 MOps/s | |
test_membership_nested | 0.2004ms | 2.0075μs | 498.1239 KOps/s | 496.5029 KOps/s | |
test_membership_nested_leaf | 0.1977ms | 2.0518μs | 487.3661 KOps/s | 498.6331 KOps/s | |
test_membership_stacked_nested | 26.4400μs | 2.1237μs | 470.8866 KOps/s | 482.1684 KOps/s | |
test_membership_stacked_nested_leaf | 0.3928ms | 2.1009μs | 475.9909 KOps/s | 481.1493 KOps/s | |
test_membership_nested_last | 32.9910μs | 3.1255μs | 319.9494 KOps/s | 325.0588 KOps/s | |
test_membership_nested_leaf_last | 0.4245ms | 3.1198μs | 320.5371 KOps/s | 320.2674 KOps/s | |
test_membership_stacked_nested_last | 30.0410μs | 3.0979μs | 322.8004 KOps/s | 328.1469 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.2400μs | 3.0960μs | 322.9954 KOps/s | 327.4948 KOps/s | |
test_nested_getleaf | 0.4161ms | 6.2310μs | 160.4870 KOps/s | 161.6806 KOps/s | |
test_nested_get | 0.3996ms | 5.9221μs | 168.8582 KOps/s | 166.0395 KOps/s | |
test_stacked_getleaf | 28.4000μs | 6.1399μs | 162.8691 KOps/s | 162.6176 KOps/s | |
test_stacked_get | 0.4027ms | 5.8258μs | 171.6497 KOps/s | 173.0998 KOps/s | |
test_nested_getitemleaf | 35.7600μs | 6.3632μs | 157.1541 KOps/s | 154.5819 KOps/s | |
test_nested_getitem | 0.3916ms | 6.0739μs | 164.6386 KOps/s | 164.2798 KOps/s | |
test_stacked_getitemleaf | 43.1610μs | 6.3967μs | 156.3312 KOps/s | 155.9057 KOps/s | |
test_stacked_getitem | 0.3957ms | 5.9702μs | 167.4975 KOps/s | 166.2375 KOps/s | |
test_lock_nested | 9.6300ms | 0.3446ms | 2.9020 KOps/s | 2.8738 KOps/s | |
test_lock_stack_nested | 0.4041ms | 0.3382ms | 2.9566 KOps/s | 2.9005 KOps/s | |
test_unlock_nested | 0.3892ms | 0.2815ms | 3.5525 KOps/s | 3.5642 KOps/s | |
test_unlock_stack_nested | 0.3187ms | 0.2791ms | 3.5824 KOps/s | 3.5353 KOps/s | |
test_flatten_speed | 0.1080ms | 77.9501μs | 12.8287 KOps/s | 12.9273 KOps/s | |
test_unflatten_speed | 0.7034ms | 0.3199ms | 3.1264 KOps/s | 3.1381 KOps/s | |
test_common_ops | 0.7644ms | 0.6266ms | 1.5960 KOps/s | 1.6326 KOps/s | |
test_creation | 0.1402ms | 1.7515μs | 570.9505 KOps/s | 576.4118 KOps/s | |
test_creation_empty | 0.3981ms | 9.5693μs | 104.5013 KOps/s | 112.0848 KOps/s | |
test_creation_nested_1 | 37.8410μs | 11.3535μs | 88.0789 KOps/s | 94.6292 KOps/s | |
test_creation_nested_2 | 47.9000μs | 13.9463μs | 71.7034 KOps/s | 74.6022 KOps/s | |
test_clone | 0.4034ms | 10.5967μs | 94.3693 KOps/s | 90.8953 KOps/s | |
test_getitem[int] | 1.3544ms | 10.5696μs | 94.6105 KOps/s | 93.3829 KOps/s | |
test_getitem[slice_int] | 0.1112ms | 20.2142μs | 49.4701 KOps/s | 47.6388 KOps/s | |
test_getitem[range] | 0.1522ms | 36.6548μs | 27.2816 KOps/s | 26.7366 KOps/s | |
test_getitem[tuple] | 0.4192ms | 17.8721μs | 55.9531 KOps/s | 54.8944 KOps/s | |
test_getitem[list] | 0.1354ms | 31.9175μs | 31.3307 KOps/s | 29.6886 KOps/s | |
test_setitem_dim[int] | 39.3300μs | 18.5717μs | 53.8454 KOps/s | 50.9990 KOps/s | |
test_setitem_dim[slice_int] | 70.9410μs | 37.9418μs | 26.3561 KOps/s | 25.8934 KOps/s | |
test_setitem_dim[range] | 78.2010μs | 51.7807μs | 19.3122 KOps/s | 18.8563 KOps/s | |
test_setitem_dim[tuple] | 52.4510μs | 31.9635μs | 31.2857 KOps/s | 31.4109 KOps/s | |
test_setitem | 45.9210μs | 15.6339μs | 63.9634 KOps/s | 63.2270 KOps/s | |
test_set | 61.8210μs | 15.3076μs | 65.3271 KOps/s | 66.6374 KOps/s | |
test_set_shared | 0.5855ms | 0.1573ms | 6.3571 KOps/s | 6.3795 KOps/s | |
test_update | 0.4163ms | 19.1607μs | 52.1901 KOps/s | 55.1358 KOps/s | |
test_update_nested | 60.1110μs | 24.3641μs | 41.0440 KOps/s | 41.5892 KOps/s | |
test_update__nested | 0.5553ms | 25.1373μs | 39.7815 KOps/s | 39.3940 KOps/s | |
test_set_nested | 58.0210μs | 16.9137μs | 59.1237 KOps/s | 59.6240 KOps/s | |
test_set_nested_new | 55.2410μs | 19.0586μs | 52.4699 KOps/s | 53.8588 KOps/s | |
test_select | 0.4499ms | 30.8582μs | 32.4063 KOps/s | 33.4355 KOps/s | |
test_select_nested | 0.4604ms | 43.6235μs | 22.9234 KOps/s | 22.8678 KOps/s | |
test_exclude_nested | 0.1034ms | 63.0002μs | 15.8730 KOps/s | 15.8760 KOps/s | |
test_empty[True] | 0.6913ms | 0.2974ms | 3.3623 KOps/s | 3.4229 KOps/s | |
test_empty[False] | 39.2237μs | 0.8240μs | 1.2135 MOps/s | 1.2004 MOps/s | |
test_to | 92.7610μs | 55.0285μs | 18.1724 KOps/s | 18.2858 KOps/s | |
test_to_nonblocking | 0.4475ms | 46.7800μs | 21.3767 KOps/s | 20.8156 KOps/s | |
test_unbind_speed | 0.2737ms | 0.2395ms | 4.1752 KOps/s | 4.2048 KOps/s | |
test_unbind_speed_stack0 | 0.6293ms | 0.2342ms | 4.2701 KOps/s | 4.1388 KOps/s | |
test_unbind_speed_stack1 | 95.4707ms | 0.7374ms | 1.3561 KOps/s | 1.3403 KOps/s | |
test_split | 96.7774ms | 1.5768ms | 634.1976 Ops/s | 627.9906 Ops/s | |
test_chunk | 0.1017s | 1.5819ms | 632.1510 Ops/s | 622.4277 Ops/s | |
test_consolidate[False-None] | 0.1013s | 2.9856ms | 334.9453 Ops/s | 337.4001 Ops/s | |
test_consolidate[default-None] | 2.1002ms | 1.6850ms | 593.4672 Ops/s | 599.3801 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9010ms | 1.7115ms | 584.2972 Ops/s | 581.4107 Ops/s | |
test_consolidate_njt[False-None] | 6.9941ms | 6.6016ms | 151.4778 Ops/s | 155.8323 Ops/s | |
test_to[False-False-None] | 1.8178ms | 1.6766ms | 596.4608 Ops/s | 592.2260 Ops/s | |
test_to[True-False-None] | 1.6539ms | 1.3043ms | 766.6979 Ops/s | 770.2144 Ops/s | |
test_to[within-False-None] | 4.2283ms | 4.0796ms | 245.1236 Ops/s | 244.9105 Ops/s | |
test_to[True-default-None] | 5.4602ms | 5.0966ms | 196.2086 Ops/s | 196.9032 Ops/s | |
test_to_njt[False-False-None] | 7.1291ms | 6.8079ms | 146.8889 Ops/s | 144.9245 Ops/s | |
test_to_njt[True-False-None] | 5.8936ms | 5.5129ms | 181.3927 Ops/s | 185.2851 Ops/s | |
test_to_njt[within-False-None] | 12.8468ms | 12.2341ms | 81.7390 Ops/s | 84.5975 Ops/s | |
test_creation[device0] | 0.6443ms | 78.8273μs | 12.6860 KOps/s | 12.3755 KOps/s | |
test_creation_from_tensor | 0.4458ms | 85.5727μs | 11.6860 KOps/s | 11.7580 KOps/s | |
test_add_one[memmap_tensor0] | 0.5303ms | 6.6008μs | 151.4957 KOps/s | 147.6743 KOps/s | |
test_contiguous[memmap_tensor0] | 20.2833μs | 0.4036μs | 2.4777 MOps/s | 2.4653 MOps/s | |
test_stack[memmap_tensor0] | 27.7510μs | 4.2606μs | 234.7080 KOps/s | 229.3484 KOps/s | |
test_memmaptd_index | 1.8043ms | 0.2380ms | 4.2015 KOps/s | 3.9429 KOps/s | |
test_memmaptd_index_astensor | 0.6987ms | 0.2985ms | 3.3500 KOps/s | 3.2723 KOps/s | |
test_memmaptd_index_op | 0.9629ms | 0.5896ms | 1.6959 KOps/s | 1.7074 KOps/s | |
test_serialize_model | 0.1309s | 0.1296s | 7.7143 Ops/s | 7.7120 Ops/s | |
test_serialize_model_pickle | 1.3483s | 1.1896s | 0.8406 Ops/s | 0.8245 Ops/s | |
test_serialize_weights | 0.1303s | 0.1291s | 7.7444 Ops/s | 7.7026 Ops/s | |
test_serialize_weights_returnearly | 0.3447s | 61.6574ms | 16.2186 Ops/s | 11.4226 Ops/s | |
test_serialize_weights_pickle | 1.3779s | 1.1903s | 0.8401 Ops/s | 0.8235 Ops/s | |
test_reshape_pytree | 49.3510μs | 21.8449μs | 45.7773 KOps/s | 44.9844 KOps/s | |
test_reshape_td | 60.4410μs | 25.7586μs | 38.8220 KOps/s | 36.4430 KOps/s | |
test_view_pytree | 52.0210μs | 21.6275μs | 46.2374 KOps/s | 46.3249 KOps/s | |
test_view_td | 70.5410μs | 31.1466μs | 32.1062 KOps/s | 30.0345 KOps/s | |
test_unbind_pytree | 0.1261ms | 27.7633μs | 36.0188 KOps/s | 35.3364 KOps/s | |
test_unbind_td | 0.6684ms | 35.5819μs | 28.1042 KOps/s | 27.2754 KOps/s | |
test_split_pytree | 57.3010μs | 28.9664μs | 34.5228 KOps/s | 33.9438 KOps/s | |
test_split_td | 0.8013ms | 37.7831μs | 26.4668 KOps/s | 25.4574 KOps/s | |
test_add_pytree | 68.5210μs | 33.1755μs | 30.1428 KOps/s | 28.1791 KOps/s | |
test_add_td | 95.0720μs | 50.4831μs | 19.8086 KOps/s | 19.1080 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1878ms | 0.1201ms | 8.3263 KOps/s | 7.8939 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2248ms | 0.1298ms | 7.7021 KOps/s | 7.5327 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1406ms | 93.3130μs | 10.7166 KOps/s | 10.5601 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.5993ms | 0.1475ms | 6.7804 KOps/s | 6.7753 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.6210μs | 29.2442μs | 34.1948 KOps/s | 43.4980 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 61.8810μs | 29.5111μs | 33.8856 KOps/s | 34.4072 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4782ms | 63.6880μs | 15.7016 KOps/s | 15.5587 KOps/s | |
test_compile_copy_nested[pytree-eager] | 78.0620μs | 48.6781μs | 20.5431 KOps/s | 20.4686 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1836ms | 0.1416ms | 7.0620 KOps/s | 7.1414 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6036ms | 0.2168ms | 4.6125 KOps/s | 4.6956 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1491ms | 96.7590μs | 10.3350 KOps/s | 10.3862 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4452ms | 56.2862μs | 17.7664 KOps/s | 18.2957 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2548ms | 0.1370ms | 7.3000 KOps/s | 7.3100 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8617ms | 0.4643ms | 2.1537 KOps/s | 2.1618 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6602ms | 0.2624ms | 3.8111 KOps/s | 3.8917 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1847ms | 0.1417ms | 7.0574 KOps/s | 6.9922 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1707ms | 69.1642μs | 14.4583 KOps/s | 14.8332 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1370ms | 97.2141μs | 10.2866 KOps/s | 10.3624 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5185ms | 0.3934ms | 2.5417 KOps/s | 2.5829 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1734ms | 0.1320ms | 7.5737 KOps/s | 7.4496 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.4021ms | 18.1425μs | 55.1192 KOps/s | 56.0682 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4111ms | 30.8108μs | 32.4562 KOps/s | 32.1784 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1129ms | 70.3218μs | 14.2203 KOps/s | 14.4159 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4308ms | 53.0964μs | 18.8337 KOps/s | 19.2983 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6011ms | 0.3858ms | 2.5918 KOps/s | 2.2764 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.5927ms | 2.5087ms | 398.6159 Ops/s | 401.6857 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5623ms | 0.4243ms | 2.3569 KOps/s | 2.1822 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7999ms | 2.5372ms | 394.1366 Ops/s | 385.3130 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6384ms | 0.1162ms | 8.6032 KOps/s | 8.3516 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5793ms | 77.5716μs | 12.8913 KOps/s | 11.8898 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.4062ms | 0.1053ms | 9.4944 KOps/s | 8.9732 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1605ms | 69.6714μs | 14.3531 KOps/s | 14.4950 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1683ms | 0.1127ms | 8.8706 KOps/s | 8.9259 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1461ms | 70.3001μs | 14.2247 KOps/s | 14.1781 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1436ms | 0.1036ms | 9.6528 KOps/s | 10.1395 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1569ms | 18.7461μs | 53.3446 KOps/s | 55.8395 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1509ms | 0.1006ms | 9.9392 KOps/s | 10.4851 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1643ms | 15.8990μs | 62.8972 KOps/s | 63.1123 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1719ms | 0.1009ms | 9.9093 KOps/s | 10.1826 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 55.1910μs | 15.8387μs | 63.1363 KOps/s | 64.2301 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1418ms | 99.6458μs | 10.0355 KOps/s | 9.9196 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5829ms | 16.8287μs | 59.4222 KOps/s | 58.6032 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2265ms | 0.1010ms | 9.9044 KOps/s | 10.1046 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 55.2710μs | 15.7240μs | 63.5971 KOps/s | 63.7605 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1565ms | 0.1004ms | 9.9595 KOps/s | 10.3886 KOps/s | |
test_compile_indexing[int-pytree-eager] | 76.4510μs | 15.6748μs | 63.7966 KOps/s | 63.1634 KOps/s | |
test_mod_add[eager] | 77.2610μs | 38.8933μs | 25.7114 KOps/s | 26.1189 KOps/s | |
test_mod_add[compile] | 0.2232ms | 81.1361μs | 12.3250 KOps/s | 12.5918 KOps/s | |
test_mod_add[compile-overhead] | 0.3395ms | 0.1687ms | 5.9291 KOps/s | 5.5642 KOps/s | |
test_mod_wrap[eager] | 0.3994ms | 0.2486ms | 4.0220 KOps/s | 3.7653 KOps/s | |
test_mod_wrap[compile] | 0.3715ms | 0.2894ms | 3.4558 KOps/s | 3.5342 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4233ms | 3.9453ms | 253.4653 Ops/s | 277.5998 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.9184ms | 1.3798ms | 724.7348 Ops/s | 702.0287 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4392ms | 1.3414ms | 745.4941 Ops/s | 727.5782 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.5124ms | 1.0169ms | 983.4290 Ops/s | 932.2187 Ops/s | |
test_seq_add[eager] | 0.1829ms | 0.1167ms | 8.5662 KOps/s | 8.4213 KOps/s | |
test_seq_add[compile] | 0.1488ms | 88.9370μs | 11.2439 KOps/s | 11.4798 KOps/s | |
test_seq_add[compile-overhead] | 0.2410ms | 0.1286ms | 7.7750 KOps/s | 7.8013 KOps/s | |
test_seq_wrap[eager] | 0.5072ms | 0.4375ms | 2.2856 KOps/s | 2.3320 KOps/s | |
test_seq_wrap[compile] | 0.3723ms | 0.2965ms | 3.3727 KOps/s | 3.3146 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2996ms | 0.2229ms | 4.4867 KOps/s | 4.4612 KOps/s | |
test_func_call_runtime[False-eager] | 0.7958ms | 0.7303ms | 1.3693 KOps/s | 1.3595 KOps/s | |
test_func_call_runtime[False-compile] | 0.9881ms | 0.7312ms | 1.3676 KOps/s | 1.3602 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4073ms | 0.3610ms | 2.7698 KOps/s | 2.7772 KOps/s | |
test_func_call_runtime[True-eager] | 0.9542ms | 0.8886ms | 1.1254 KOps/s | 1.1106 KOps/s | |
test_func_call_runtime[True-compile] | 0.8265ms | 0.7570ms | 1.3210 KOps/s | 1.3164 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4403ms | 0.3834ms | 2.6082 KOps/s | 2.6369 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7741ms | 0.7192ms | 1.3904 KOps/s | 1.3669 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7928ms | 0.7341ms | 1.3622 KOps/s | 1.3568 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4516ms | 0.3618ms | 2.7640 KOps/s | 2.7667 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1178ms | 0.9944ms | 1.0056 KOps/s | 993.4817 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.0772ms | 0.9744ms | 1.0263 KOps/s | 973.4933 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1730ms | 0.9863ms | 1.0138 KOps/s | 1.0024 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4946ms | 2.0863ms | 479.3280 Ops/s | 481.5663 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9932ms | 0.8356ms | 1.1967 KOps/s | 1.2571 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4721ms | 0.4139ms | 2.4163 KOps/s | 2.4258 KOps/s | |
test_distributed | 2.8847ms | 0.2818ms | 3.5485 KOps/s | 8.6967 KOps/s | |
test_tdmodule | 37.3100μs | 22.4014μs | 44.6400 KOps/s | 47.1617 KOps/s | |
test_tdmodule_dispatch | 0.2322ms | 39.8361μs | 25.1029 KOps/s | 26.7031 KOps/s | |
test_tdseq | 42.4010μs | 22.3333μs | 44.7762 KOps/s | 46.8204 KOps/s | |
test_tdseq_dispatch | 66.6410μs | 41.5287μs | 24.0797 KOps/s | 25.0002 KOps/s | |
test_instantiation_functorch | 1.8169ms | 1.5103ms | 662.1158 Ops/s | 644.0760 Ops/s | |
test_exec_functorch | 0.1759ms | 0.1399ms | 7.1496 KOps/s | 6.9078 KOps/s | |
test_exec_functional_call | 0.1829ms | 0.1315ms | 7.6062 KOps/s | 7.2346 KOps/s | |
test_exec_td_decorator | 0.3656ms | 0.1813ms | 5.5171 KOps/s | 5.3271 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8434ms | 0.6936ms | 1.4418 KOps/s | 1.4737 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8069ms | 0.6862ms | 1.4572 KOps/s | 1.4724 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7289ms | 0.5953ms | 1.6799 KOps/s | 1.7122 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7077ms | 0.5922ms | 1.6886 KOps/s | 1.7044 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0170ms | 19.3777ms | 51.6058 Ops/s | 53.1433 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.1294ms | 19.3573ms | 51.6600 Ops/s | 53.1043 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.9004ms | 19.1882ms | 52.1154 Ops/s | 53.5428 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6122ms | 19.0202ms | 52.5758 Ops/s | 53.6541 Ops/s | |
test_to_module_speed[True] | 1.4704ms | 0.9613ms | 1.0403 KOps/s | 1.0363 KOps/s | |
test_to_module_speed[False] | 1.0314ms | 0.9465ms | 1.0565 KOps/s | 1.0582 KOps/s | |
test_tc_init | 68.7310μs | 36.8840μs | 27.1120 KOps/s | 27.7665 KOps/s | |
test_tc_init_nested | 0.1126ms | 74.2653μs | 13.4652 KOps/s | 13.8627 KOps/s | |
test_tc_first_layer_tensor | 21.1800μs | 0.8028μs | 1.2456 MOps/s | 1.2496 MOps/s | |
test_tc_first_layer_nontensor | 24.0400μs | 2.2279μs | 448.8475 KOps/s | 448.5806 KOps/s | |
test_tc_second_layer_tensor | 10.6003μs | 1.4171μs | 705.6680 KOps/s | 711.2167 KOps/s | |
test_tc_second_layer_nontensor | 27.1610μs | 2.9344μs | 340.7850 KOps/s | 339.7535 KOps/s | |
test_unbind | 0.2236s | 12.2128ms | 81.8812 Ops/s | 143.6831 Ops/s | |
test_full_like | 10.8512ms | 9.6985ms | 103.1082 Ops/s | 101.2308 Ops/s | |
test_zeros_like | 9.4350ms | 7.3612ms | 135.8468 Ops/s | 227.6491 Ops/s | |
test_ones_like | 4.9592ms | 4.4258ms | 225.9467 Ops/s | 226.4821 Ops/s | |
test_clone | 12.3900ms | 9.4592ms | 105.7175 Ops/s | 148.1773 Ops/s | |
test_squeeze | 60.2210μs | 9.7124μs | 102.9608 KOps/s | 102.7055 KOps/s | |
test_unsqueeze | 0.1297ms | 72.8661μs | 13.7238 KOps/s | 13.5617 KOps/s | |
test_split | 0.3620ms | 0.1547ms | 6.4635 KOps/s | 6.2375 KOps/s | |
test_permute | 0.2768ms | 0.1789ms | 5.5907 KOps/s | 5.4153 KOps/s | |
test_stack | 52.2232ms | 51.3137ms | 19.4880 Ops/s | 19.5148 Ops/s | |
test_cat | 52.4159ms | 51.1281ms | 19.5587 Ops/s | 19.5865 Ops/s |
    else:
        keys: set[str] = set(keys)
        keys_set: set[str] = set(keys)
Out of curiosity, is it much more efficient to use `set` rather than always using the other option? Or is there another reason?
`torch.compile` used to not understand `set()`, that's all. I should check whether that's still the case.
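To illustrate why the choice of container matters for this fix: a `set` makes no promise about iteration order, whereas a `dict` preserves insertion order (guaranteed by the language since Python 3.7). The following is a minimal, library-free sketch; `dedupe_ordered` is a hypothetical helper introduced for illustration and is not part of tensordict.

```python
def dedupe_ordered(keys):
    """Remove duplicates while keeping first-seen order.

    Hypothetical helper for illustration; not part of tensordict.
    """
    # dict preserves insertion order, so the first occurrence of each key wins.
    return list(dict.fromkeys(keys))


keys = ["foo", "bar", "foo", "baz"]
ordered = dedupe_ordered(keys)  # always ['foo', 'bar', 'baz']
unordered = set(keys)           # same members, but iteration order is unspecified
```

Keeping a set around for membership tests is still fine; the point is only that the returned sequence should be built from an ordered container.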
@@ -626,7 +626,6 @@ def stack_fn(key, values, is_not_init, is_tensor):
        key: stack_fn(key, values, is_not_init, is_tensor)
        for key, (values, is_not_init, is_tensor) in out.items()
    }

Not sure if this added space is on purpose.
tensordict/utils.py
Outdated
            raise KeyError(
                f"got keys {keys} and {set(td.keys())} which are incompatible"
            )
    return keys
    if strict:
        return keys
Suggested change:
- return keys
+ return list(keys)
Pretty sure that's what you mean with your comment, but just to be on the safe side: right now, the return type is not consistent with the type annotation.
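A rough sketch of what a consistent return type could look like, using plain mappings instead of tensordicts. `check_keys_sketch` is a hypothetical stand-in and not the library's `_check_keys`; in particular, it assumes that the non-strict path keeps only the keys shared by all inputs, which this thread does not confirm.

```python
from __future__ import annotations

from typing import Mapping, Sequence


def check_keys_sketch(
    dicts: Sequence[Mapping[str, object]], strict: bool = False
) -> list[str]:
    """Hypothetical stand-in, not tensordict's `_check_keys`.

    Assumption: with strict=False, only keys shared by all inputs are kept.
    """
    if not dicts:
        return []
    keys = list(dicts[0].keys())  # key order of the first input is the reference
    keys_set = set(keys)          # used for membership tests only
    for d in dicts[1:]:
        other = set(d.keys())
        if strict:
            if other != keys_set:
                raise KeyError(
                    f"got keys {keys} and {sorted(other)} which are incompatible"
                )
        else:
            keys_set &= other
    # Both branches return a list, so the annotation holds and the order is stable.
    return [k for k in keys if k in keys_set]


a = {"foo": 1, "bar": 2, "baz": 3}
b = {"baz": 0, "foo": 0}
shared = check_keys_sketch([a, b])             # ['foo', 'baz']
same = check_keys_sketch([a, a], strict=True)  # ['foo', 'bar', 'baz']
```

Filtering the ordered list through the set at the end means callers never iterate over the set itself, regardless of which branch was taken.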
    return keys
    if strict:
        return keys
    return keys_set
Out of curiosity, which downstream functions would be impacted by this? In other words, in which context is `_check_keys(strict=False)` used?
tc1 = MyTensorClass(foo=torch.zeros((1,)), bar=torch.ones((1,)))

for _ in range(10000):
    assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
Suggested change:
- assert list(torch.stack([tc1, tc1], dim=0)._tensordict.keys()) == [
+ assert list(torch.stack([tc1, tc1], dim=0).keys()) == [
This was on purpose, to avoid any artifacts caused by `@tensorclass` (if there had been any).
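The repeated-assertion pattern used in the test above can be sketched without torch or tensordict. Both `assert_stable_key_order` and `stack_like` are hypothetical names introduced for illustration; `stack_like` only mimics a stack that follows the key order of its first input and makes no claim about how the library implements it.

```python
def assert_stable_key_order(make_mapping, expected, n_trials=1000):
    """Call `make_mapping` repeatedly and check its key order never changes.

    Hypothetical helper for illustration; `make_mapping` is any zero-argument
    callable returning a mapping.
    """
    for trial in range(n_trials):
        got = list(make_mapping().keys())
        assert got == expected, f"trial {trial}: got {got}, expected {expected}"


def stack_like():
    # Stand-in for a stack operation: merge two inputs key by key,
    # following the key order of the first input.
    first = {"foo": [0.0], "bar": [1.0]}
    second = {"foo": [0.0], "bar": [1.0]}
    return {key: [first[key], second[key]] for key in first}


assert_stable_key_order(stack_like, ["foo", "bar"])
```

Running many trials in one process only catches order changes that occur within a single run; the trial count here is arbitrary.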
ghstack-source-id: a46518942d70508046c27351a68580e3957b0371 Pull Request resolved: #1230
(cherry picked from commit c35d7aa)
Stack from ghstack (oldest at bottom):