[BugFix] Fix tensordict.get in TDModule tensor retrieval for NonTensorData #1249
Merged
vmoens added a commit that referenced this pull request on Mar 5, 2025:
…rData ghstack-source-id: 240b54a9d87bdcf654fc29a95f9508fcc851d0a9 Pull Request resolved: #1249
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.2190μs | 21.8175μs | 45.8347 KOps/s | 48.0924 KOps/s | |
test_plain_set_stack_nested | 80.0790μs | 22.0482μs | 45.3552 KOps/s | 47.6989 KOps/s | |
test_plain_set_nested_inplace | 67.6460μs | 24.0551μs | 41.5713 KOps/s | 43.9995 KOps/s | |
test_plain_set_stack_nested_inplace | 60.2430μs | 23.8331μs | 41.9585 KOps/s | 43.9591 KOps/s | |
test_items | 27.2620μs | 4.1229μs | 242.5456 KOps/s | 241.6047 KOps/s | |
test_items_nested | 0.7160ms | 0.4085ms | 2.4478 KOps/s | 2.4700 KOps/s | |
test_items_nested_locked | 0.6192ms | 0.4090ms | 2.4452 KOps/s | 2.4683 KOps/s | |
test_items_nested_leaf | 0.1421ms | 78.9229μs | 12.6706 KOps/s | 12.9298 KOps/s | |
test_items_stack_nested | 0.5023ms | 0.4119ms | 2.4275 KOps/s | 2.4525 KOps/s | |
test_items_stack_nested_leaf | 0.1408ms | 77.9166μs | 12.8342 KOps/s | 12.8071 KOps/s | |
test_items_stack_nested_locked | 0.6002ms | 0.4140ms | 2.4156 KOps/s | 2.4747 KOps/s | |
test_keys | 32.8910μs | 3.5024μs | 285.5196 KOps/s | 284.8073 KOps/s | |
test_keys_nested | 0.2679ms | 0.1718ms | 5.8204 KOps/s | 6.0503 KOps/s | |
test_keys_nested_locked | 1.8915ms | 0.1770ms | 5.6512 KOps/s | 5.9282 KOps/s | |
test_keys_nested_leaf | 0.2417ms | 0.1486ms | 6.7277 KOps/s | 7.0167 KOps/s | |
test_keys_stack_nested | 0.3254ms | 0.1691ms | 5.9135 KOps/s | 6.1475 KOps/s | |
test_keys_stack_nested_leaf | 0.2975ms | 0.1506ms | 6.6387 KOps/s | 7.0062 KOps/s | |
test_keys_stack_nested_locked | 0.3159ms | 0.1738ms | 5.7529 KOps/s | 5.8628 KOps/s | |
test_values | 8.3656μs | 1.0686μs | 935.7960 KOps/s | 964.6756 KOps/s | |
test_values_nested | 0.1084ms | 62.6128μs | 15.9712 KOps/s | 16.2839 KOps/s | |
test_values_nested_locked | 0.1551ms | 63.4690μs | 15.7557 KOps/s | 16.2688 KOps/s | |
test_values_nested_leaf | 0.1310ms | 72.2197μs | 13.8466 KOps/s | 14.2201 KOps/s | |
test_values_stack_nested | 0.1144ms | 62.8572μs | 15.9091 KOps/s | 15.5645 KOps/s | |
test_values_stack_nested_leaf | 0.1480ms | 72.4076μs | 13.8107 KOps/s | 14.2236 KOps/s | |
test_values_stack_nested_locked | 0.1065ms | 63.9517μs | 15.6368 KOps/s | 16.2321 KOps/s | |
test_membership | 33.1020μs | 0.9118μs | 1.0968 MOps/s | 1.1668 MOps/s | |
test_membership_nested | 38.9930μs | 2.9962μs | 333.7512 KOps/s | 347.2926 KOps/s | |
test_membership_nested_leaf | 58.4310μs | 2.9422μs | 339.8843 KOps/s | 350.0283 KOps/s | |
test_membership_stacked_nested | 26.0890μs | 2.9551μs | 338.3988 KOps/s | 346.1704 KOps/s | |
test_membership_stacked_nested_leaf | 23.2440μs | 2.9771μs | 335.8946 KOps/s | 346.6153 KOps/s | |
test_membership_nested_last | 33.0120μs | 4.5148μs | 221.4931 KOps/s | 231.8916 KOps/s | |
test_membership_nested_leaf_last | 45.5650μs | 4.4737μs | 223.5262 KOps/s | 228.4797 KOps/s | |
test_membership_stacked_nested_last | 29.6860μs | 4.4365μs | 225.4025 KOps/s | 226.5458 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.8920μs | 4.5074μs | 221.8591 KOps/s | 231.3487 KOps/s | |
test_nested_getleaf | 39.4340μs | 10.7933μs | 92.6502 KOps/s | 94.6815 KOps/s | |
test_nested_get | 44.6640μs | 10.0883μs | 99.1244 KOps/s | 99.7812 KOps/s | |
test_stacked_getleaf | 49.4120μs | 10.5053μs | 95.1904 KOps/s | 95.6405 KOps/s | |
test_stacked_get | 38.8420μs | 10.0622μs | 99.3822 KOps/s | 100.8575 KOps/s | |
test_nested_getitemleaf | 55.0430μs | 11.2413μs | 88.9573 KOps/s | 90.5670 KOps/s | |
test_nested_getitem | 38.3920μs | 10.7606μs | 92.9316 KOps/s | 95.6637 KOps/s | |
test_stacked_getitemleaf | 38.0610μs | 11.2245μs | 89.0910 KOps/s | 89.8590 KOps/s | |
test_stacked_getitem | 49.7130μs | 11.2520μs | 88.8728 KOps/s | 94.5642 KOps/s | |
test_lock_nested | 0.6929ms | 0.4130ms | 2.4212 KOps/s | 2.4218 KOps/s | |
test_lock_stack_nested | 0.7355ms | 0.4258ms | 2.3484 KOps/s | 2.3229 KOps/s | |
test_unlock_nested | 0.5141ms | 0.3381ms | 2.9575 KOps/s | 2.9979 KOps/s | |
test_unlock_stack_nested | 0.6336ms | 0.3422ms | 2.9223 KOps/s | 2.9201 KOps/s | |
test_flatten_speed | 0.1905ms | 0.1020ms | 9.8078 KOps/s | 9.9371 KOps/s | |
test_unflatten_speed | 0.9397ms | 0.5353ms | 1.8682 KOps/s | 1.9243 KOps/s | |
test_common_ops | 5.4260ms | 0.8566ms | 1.1674 KOps/s | 1.1746 KOps/s | |
test_creation | 55.8640μs | 2.5041μs | 399.3385 KOps/s | 408.5451 KOps/s | |
test_creation_empty | 43.7510μs | 13.3432μs | 74.9446 KOps/s | 76.7723 KOps/s | |
test_creation_nested_1 | 45.4450μs | 16.4829μs | 60.6691 KOps/s | 61.7576 KOps/s | |
test_creation_nested_2 | 49.8530μs | 20.6666μs | 48.3872 KOps/s | 48.0627 KOps/s | |
test_clone | 46.7980μs | 13.3675μs | 74.8082 KOps/s | 74.0283 KOps/s | |
test_getitem[int] | 0.7836ms | 12.4955μs | 80.0291 KOps/s | 74.1236 KOps/s | |
test_getitem[slice_int] | 0.1495ms | 24.3288μs | 41.1035 KOps/s | 38.6138 KOps/s | |
test_getitem[range] | 0.1951ms | 49.7842μs | 20.0867 KOps/s | 18.6511 KOps/s | |
test_getitem[tuple] | 0.1466ms | 20.2834μs | 49.3013 KOps/s | 46.3911 KOps/s | |
test_getitem[list] | 0.1885ms | 45.9009μs | 21.7861 KOps/s | 20.7449 KOps/s | |
test_setitem_dim[int] | 69.0690μs | 25.6552μs | 38.9785 KOps/s | 37.7521 KOps/s | |
test_setitem_dim[slice_int] | 82.8950μs | 52.2449μs | 19.1406 KOps/s | 18.7697 KOps/s | |
test_setitem_dim[range] | 0.1051ms | 76.4192μs | 13.0857 KOps/s | 12.6719 KOps/s | |
test_setitem_dim[tuple] | 77.9260μs | 40.6994μs | 24.5704 KOps/s | 22.8581 KOps/s | |
test_setitem | 79.3380μs | 21.2838μs | 46.9841 KOps/s | 44.7758 KOps/s | |
test_set | 62.9970μs | 20.4279μs | 48.9526 KOps/s | 45.5822 KOps/s | |
test_set_shared | 0.4281ms | 0.1812ms | 5.5188 KOps/s | 5.2307 KOps/s | |
test_update | 0.1359ms | 27.3422μs | 36.5735 KOps/s | 36.2614 KOps/s | |
test_update_nested | 0.1251ms | 42.9697μs | 23.2722 KOps/s | 23.1995 KOps/s | |
test_update__nested | 0.4727ms | 34.2715μs | 29.1788 KOps/s | 28.9616 KOps/s | |
test_set_nested | 71.4940μs | 22.8557μs | 43.7527 KOps/s | 43.1096 KOps/s | |
test_set_nested_new | 78.0560μs | 28.1143μs | 35.5690 KOps/s | 35.3124 KOps/s | |
test_select | 0.1016ms | 43.6802μs | 22.8937 KOps/s | 22.0905 KOps/s | |
test_select_nested | 0.1351ms | 63.2194μs | 15.8179 KOps/s | 15.6396 KOps/s | |
test_exclude_nested | 0.1493ms | 81.6715μs | 12.2442 KOps/s | 12.0103 KOps/s | |
test_empty[True] | 0.4929ms | 0.4100ms | 2.4392 KOps/s | 2.3735 KOps/s | |
test_empty[False] | 10.5900μs | 1.3921μs | 718.3310 KOps/s | 725.1638 KOps/s | |
test_unbind_speed | 0.3244ms | 0.2648ms | 3.7758 KOps/s | 3.6446 KOps/s | |
test_unbind_speed_stack0 | 0.4591ms | 0.2629ms | 3.8042 KOps/s | 3.7147 KOps/s | |
test_unbind_speed_stack1 | 0.1121s | 0.7386ms | 1.3538 KOps/s | 1.1820 KOps/s | |
test_split | 0.1438s | 1.8037ms | 554.4135 Ops/s | 538.7125 Ops/s | |
test_chunk | 0.1326s | 1.7822ms | 561.1187 Ops/s | 624.9480 Ops/s | |
test_consolidate_njt[False-None] | 9.7642ms | 8.2979ms | 120.5127 Ops/s | 105.4403 Ops/s | |
test_creation[device0] | 4.7092ms | 95.6980μs | 10.4495 KOps/s | 10.9153 KOps/s | |
test_creation_from_tensor | 0.2733ms | 95.4104μs | 10.4810 KOps/s | 10.1273 KOps/s | |
test_add_one[memmap_tensor0] | 0.1087ms | 4.9266μs | 202.9780 KOps/s | 194.9047 KOps/s | |
test_contiguous[memmap_tensor0] | 25.7980μs | 0.5169μs | 1.9346 MOps/s | 1.9621 MOps/s | |
test_stack[memmap_tensor0] | 21.5000μs | 3.4154μs | 292.7902 KOps/s | 281.6305 KOps/s | |
test_memmaptd_index | 0.9006ms | 0.2327ms | 4.2967 KOps/s | 4.2528 KOps/s | |
test_memmaptd_index_astensor | 0.5456ms | 0.3180ms | 3.1443 KOps/s | 3.0909 KOps/s | |
test_memmaptd_index_op | 0.8609ms | 0.6117ms | 1.6349 KOps/s | 1.6014 KOps/s | |
test_serialize_model | 0.2542s | 0.1432s | 6.9841 Ops/s | 7.6912 Ops/s | |
test_serialize_model_pickle | 0.4846s | 0.3983s | 2.5108 Ops/s | 2.5397 Ops/s | |
test_serialize_weights | 0.1279s | 0.1216s | 8.2267 Ops/s | 8.5226 Ops/s | |
test_serialize_weights_returnearly | 0.1928s | 0.1695s | 5.8993 Ops/s | 5.9504 Ops/s | |
test_serialize_weights_pickle | 0.4735s | 0.3939s | 2.5388 Ops/s | 2.5752 Ops/s | |
test_serialize_weights_filesystem | 0.1600s | 0.1541s | 6.4873 Ops/s | 6.6778 Ops/s | |
test_serialize_model_filesystem | 0.1682s | 0.1568s | 6.3780 Ops/s | 6.3629 Ops/s | |
test_reshape_pytree | 0.1002ms | 26.8336μs | 37.2667 KOps/s | 38.0698 KOps/s | |
test_reshape_td | 79.4790μs | 32.5743μs | 30.6990 KOps/s | 30.5904 KOps/s | |
test_view_pytree | 83.7660μs | 25.7801μs | 38.7896 KOps/s | 38.3533 KOps/s | |
test_view_td | 97.8830μs | 39.9061μs | 25.0588 KOps/s | 24.4587 KOps/s | |
test_unbind_pytree | 70.6320μs | 28.9853μs | 34.5002 KOps/s | 33.9982 KOps/s | |
test_unbind_td | 0.4278ms | 39.2386μs | 25.4851 KOps/s | 25.0078 KOps/s | |
test_split_pytree | 0.1122ms | 28.6713μs | 34.8781 KOps/s | 34.3366 KOps/s | |
test_split_td | 0.5666ms | 43.6630μs | 22.9027 KOps/s | 21.7317 KOps/s | |
test_add_pytree | 96.1100μs | 35.5644μs | 28.1180 KOps/s | 27.8319 KOps/s | |
test_add_td | 0.1864ms | 57.9464μs | 17.2573 KOps/s | 16.6600 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1523ms | 71.2964μs | 14.0260 KOps/s | 14.3624 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3658ms | 0.1742ms | 5.7393 KOps/s | 5.8111 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1552ms | 47.6098μs | 21.0041 KOps/s | 21.0345 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2045ms | 0.1184ms | 8.4476 KOps/s | 8.3316 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 77.2750μs | 29.6540μs | 33.7223 KOps/s | 34.4364 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1393ms | 58.4650μs | 17.1043 KOps/s | 17.3181 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1747ms | 79.1782μs | 12.6297 KOps/s | 12.6622 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1340ms | 66.2276μs | 15.0995 KOps/s | 15.1573 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1875ms | 0.1082ms | 9.2417 KOps/s | 9.1239 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3065ms | 0.2180ms | 4.5873 KOps/s | 4.6229 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1282ms | 48.6954μs | 20.5358 KOps/s | 20.9332 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1409ms | 67.8564μs | 14.7370 KOps/s | 15.0005 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1796ms | 0.1004ms | 9.9626 KOps/s | 9.8312 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3755ms | 0.2011ms | 4.9735 KOps/s | 4.9944 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4108ms | 0.2369ms | 4.2208 KOps/s | 4.3055 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1934ms | 0.1087ms | 9.2001 KOps/s | 9.0664 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1872ms | 65.7477μs | 15.2097 KOps/s | 16.0868 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2461ms | 49.9836μs | 20.0066 KOps/s | 20.2035 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2480ms | 0.1561ms | 6.4062 KOps/s | 6.4400 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3388ms | 0.1042ms | 9.5930 KOps/s | 9.5150 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 98.1530μs | 21.8154μs | 45.8392 KOps/s | 46.1402 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1264ms | 68.1709μs | 14.6690 KOps/s | 14.6474 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.3006ms | 82.3604μs | 12.1418 KOps/s | 12.3167 KOps/s | |
test_compile_copy_flat[pytree-eager] | 1.3657ms | 67.1153μs | 14.8997 KOps/s | 14.5165 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3873ms | 0.2180ms | 4.5868 KOps/s | 4.5550 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9597ms | 1.4192ms | 704.6413 Ops/s | 716.3671 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3019ms | 0.2135ms | 4.6847 KOps/s | 4.6832 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.6011ms | 0.8374ms | 1.1942 KOps/s | 1.2227 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.9465ms | 0.4791ms | 2.0872 KOps/s | 2.1867 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.5119ms | 2.8719ms | 348.2015 Ops/s | 355.1349 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1110ms | 40.4323μs | 24.7327 KOps/s | 25.2243 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5883ms | 32.7436μs | 30.5404 KOps/s | 29.9406 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 79.2390μs | 31.6704μs | 31.5752 KOps/s | 31.4550 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.9753ms | 22.3907μs | 44.6615 KOps/s | 43.3565 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1113ms | 31.6286μs | 31.6170 KOps/s | 29.6791 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 62.3760μs | 22.3693μs | 44.7041 KOps/s | 43.0653 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1152ms | 54.3024μs | 18.4154 KOps/s | 18.4514 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5998ms | 19.3780μs | 51.6050 KOps/s | 47.2635 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1012ms | 45.6275μs | 21.9166 KOps/s | 21.0990 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 67.1450μs | 18.4066μs | 54.3284 KOps/s | 52.7320 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1048ms | 46.9988μs | 21.2771 KOps/s | 20.4126 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 83.3160μs | 18.3890μs | 54.3802 KOps/s | 53.2602 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1344ms | 55.5682μs | 17.9959 KOps/s | 17.9639 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.4596ms | 19.2726μs | 51.8871 KOps/s | 48.7177 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1260ms | 47.2509μs | 21.1636 KOps/s | 20.7239 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1084ms | 18.3738μs | 54.4253 KOps/s | 53.9245 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1101ms | 46.7249μs | 21.4019 KOps/s | 20.5592 KOps/s | |
test_compile_indexing[int-pytree-eager] | 72.1250μs | 18.1202μs | 55.1872 KOps/s | 53.6468 KOps/s | |
test_mod_add[eager] | 92.4430μs | 38.2190μs | 26.1650 KOps/s | 26.5064 KOps/s | |
test_mod_add[compile] | 0.1155ms | 67.1145μs | 14.8999 KOps/s | 14.4596 KOps/s | |
test_mod_add[compile-overhead] | 0.1458ms | 64.4099μs | 15.5256 KOps/s | 14.3398 KOps/s | |
test_mod_wrap[eager] | 0.3732ms | 0.2281ms | 4.3835 KOps/s | 4.2830 KOps/s | |
test_mod_wrap[compile] | 2.5664ms | 0.2312ms | 4.3251 KOps/s | 4.2621 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3452ms | 0.2284ms | 4.3787 KOps/s | 4.2289 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.2386ms | 13.2542ms | 75.4475 Ops/s | 70.0471 Ops/s | |
test_mod_wrap_and_backward[compile] | 17.6029ms | 12.3308ms | 81.0977 Ops/s | 83.7007 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.0677ms | 11.9771ms | 83.4924 Ops/s | 82.6890 Ops/s | |
test_seq_add[eager] | 0.2395ms | 0.1218ms | 8.2071 KOps/s | 8.0145 KOps/s | |
test_seq_add[compile] | 0.1577ms | 79.3270μs | 12.6061 KOps/s | 12.7994 KOps/s | |
test_seq_add[compile-overhead] | 0.1495ms | 76.1522μs | 13.1316 KOps/s | 12.7295 KOps/s | |
test_seq_wrap[eager] | 0.7527ms | 0.4611ms | 2.1688 KOps/s | 2.1979 KOps/s | |
test_seq_wrap[compile] | 0.3639ms | 0.2442ms | 4.0953 KOps/s | 4.0207 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4306ms | 0.2459ms | 4.0663 KOps/s | 3.8407 KOps/s | |
test_func_call_runtime[False-eager] | 1.0095ms | 0.5424ms | 1.8437 KOps/s | 1.7865 KOps/s | |
test_func_call_runtime[False-compile] | 0.5456ms | 0.4457ms | 2.2436 KOps/s | 2.1956 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5764ms | 0.4421ms | 2.2620 KOps/s | 2.2168 KOps/s | |
test_func_call_runtime[True-eager] | 1.0574ms | 0.7615ms | 1.3131 KOps/s | 1.3044 KOps/s | |
test_func_call_runtime[True-compile] | 0.8467ms | 0.4643ms | 2.1539 KOps/s | 2.1019 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7416ms | 0.4645ms | 2.1527 KOps/s | 2.1051 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.6986ms | 0.5357ms | 1.8669 KOps/s | 1.8285 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7215ms | 0.4406ms | 2.2695 KOps/s | 2.2144 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5589ms | 0.4448ms | 2.2484 KOps/s | 2.2027 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4895ms | 0.9013ms | 1.1096 KOps/s | 1.0858 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2912ms | 0.8022ms | 1.2465 KOps/s | 1.2268 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2834ms | 0.8075ms | 1.2385 KOps/s | 1.2123 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.9152ms | 1.9132ms | 522.6959 Ops/s | 508.5980 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7195ms | 0.5461ms | 1.8312 KOps/s | 1.8333 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0767ms | 0.5507ms | 1.8159 KOps/s | 1.8299 KOps/s | |
test_distributed | 0.2638ms | 0.1275ms | 7.8429 KOps/s | 7.6703 KOps/s | |
test_tdmodule | 55.5340μs | 28.2468μs | 35.4022 KOps/s | 34.0703 KOps/s | |
test_tdmodule_dispatch | 0.1050ms | 57.8760μs | 17.2783 KOps/s | 18.5709 KOps/s | |
test_tdseq | 50.3140μs | 29.8362μs | 33.5163 KOps/s | 30.8763 KOps/s | |
test_tdseq_dispatch | 0.1084ms | 56.6767μs | 17.6439 KOps/s | 16.9563 KOps/s | |
test_instantiation_functorch | 2.0229ms | 1.5376ms | 650.3532 Ops/s | 641.5213 Ops/s | |
test_exec_functorch | 0.4182ms | 0.1775ms | 5.6328 KOps/s | 5.2438 KOps/s | |
test_exec_functional_call | 0.2377ms | 0.1701ms | 5.8789 KOps/s | 5.4475 KOps/s | |
test_exec_td_decorator | 0.4995ms | 0.2334ms | 4.2842 KOps/s | 4.1068 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9051ms | 0.6679ms | 1.4973 KOps/s | 1.4884 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0100ms | 0.6644ms | 1.5051 KOps/s | 1.4785 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7632ms | 0.5379ms | 1.8589 KOps/s | 1.8326 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8265ms | 0.5366ms | 1.8636 KOps/s | 1.8430 KOps/s | |
test_to_module_speed[True] | 1.9391ms | 1.3465ms | 742.6627 Ops/s | 746.9328 Ops/s | |
test_to_module_speed[False] | 2.4197ms | 1.3201ms | 757.4924 Ops/s | 760.3559 Ops/s | |
test_tc_init | 96.8510μs | 49.6295μs | 20.1493 KOps/s | 19.8210 KOps/s | |
test_tc_init_nested | 0.1636ms | 99.8125μs | 10.0188 KOps/s | 9.9350 KOps/s | |
test_tc_first_layer_tensor | 16.1200μs | 1.4943μs | 669.2174 KOps/s | 645.6936 KOps/s | |
test_tc_first_layer_nontensor | 24.1850μs | 4.6372μs | 215.6481 KOps/s | 207.8805 KOps/s | |
test_tc_second_layer_tensor | 19.5860μs | 2.7936μs | 357.9671 KOps/s | 344.1959 KOps/s | |
test_tc_second_layer_nontensor | 47.0280μs | 5.9039μs | 169.3809 KOps/s | 165.9873 KOps/s | |
test_unbind | 0.2884s | 14.8746ms | 67.2288 Ops/s | 70.2088 Ops/s | |
test_full_like | 14.0307ms | 8.9726ms | 111.4507 Ops/s | 108.1766 Ops/s | |
test_zeros_like | 4.9252ms | 3.1086ms | 321.6931 Ops/s | 185.2934 Ops/s | |
test_ones_like | 5.3385ms | 3.6838ms | 271.4574 Ops/s | 163.8343 Ops/s | |
test_clone | 11.7958ms | 7.3599ms | 135.8717 Ops/s | 136.8564 Ops/s | |
test_squeeze | 63.6300μs | 13.0551μs | 76.5982 KOps/s | 79.8446 KOps/s | |
test_unsqueeze | 0.1752ms | 95.2871μs | 10.4946 KOps/s | 10.4942 KOps/s | |
test_split | 0.4850ms | 0.1961ms | 5.0993 KOps/s | 5.0236 KOps/s | |
test_permute | 0.3370ms | 0.2057ms | 4.8625 KOps/s | 4.9240 KOps/s | |
test_stack | 34.8070ms | 29.4967ms | 33.9022 Ops/s | 34.1199 Ops/s | |
test_cat | 38.8881ms | 29.9612ms | 33.3765 Ops/s | 32.1995 Ops/s |
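
The Change column is empty in this capture. As a rough sketch, a relative change can be derived from the two throughput columns, assuming that "Ops" is the run on this PR's branch and "Ops on Repo HEAD" is the baseline. The helper names `parse_ops` and `relative_change` and the sign convention (positive means the PR is faster) are illustrative assumptions, not part of the benchmark tooling.

```python
# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '45.8347 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(pr: str, head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumed convention: positive means the PR run is faster than HEAD.
    """
    pr_ops, head_ops = parse_ops(pr), parse_ops(head)
    return (pr_ops - head_ops) / head_ops * 100.0


# Two rows copied from the table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("45.8347 KOps/s", "48.0924 KOps/s"),
    "test_zeros_like": ("321.6931 Ops/s", "185.2934 Ops/s"),
}

for name, (pr, head) in rows.items():
    print(f"{name}: {relative_change(pr, head):+.1f}%")
```

Single-run differences of a few percent, as in most rows here, may well be within measurement noise, so a value computed this way is best read as a hint rather than evidence of a regression or speed-up.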
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1074ms | 11.4216μs | 87.5533 KOps/s | 80.4147 KOps/s | |
test_plain_set_stack_nested | 35.4400μs | 11.5189μs | 86.8136 KOps/s | 79.9237 KOps/s | |
test_plain_set_nested_inplace | 57.4010μs | 12.6087μs | 79.3101 KOps/s | 73.6004 KOps/s | |
test_plain_set_stack_nested_inplace | 45.0210μs | 12.5037μs | 79.9764 KOps/s | 73.9982 KOps/s | |
test_items | 31.5100μs | 2.9514μs | 338.8177 KOps/s | 343.7729 KOps/s | |
test_items_nested | 0.4065ms | 0.3644ms | 2.7444 KOps/s | 2.6925 KOps/s | |
test_items_nested_locked | 0.4622ms | 0.3711ms | 2.6947 KOps/s | 2.6767 KOps/s | |
test_items_nested_leaf | 0.1909ms | 59.8718μs | 16.7023 KOps/s | 16.4843 KOps/s | |
test_items_stack_nested | 0.3979ms | 0.3633ms | 2.7526 KOps/s | 2.7151 KOps/s | |
test_items_stack_nested_leaf | 96.0220μs | 60.4073μs | 16.5543 KOps/s | 16.5188 KOps/s | |
test_items_stack_nested_locked | 0.4375ms | 0.3688ms | 2.7112 KOps/s | 2.7018 KOps/s | |
test_keys | 27.9200μs | 3.4350μs | 291.1179 KOps/s | 292.4147 KOps/s | |
test_keys_nested | 0.1205ms | 87.6936μs | 11.4033 KOps/s | 11.2441 KOps/s | |
test_keys_nested_locked | 0.7953ms | 94.1573μs | 10.6205 KOps/s | 10.6393 KOps/s | |
test_keys_nested_leaf | 0.1035ms | 78.9524μs | 12.6659 KOps/s | 12.5863 KOps/s | |
test_keys_stack_nested | 0.1509ms | 88.5551μs | 11.2924 KOps/s | 11.3381 KOps/s | |
test_keys_stack_nested_leaf | 0.1079ms | 78.7996μs | 12.6904 KOps/s | 12.6293 KOps/s | |
test_keys_stack_nested_locked | 0.1336ms | 93.9092μs | 10.6486 KOps/s | 10.6134 KOps/s | |
test_values | 6.1100μs | 0.8495μs | 1.1772 MOps/s | 1.1725 MOps/s | |
test_values_nested | 73.3610μs | 36.9580μs | 27.0578 KOps/s | 26.5055 KOps/s | |
test_values_nested_locked | 92.9610μs | 38.8879μs | 25.7149 KOps/s | 24.9448 KOps/s | |
test_values_nested_leaf | 0.2058ms | 41.9322μs | 23.8480 KOps/s | 23.3882 KOps/s | |
test_values_stack_nested | 0.2274ms | 37.3130μs | 26.8003 KOps/s | 26.4292 KOps/s | |
test_values_stack_nested_leaf | 83.2810μs | 42.2324μs | 23.6785 KOps/s | 23.4756 KOps/s | |
test_values_stack_nested_locked | 0.1130ms | 38.8244μs | 25.7570 KOps/s | 25.1197 KOps/s | |
test_membership | 1.8855μs | 0.5073μs | 1.9713 MOps/s | 1.9871 MOps/s | |
test_membership_nested | 16.5355μs | 2.0247μs | 493.9083 KOps/s | 478.8865 KOps/s | |
test_membership_nested_leaf | 16.1300μs | 2.0351μs | 491.3691 KOps/s | 488.9973 KOps/s | |
test_membership_stacked_nested | 26.4500μs | 2.1195μs | 471.8169 KOps/s | 472.7817 KOps/s | |
test_membership_stacked_nested_leaf | 21.2110μs | 2.0750μs | 481.9356 KOps/s | 477.3244 KOps/s | |
test_membership_nested_last | 33.6010μs | 3.0706μs | 325.6721 KOps/s | 322.8513 KOps/s | |
test_membership_nested_leaf_last | 32.0910μs | 3.0661μs | 326.1513 KOps/s | 321.1120 KOps/s | |
test_membership_stacked_nested_last | 32.4900μs | 3.0719μs | 325.5339 KOps/s | 325.2223 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.9400μs | 3.0593μs | 326.8669 KOps/s | 329.1413 KOps/s | |
test_nested_getleaf | 28.7900μs | 6.1920μs | 161.4989 KOps/s | 159.4366 KOps/s | |
test_nested_get | 29.2810μs | 5.9578μs | 167.8482 KOps/s | 167.7123 KOps/s | |
test_stacked_getleaf | 33.1600μs | 6.0694μs | 164.7617 KOps/s | 159.7520 KOps/s | |
test_stacked_get | 29.3410μs | 5.7972μs | 172.4968 KOps/s | 170.1054 KOps/s | |
test_nested_getitemleaf | 62.0710μs | 6.3661μs | 157.0817 KOps/s | 155.2454 KOps/s | |
test_nested_getitem | 49.4910μs | 6.2114μs | 160.9948 KOps/s | 164.0988 KOps/s | |
test_stacked_getitemleaf | 38.2400μs | 6.3836μs | 156.6512 KOps/s | 156.7995 KOps/s | |
test_stacked_getitem | 36.9910μs | 5.9495μs | 168.0817 KOps/s | 165.1413 KOps/s | |
test_lock_nested | 9.5651ms | 0.3578ms | 2.7948 KOps/s | 2.8482 KOps/s | |
test_lock_stack_nested | 0.4258ms | 0.3508ms | 2.8504 KOps/s | 2.8142 KOps/s | |
test_unlock_nested | 0.4258ms | 0.2885ms | 3.4658 KOps/s | 3.3915 KOps/s | |
test_unlock_stack_nested | 0.4123ms | 0.2870ms | 3.4842 KOps/s | 3.4321 KOps/s | |
test_flatten_speed | 0.1018ms | 77.2202μs | 12.9500 KOps/s | 13.0273 KOps/s | |
test_unflatten_speed | 0.3564ms | 0.3196ms | 3.1294 KOps/s | 3.1252 KOps/s | |
test_common_ops | 0.7554ms | 0.5922ms | 1.6885 KOps/s | 1.6040 KOps/s | |
test_creation | 75.4810μs | 1.7296μs | 578.1582 KOps/s | 561.5687 KOps/s | |
test_creation_empty | 33.1810μs | 6.3962μs | 156.3418 KOps/s | 120.9956 KOps/s | |
test_creation_nested_1 | 29.5710μs | 8.0707μs | 123.9044 KOps/s | 100.7997 KOps/s | |
test_creation_nested_2 | 42.4600μs | 10.7748μs | 92.8089 KOps/s | 79.5918 KOps/s | |
test_clone | 59.7610μs | 11.1851μs | 89.4046 KOps/s | 89.5267 KOps/s | |
test_getitem[int] | 1.2915ms | 10.9406μs | 91.4030 KOps/s | 91.7394 KOps/s | |
test_getitem[slice_int] | 0.1110ms | 21.1361μs | 47.3124 KOps/s | 47.6812 KOps/s | |
test_getitem[range] | 0.1300ms | 38.9360μs | 25.6832 KOps/s | 25.6932 KOps/s | |
test_getitem[tuple] | 0.1609ms | 18.3667μs | 54.4464 KOps/s | 54.3137 KOps/s | |
test_getitem[list] | 0.1395ms | 34.0662μs | 29.3546 KOps/s | 29.5485 KOps/s | |
test_setitem_dim[int] | 45.6700μs | 20.6624μs | 48.3971 KOps/s | 49.4645 KOps/s | |
test_setitem_dim[slice_int] | 0.1662ms | 39.4251μs | 25.3646 KOps/s | 26.0867 KOps/s | |
test_setitem_dim[range] | 0.1125ms | 54.6055μs | 18.3132 KOps/s | 18.6041 KOps/s | |
test_setitem_dim[tuple] | 55.6610μs | 33.2396μs | 30.0846 KOps/s | 31.0287 KOps/s | |
test_setitem | 48.1410μs | 14.7211μs | 67.9297 KOps/s | 65.5128 KOps/s | |
test_set | 61.0010μs | 13.9039μs | 71.9220 KOps/s | 66.6417 KOps/s | |
test_set_shared | 0.6043ms | 0.1608ms | 6.2182 KOps/s | 6.2164 KOps/s | |
test_update | 0.4313ms | 17.7339μs | 56.3890 KOps/s | 50.5916 KOps/s | |
test_update_nested | 0.1243ms | 25.9674μs | 38.5098 KOps/s | 34.8765 KOps/s | |
test_update__nested | 0.4870ms | 26.6273μs | 37.5554 KOps/s | 38.0382 KOps/s | |
test_set_nested | 55.6310μs | 15.9660μs | 62.6329 KOps/s | 61.2132 KOps/s | |
test_set_nested_new | 54.4510μs | 17.5632μs | 56.9373 KOps/s | 53.2997 KOps/s | |
test_select | 0.1664ms | 28.6137μs | 34.9484 KOps/s | 33.8981 KOps/s | |
test_select_nested | 93.3610μs | 43.6817μs | 22.8929 KOps/s | 22.5659 KOps/s | |
test_exclude_nested | 99.7620μs | 62.6267μs | 15.9676 KOps/s | 15.8337 KOps/s | |
test_empty[True] | 0.6249ms | 0.2983ms | 3.3523 KOps/s | 3.3416 KOps/s | |
test_empty[False] | 3.7711μs | 0.8289μs | 1.2065 MOps/s | 1.1787 MOps/s | |
test_to | 87.6210μs | 55.7596μs | 17.9341 KOps/s | 17.5340 KOps/s | |
test_to_nonblocking | 0.1963ms | 47.3763μs | 21.1076 KOps/s | 21.0980 KOps/s | |
test_unbind_speed | 0.3424ms | 0.2417ms | 4.1365 KOps/s | 4.0088 KOps/s | |
test_unbind_speed_stack0 | 0.3459ms | 0.2422ms | 4.1280 KOps/s | 4.1088 KOps/s | |
test_unbind_speed_stack1 | 96.5239ms | 0.7530ms | 1.3280 KOps/s | 1.3266 KOps/s | |
test_split | 0.1030s | 1.6423ms | 608.8900 Ops/s | 617.1093 Ops/s | |
test_chunk | 99.7653ms | 1.6309ms | 613.1515 Ops/s | 613.1219 Ops/s | |
test_consolidate[False-None] | 3.1708ms | 2.7367ms | 365.4078 Ops/s | 331.4914 Ops/s | |
test_consolidate[default-None] | 2.2267ms | 1.7595ms | 568.3398 Ops/s | 575.4572 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9314ms | 1.7923ms | 557.9570 Ops/s | 559.9651 Ops/s | |
test_consolidate_njt[False-None] | 0.3034s | 8.8172ms | 113.4152 Ops/s | 151.3319 Ops/s | |
test_to[False-False-None] | 1.9218ms | 1.7614ms | 567.7372 Ops/s | 567.1101 Ops/s | |
test_to[True-False-None] | 1.5570ms | 1.3518ms | 739.7518 Ops/s | 719.9636 Ops/s | |
test_to[within-False-None] | 6.1138ms | 4.2426ms | 235.7062 Ops/s | 235.9333 Ops/s | |
test_to[True-default-None] | 5.5710ms | 5.2426ms | 190.7437 Ops/s | 189.6560 Ops/s | |
test_to_njt[False-False-None] | 7.4484ms | 7.0167ms | 142.5161 Ops/s | 142.7106 Ops/s | |
test_to_njt[True-False-None] | 5.9318ms | 5.5058ms | 181.6273 Ops/s | 181.5517 Ops/s | |
test_to_njt[within-False-None] | 12.9424ms | 12.4122ms | 80.5659 Ops/s | 82.7022 Ops/s | |
test_creation[device0] | 0.4643ms | 81.2336μs | 12.3102 KOps/s | 12.2128 KOps/s | |
test_creation_from_tensor | 0.4638ms | 84.4027μs | 11.8480 KOps/s | 11.8178 KOps/s | |
test_add_one[memmap_tensor0] | 0.2467ms | 7.3240μs | 136.5377 KOps/s | 139.2712 KOps/s | |
test_contiguous[memmap_tensor0] | 1.9885μs | 0.4248μs | 2.3542 MOps/s | 2.3347 MOps/s | |
test_stack[memmap_tensor0] | 37.3710μs | 4.7993μs | 208.3655 KOps/s | 212.0409 KOps/s | |
test_memmaptd_index | 1.7147ms | 0.2505ms | 3.9915 KOps/s | 3.9198 KOps/s | |
test_memmaptd_index_astensor | 0.4717ms | 0.3107ms | 3.2188 KOps/s | 3.1520 KOps/s | |
test_memmaptd_index_op | 0.7147ms | 0.5654ms | 1.7686 KOps/s | 1.6788 KOps/s | |
test_serialize_model | 0.1343s | 0.1319s | 7.5795 Ops/s | 7.5792 Ops/s | |
test_serialize_model_pickle | 1.3495s | 1.2166s | 0.8220 Ops/s | 0.8214 Ops/s | |
test_serialize_weights | 0.2850s | 0.1532s | 6.5278 Ops/s | 7.6055 Ops/s | |
test_serialize_weights_returnearly | 0.3476s | 54.7941ms | 18.2501 Ops/s | 14.2749 Ops/s | |
test_serialize_weights_pickle | 1.3481s | 1.2138s | 0.8238 Ops/s | 0.8228 Ops/s | |
test_reshape_pytree | 60.6910μs | 22.2799μs | 44.8836 KOps/s | 44.1596 KOps/s | |
test_reshape_td | 0.1308ms | 27.2198μs | 36.7379 KOps/s | 36.7038 KOps/s | |
test_view_pytree | 55.1310μs | 22.1121μs | 45.2240 KOps/s | 45.0930 KOps/s | |
test_view_td | 0.1500ms | 33.0254μs | 30.2797 KOps/s | 28.8137 KOps/s | |
test_unbind_pytree | 0.1415ms | 28.5733μs | 34.9977 KOps/s | 33.7797 KOps/s | |
test_unbind_td | 0.5975ms | 37.7578μs | 26.4846 KOps/s | 26.1456 KOps/s | |
test_split_pytree | 71.1120μs | 30.7708μs | 32.4984 KOps/s | 30.1901 KOps/s | |
test_split_td | 0.7613ms | 38.5483μs | 25.9415 KOps/s | 24.9693 KOps/s | |
test_add_pytree | 0.1898ms | 38.6612μs | 25.8658 KOps/s | 27.3793 KOps/s | |
test_add_td | 0.1833ms | 46.6572μs | 21.4329 KOps/s | 19.5859 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2698ms | 0.1215ms | 8.2331 KOps/s | 7.7774 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2829ms | 0.1335ms | 7.4920 KOps/s | 7.4385 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2515ms | 96.6407μs | 10.3476 KOps/s | 10.0485 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.0485ms | 0.1568ms | 6.3773 KOps/s | 6.3767 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1614ms | 24.9895μs | 40.0168 KOps/s | 41.1664 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 92.4820μs | 29.7506μs | 33.6127 KOps/s | 33.4732 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4604ms | 63.8585μs | 15.6596 KOps/s | 15.4433 KOps/s | |
test_compile_copy_nested[pytree-eager] | 99.8610μs | 48.8156μs | 20.4852 KOps/s | 19.9889 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3265ms | 0.1463ms | 6.8331 KOps/s | 6.9122 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3508ms | 0.2182ms | 4.5830 KOps/s | 4.6478 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2248ms | 98.6419μs | 10.1377 KOps/s | 10.0453 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2344ms | 57.3799μs | 17.4277 KOps/s | 17.7494 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2863ms | 0.1381ms | 7.2397 KOps/s | 7.2563 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6714ms | 0.5135ms | 1.9474 KOps/s | 1.9740 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3967ms | 0.2609ms | 3.8330 KOps/s | 3.8344 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2904ms | 0.1437ms | 6.9581 KOps/s | 6.9272 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2110ms | 68.7475μs | 14.5460 KOps/s | 13.8170 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2485ms | 0.1019ms | 9.8124 KOps/s | 9.9897 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5848ms | 0.4303ms | 2.3241 KOps/s | 2.3866 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2920ms | 0.1416ms | 7.0607 KOps/s | 7.3522 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1407ms | 18.8935μs | 52.9283 KOps/s | 55.0115 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.2023ms | 31.4816μs | 31.7646 KOps/s | 31.8702 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2052ms | 69.9340μs | 14.2992 KOps/s | 14.2658 KOps/s | |
test_compile_copy_flat[pytree-eager] | 89.5820μs | 52.2572μs | 19.1361 KOps/s | 19.2372 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6316ms | 0.3928ms | 2.5461 KOps/s | 2.2071 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.1520ms | 2.7520ms | 363.3743 Ops/s | 366.8237 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6079ms | 0.4376ms | 2.2852 KOps/s | 2.2363 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1818ms | 2.7908ms | 358.3217 Ops/s | 359.2773 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5303ms | 0.1172ms | 8.5322 KOps/s | 8.3697 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5737ms | 85.2890μs | 11.7248 KOps/s | 11.8105 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.3676ms | 0.1137ms | 8.7944 KOps/s | 9.0052 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2295ms | 72.7685μs | 13.7422 KOps/s | 13.9317 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2866ms | 0.1141ms | 8.7650 KOps/s | 8.8414 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2547ms | 72.6287μs | 13.7687 KOps/s | 13.8954 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2630ms | 0.1007ms | 9.9268 KOps/s | 9.7583 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2091ms | 17.4526μs | 57.2981 KOps/s | 54.6475 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2527ms | 97.9972μs | 10.2044 KOps/s | 10.2197 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1564ms | 16.3738μs | 61.0732 KOps/s | 61.2527 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2847ms | 0.1002ms | 9.9824 KOps/s | 10.1781 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 98.1010μs | 16.2447μs | 61.5584 KOps/s | 61.4512 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2694ms | 0.1025ms | 9.7589 KOps/s | 9.7641 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5611ms | 17.5046μs | 57.1280 KOps/s | 56.6133 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2479ms | 97.6170μs | 10.2441 KOps/s | 10.0783 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1450ms | 16.3281μs | 61.2443 KOps/s | 61.4417 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2525ms | 99.3600μs | 10.0644 KOps/s | 10.1573 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2029ms | 18.6245μs | 53.6928 KOps/s | 61.2818 KOps/s | |
test_mod_add[eager] | 0.2097ms | 38.9043μs | 25.7041 KOps/s | 23.6224 KOps/s | |
test_mod_add[compile] | 0.3060ms | 84.0015μs | 11.9045 KOps/s | 11.9298 KOps/s | |
test_mod_add[compile-overhead] | 0.3506ms | 0.1738ms | 5.7549 KOps/s | 5.5278 KOps/s | |
test_mod_wrap[eager] | 0.4374ms | 0.2592ms | 3.8585 KOps/s | 3.9341 KOps/s | |
test_mod_wrap[compile] | 0.4459ms | 0.2905ms | 3.4427 KOps/s | 3.3319 KOps/s | |
test_mod_wrap[compile-overhead] | 7.5776ms | 3.8062ms | 262.7282 Ops/s | 274.2315 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5592ms | 1.3776ms | 725.9224 Ops/s | 681.1760 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5578ms | 1.2954ms | 771.9525 Ops/s | 708.9248 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3932ms | 0.9431ms | 1.0603 KOps/s | 910.4340 Ops/s | |
test_seq_add[eager] | 0.2874ms | 0.1148ms | 8.7082 KOps/s | 8.2420 KOps/s | |
test_seq_add[compile] | 0.3968ms | 93.4585μs | 10.6999 KOps/s | 10.9071 KOps/s | |
test_seq_add[compile-overhead] | 0.2927ms | 0.1318ms | 7.5893 KOps/s | 7.4698 KOps/s | |
test_seq_wrap[eager] | 0.6275ms | 0.4194ms | 2.3846 KOps/s | 2.1908 KOps/s | |
test_seq_wrap[compile] | 0.4766ms | 0.3079ms | 3.2478 KOps/s | 3.1492 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3882ms | 0.2297ms | 4.3532 KOps/s | 4.2876 KOps/s | |
test_func_call_runtime[False-eager] | 0.9271ms | 0.7499ms | 1.3336 KOps/s | 1.3349 KOps/s | |
test_func_call_runtime[False-compile] | 0.9213ms | 0.7711ms | 1.2969 KOps/s | 1.2952 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5358ms | 0.3711ms | 2.6949 KOps/s | 2.6714 KOps/s | |
test_func_call_runtime[True-eager] | 1.1342ms | 0.9454ms | 1.0577 KOps/s | 1.0809 KOps/s | |
test_func_call_runtime[True-compile] | 0.9350ms | 0.7940ms | 1.2595 KOps/s | 1.2681 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5386ms | 0.3913ms | 2.5555 KOps/s | 2.5419 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9381ms | 0.7753ms | 1.2899 KOps/s | 1.3408 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9220ms | 0.7712ms | 1.2966 KOps/s | 1.2815 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4604ms | 0.3726ms | 2.6837 KOps/s | 2.6679 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2374ms | 1.0194ms | 980.9653 Ops/s | 980.8645 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1730ms | 1.0091ms | 990.9790 Ops/s | 989.4429 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2116ms | 1.0152ms | 985.0342 Ops/s | 984.4184 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5540ms | 2.1242ms | 470.7717 Ops/s | 466.4184 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0063ms | 0.8390ms | 1.1920 KOps/s | 1.1794 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5856ms | 0.4218ms | 2.3709 KOps/s | 2.3399 KOps/s | |
test_distributed | 3.2845ms | 0.1818ms | 5.4994 KOps/s | 8.6963 KOps/s | |
test_tdmodule | 54.9410μs | 19.4882μs | 51.3131 KOps/s | 46.5123 KOps/s | |
test_tdmodule_dispatch | 0.1401ms | 34.3144μs | 29.1423 KOps/s | 27.1484 KOps/s | |
test_tdseq | 46.4010μs | 19.1935μs | 52.1010 KOps/s | 47.3002 KOps/s | |
test_tdseq_dispatch | 57.1600μs | 36.4153μs | 27.4610 KOps/s | 25.5841 KOps/s | |
test_instantiation_functorch | 1.7517ms | 1.5785ms | 633.5169 Ops/s | 631.5276 Ops/s | |
test_exec_functorch | 0.2780ms | 0.1479ms | 6.7631 KOps/s | 6.7539 KOps/s | |
test_exec_functional_call | 0.2657ms | 0.1425ms | 7.0190 KOps/s | 7.0666 KOps/s | |
test_exec_td_decorator | 0.3719ms | 0.1903ms | 5.2561 KOps/s | 5.1607 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8333ms | 0.6859ms | 1.4579 KOps/s | 1.4420 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8863ms | 0.6868ms | 1.4560 KOps/s | 1.4432 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7419ms | 0.6001ms | 1.6664 KOps/s | 1.6063 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7447ms | 0.5999ms | 1.6669 KOps/s | 1.6214 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.3179ms | 19.4448ms | 51.4277 Ops/s | 51.0126 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7160ms | 19.4812ms | 51.3314 Ops/s | 51.0024 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5114ms | 19.3055ms | 51.7988 Ops/s | 50.9618 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.4781ms | 19.3152ms | 51.7728 Ops/s | 51.2824 Ops/s | |
test_to_module_speed[True] | 1.4796ms | 0.9638ms | 1.0375 KOps/s | 1.0188 KOps/s | |
test_to_module_speed[False] | 1.1003ms | 0.9395ms | 1.0644 KOps/s | 1.0384 KOps/s | |
test_tc_init | 0.2356ms | 33.8305μs | 29.5591 KOps/s | 29.0336 KOps/s | |
test_tc_init_nested | 0.1253ms | 67.3421μs | 14.8495 KOps/s | 14.3394 KOps/s | |
test_tc_first_layer_tensor | 18.7100μs | 0.7979μs | 1.2533 MOps/s | 1.2677 MOps/s | |
test_tc_first_layer_nontensor | 15.4200μs | 2.2071μs | 453.0915 KOps/s | 453.6203 KOps/s | |
test_tc_second_layer_tensor | 26.7210μs | 1.5091μs | 662.6644 KOps/s | 711.1069 KOps/s | |
test_tc_second_layer_nontensor | 33.9200μs | 2.9315μs | 341.1167 KOps/s | 344.3119 KOps/s | |
test_unbind | 0.2211s | 12.0244ms | 83.1644 Ops/s | 140.6864 Ops/s | |
test_full_like | 11.0672ms | 9.6502ms | 103.6244 Ops/s | 103.9989 Ops/s | |
test_zeros_like | 18.1688ms | 7.2696ms | 137.5587 Ops/s | 230.4263 Ops/s | |
test_ones_like | 4.7792ms | 4.3572ms | 229.5055 Ops/s | 230.1816 Ops/s | |
test_clone | 7.5677ms | 6.8643ms | 145.6822 Ops/s | 147.9662 Ops/s | |
test_squeeze | 60.4310μs | 10.0071μs | 99.9289 KOps/s | 95.5177 KOps/s | |
test_unsqueeze | 0.1940ms | 74.7073μs | 13.3856 KOps/s | 13.3199 KOps/s | |
test_split | 0.3754ms | 0.1625ms | 6.1546 KOps/s | 6.1652 KOps/s | |
test_permute | 0.3240ms | 0.1865ms | 5.3620 KOps/s | 5.3050 KOps/s | |
test_stack | 53.5320ms | 52.1570ms | 19.1729 Ops/s | 19.5009 Ops/s | |
test_cat | 52.1717ms | 51.3828ms | 19.4618 Ops/s | 19.6528 Ops/s |
mikaylagawarecki approved these changes on Mar 5, 2025
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):