[BugFix] _PASSTHROUGH_MEMO for passthrough tensorclass #1231
Merged
vmoens added a commit that referenced this pull request on Feb 24, 2025
ghstack-source-id: 20e6f797afe30a4bd8f45fcd4e3b9a8f5af4bb4d Pull Request resolved: #1231
vmoens added a commit that referenced this pull request on Feb 24, 2025
ghstack-source-id: 0bfbfc9f6700f1165fcfd6b38f65fa4fd806be80 Pull Request resolved: #1231
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.2090μs | 21.0452μs | 47.5169 KOps/s | 49.3762 KOps/s | |
test_plain_set_stack_nested | 84.4190μs | 20.9759μs | 47.6737 KOps/s | 47.6987 KOps/s | |
test_plain_set_nested_inplace | 49.4830μs | 22.5617μs | 44.3228 KOps/s | 44.4651 KOps/s | |
test_plain_set_stack_nested_inplace | 85.6910μs | 22.9341μs | 43.6033 KOps/s | 44.9390 KOps/s | |
test_items | 38.6830μs | 4.2249μs | 236.6896 KOps/s | 235.3418 KOps/s | |
test_items_nested | 0.5920ms | 0.4131ms | 2.4209 KOps/s | 2.4636 KOps/s | |
test_items_nested_locked | 0.7647ms | 0.4087ms | 2.4468 KOps/s | 2.4765 KOps/s | |
test_items_nested_leaf | 0.2086ms | 78.5919μs | 12.7240 KOps/s | 13.0317 KOps/s | |
test_items_stack_nested | 0.8268ms | 0.4104ms | 2.4367 KOps/s | 2.4657 KOps/s | |
test_items_stack_nested_leaf | 0.1484ms | 78.3256μs | 12.7672 KOps/s | 12.9147 KOps/s | |
test_items_stack_nested_locked | 0.5526ms | 0.4096ms | 2.4415 KOps/s | 2.4346 KOps/s | |
test_keys | 28.3340μs | 3.5972μs | 277.9941 KOps/s | 282.8771 KOps/s | |
test_keys_nested | 0.2686ms | 0.1666ms | 6.0010 KOps/s | 6.0553 KOps/s | |
test_keys_nested_locked | 2.8613ms | 0.1722ms | 5.8085 KOps/s | 5.8531 KOps/s | |
test_keys_nested_leaf | 0.2369ms | 0.1451ms | 6.8922 KOps/s | 6.9563 KOps/s | |
test_keys_stack_nested | 0.3152ms | 0.1668ms | 5.9959 KOps/s | 6.0848 KOps/s | |
test_keys_stack_nested_leaf | 0.2319ms | 0.1447ms | 6.9125 KOps/s | 7.0270 KOps/s | |
test_keys_stack_nested_locked | 0.2463ms | 0.1745ms | 5.7311 KOps/s | 5.9025 KOps/s | |
test_values | 11.1450μs | 1.0626μs | 941.0859 KOps/s | 984.9276 KOps/s | |
test_values_nested | 0.1271ms | 63.8692μs | 15.6570 KOps/s | 16.2091 KOps/s | |
test_values_nested_locked | 0.1478ms | 64.6662μs | 15.4640 KOps/s | 16.1170 KOps/s | |
test_values_nested_leaf | 0.1570ms | 73.2387μs | 13.6540 KOps/s | 14.1280 KOps/s | |
test_values_stack_nested | 0.1346ms | 64.3068μs | 15.5505 KOps/s | 15.6120 KOps/s | |
test_values_stack_nested_leaf | 0.1460ms | 72.9270μs | 13.7123 KOps/s | 14.0289 KOps/s | |
test_values_stack_nested_locked | 0.1456ms | 64.0185μs | 15.6205 KOps/s | 16.1190 KOps/s | |
test_membership | 15.8390μs | 0.9259μs | 1.0800 MOps/s | 1.1573 MOps/s | |
test_membership_nested | 25.9590μs | 3.0327μs | 329.7413 KOps/s | 347.8855 KOps/s | |
test_membership_nested_leaf | 67.5060μs | 2.9550μs | 338.4092 KOps/s | 345.8257 KOps/s | |
test_membership_stacked_nested | 35.1160μs | 2.9984μs | 333.5165 KOps/s | 346.2982 KOps/s | |
test_membership_stacked_nested_leaf | 49.0560μs | 2.8576μs | 349.9475 KOps/s | 344.5824 KOps/s | |
test_membership_nested_last | 52.4890μs | 4.4175μs | 226.3718 KOps/s | 229.7431 KOps/s | |
test_membership_nested_leaf_last | 29.0250μs | 4.3041μs | 232.3376 KOps/s | 227.7414 KOps/s | |
test_membership_stacked_nested_last | 24.8670μs | 4.4760μs | 223.4147 KOps/s | 230.4807 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.0050μs | 4.3239μs | 231.2743 KOps/s | 227.1677 KOps/s | |
test_nested_getleaf | 64.0970μs | 10.4943μs | 95.2897 KOps/s | 94.9816 KOps/s | |
test_nested_get | 48.9920μs | 9.8622μs | 101.3976 KOps/s | 100.7958 KOps/s | |
test_stacked_getleaf | 33.9440μs | 10.5966μs | 94.3700 KOps/s | 95.1953 KOps/s | |
test_stacked_get | 37.5610μs | 10.2217μs | 97.8313 KOps/s | 98.7389 KOps/s | |
test_nested_getitemleaf | 44.7940μs | 11.2375μs | 88.9876 KOps/s | 87.7183 KOps/s | |
test_nested_getitem | 60.5940μs | 10.8212μs | 92.4111 KOps/s | 92.6202 KOps/s | |
test_stacked_getitemleaf | 53.6410μs | 11.1529μs | 89.6631 KOps/s | 88.3512 KOps/s | |
test_stacked_getitem | 38.0820μs | 10.7405μs | 93.1059 KOps/s | 93.6451 KOps/s | |
test_lock_nested | 0.7321ms | 0.4183ms | 2.3908 KOps/s | 2.4456 KOps/s | |
test_lock_stack_nested | 0.6962ms | 0.4291ms | 2.3304 KOps/s | 2.3646 KOps/s | |
test_unlock_nested | 0.5227ms | 0.3402ms | 2.9398 KOps/s | 2.9697 KOps/s | |
test_unlock_stack_nested | 0.4553ms | 0.3440ms | 2.9068 KOps/s | 2.9053 KOps/s | |
test_flatten_speed | 0.1930ms | 99.4073μs | 10.0596 KOps/s | 9.8910 KOps/s | |
test_unflatten_speed | 0.5990ms | 0.5088ms | 1.9654 KOps/s | 1.8866 KOps/s | |
test_common_ops | 1.3485ms | 0.8125ms | 1.2308 KOps/s | 1.2369 KOps/s | |
test_creation | 49.6540μs | 2.4773μs | 403.6594 KOps/s | 390.5947 KOps/s | |
test_creation_empty | 43.5820μs | 12.7938μs | 78.1629 KOps/s | 84.9393 KOps/s | |
test_creation_nested_1 | 44.7240μs | 15.6198μs | 64.0212 KOps/s | 69.1661 KOps/s | |
test_creation_nested_2 | 52.0280μs | 20.2665μs | 49.3425 KOps/s | 52.2632 KOps/s | |
test_clone | 83.5770μs | 13.2491μs | 75.4770 KOps/s | 72.4944 KOps/s | |
test_getitem[int] | 1.2785ms | 12.8804μs | 77.6376 KOps/s | 79.7445 KOps/s | |
test_getitem[slice_int] | 0.1354ms | 24.3660μs | 41.0408 KOps/s | 41.4256 KOps/s | |
test_getitem[range] | 0.1679ms | 49.3091μs | 20.2803 KOps/s | 18.9164 KOps/s | |
test_getitem[tuple] | 0.1324ms | 19.7607μs | 50.6055 KOps/s | 49.3418 KOps/s | |
test_getitem[list] | 0.1780ms | 45.7107μs | 21.8767 KOps/s | 20.9861 KOps/s | |
test_setitem_dim[int] | 46.1870μs | 25.0008μs | 39.9987 KOps/s | 39.4426 KOps/s | |
test_setitem_dim[slice_int] | 75.5320μs | 51.1797μs | 19.5390 KOps/s | 19.1939 KOps/s | |
test_setitem_dim[range] | 0.1330ms | 76.0866μs | 13.1429 KOps/s | 12.9201 KOps/s | |
test_setitem_dim[tuple] | 72.8270μs | 40.2747μs | 24.8295 KOps/s | 24.8485 KOps/s | |
test_setitem | 0.1417ms | 20.5686μs | 48.6178 KOps/s | 48.3601 KOps/s | |
test_set | 92.0630μs | 20.1596μs | 49.6042 KOps/s | 49.7350 KOps/s | |
test_set_shared | 5.2948ms | 0.1857ms | 5.3862 KOps/s | 5.2656 KOps/s | |
test_update | 0.1535ms | 23.5321μs | 42.4951 KOps/s | 43.4889 KOps/s | |
test_update_nested | 0.1410ms | 34.6457μs | 28.8636 KOps/s | 29.2521 KOps/s | |
test_update__nested | 0.6510ms | 32.5463μs | 30.7255 KOps/s | 28.0114 KOps/s | |
test_set_nested | 0.1521ms | 23.2432μs | 43.0234 KOps/s | 44.4850 KOps/s | |
test_set_nested_new | 89.6080μs | 26.7771μs | 37.3454 KOps/s | 37.3104 KOps/s | |
test_select | 86.9530μs | 43.1368μs | 23.1821 KOps/s | 23.6335 KOps/s | |
test_select_nested | 0.1148ms | 63.5673μs | 15.7314 KOps/s | 15.9366 KOps/s | |
test_exclude_nested | 0.1559ms | 81.4935μs | 12.2709 KOps/s | 12.3852 KOps/s | |
test_empty[True] | 0.5610ms | 0.4071ms | 2.4561 KOps/s | 2.4540 KOps/s | |
test_empty[False] | 13.3750μs | 1.4019μs | 713.3428 KOps/s | 725.5836 KOps/s | |
test_unbind_speed | 1.8861ms | 0.2722ms | 3.6740 KOps/s | 3.6998 KOps/s | |
test_unbind_speed_stack0 | 0.3579ms | 0.2701ms | 3.7026 KOps/s | 3.7373 KOps/s | |
test_unbind_speed_stack1 | 0.1338s | 0.7651ms | 1.3070 KOps/s | 1.1261 KOps/s | |
test_split | 0.1339s | 1.8252ms | 547.8953 Ops/s | 639.5275 Ops/s | |
test_chunk | 0.1457s | 1.8283ms | 546.9427 Ops/s | 547.1949 Ops/s | |
test_consolidate_njt[False-None] | 8.6810ms | 8.3276ms | 120.0826 Ops/s | 101.3909 Ops/s | |
test_creation[device0] | 4.6533ms | 95.2077μs | 10.5034 KOps/s | 10.5207 KOps/s | |
test_creation_from_tensor | 0.3263ms | 96.3441μs | 10.3795 KOps/s | 9.9934 KOps/s | |
test_add_one[memmap_tensor0] | 84.0880μs | 4.8545μs | 205.9934 KOps/s | 203.9354 KOps/s | |
test_contiguous[memmap_tensor0] | 10.8700μs | 0.5170μs | 1.9341 MOps/s | 1.9882 MOps/s | |
test_stack[memmap_tensor0] | 26.7100μs | 3.6004μs | 277.7431 KOps/s | 296.5546 KOps/s | |
test_memmaptd_index | 1.4028ms | 0.2299ms | 4.3488 KOps/s | 4.2704 KOps/s | |
test_memmaptd_index_astensor | 0.7257ms | 0.3170ms | 3.1548 KOps/s | 3.1127 KOps/s | |
test_memmaptd_index_op | 0.9716ms | 0.6033ms | 1.6576 KOps/s | 1.6801 KOps/s | |
test_serialize_model | 0.2651s | 0.1453s | 6.8821 Ops/s | 7.9163 Ops/s | |
test_serialize_model_pickle | 0.4511s | 0.3976s | 2.5152 Ops/s | 2.3616 Ops/s | |
test_serialize_weights | 0.1348s | 0.1280s | 7.8138 Ops/s | 7.6883 Ops/s | |
test_serialize_weights_returnearly | 0.1879s | 0.1709s | 5.8525 Ops/s | 5.7411 Ops/s | |
test_serialize_weights_pickle | 0.4521s | 0.4086s | 2.4474 Ops/s | 2.4988 Ops/s | |
test_serialize_weights_filesystem | 0.1823s | 0.1588s | 6.2960 Ops/s | 6.3282 Ops/s | |
test_serialize_model_filesystem | 0.1747s | 0.1616s | 6.1894 Ops/s | 5.8527 Ops/s | |
test_reshape_pytree | 53.4700μs | 25.5650μs | 39.1159 KOps/s | 35.5397 KOps/s | |
test_reshape_td | 77.4460μs | 33.0927μs | 30.2181 KOps/s | 31.1868 KOps/s | |
test_view_pytree | 0.1165ms | 26.7742μs | 37.3495 KOps/s | 38.3530 KOps/s | |
test_view_td | 87.3040μs | 40.0598μs | 24.9627 KOps/s | 25.4435 KOps/s | |
test_unbind_pytree | 63.1690μs | 28.7022μs | 34.8405 KOps/s | 33.5707 KOps/s | |
test_unbind_td | 0.3615ms | 42.0822μs | 23.7630 KOps/s | 25.5185 KOps/s | |
test_split_pytree | 81.2520μs | 28.6733μs | 34.8756 KOps/s | 34.7307 KOps/s | |
test_split_td | 0.5258ms | 45.9478μs | 21.7638 KOps/s | 22.2104 KOps/s | |
test_add_pytree | 87.4240μs | 36.4592μs | 27.4280 KOps/s | 27.6046 KOps/s | |
test_add_td | 0.1351ms | 57.2883μs | 17.4556 KOps/s | 17.1320 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1619ms | 70.4489μs | 14.1947 KOps/s | 14.1547 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4018ms | 0.1763ms | 5.6723 KOps/s | 5.6395 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1194ms | 47.1115μs | 21.2262 KOps/s | 21.2217 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2363ms | 0.1237ms | 8.0830 KOps/s | 8.2148 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 95.0990μs | 30.2773μs | 33.0280 KOps/s | 35.0702 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1151ms | 61.1140μs | 16.3629 KOps/s | 16.8100 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1784ms | 83.4628μs | 11.9814 KOps/s | 12.4490 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1282ms | 69.7346μs | 14.3401 KOps/s | 14.8027 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2164ms | 0.1154ms | 8.6671 KOps/s | 8.8913 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3244ms | 0.2205ms | 4.5350 KOps/s | 4.5061 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1161ms | 50.2173μs | 19.9135 KOps/s | 20.8161 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2473ms | 69.1762μs | 14.4558 KOps/s | 14.7806 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2260ms | 0.1071ms | 9.3401 KOps/s | 9.7552 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4173ms | 0.2096ms | 4.7705 KOps/s | 4.6853 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3394ms | 0.2378ms | 4.2051 KOps/s | 4.2132 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2087ms | 0.1149ms | 8.7046 KOps/s | 8.9404 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1460ms | 65.1199μs | 15.3563 KOps/s | 15.9984 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1040ms | 50.2513μs | 19.9000 KOps/s | 20.1383 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2590ms | 0.1624ms | 6.1563 KOps/s | 6.0630 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1762ms | 0.1071ms | 9.3377 KOps/s | 9.5226 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1019ms | 22.6322μs | 44.1848 KOps/s | 44.5568 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1392ms | 69.3440μs | 14.4209 KOps/s | 14.7914 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1640ms | 82.8661μs | 12.0677 KOps/s | 11.9368 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1324ms | 68.6213μs | 14.5727 KOps/s | 14.3533 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3493ms | 0.2205ms | 4.5353 KOps/s | 4.6536 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7476ms | 1.4123ms | 708.0547 Ops/s | 728.8034 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3076ms | 0.2150ms | 4.6521 KOps/s | 4.8367 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.1077ms | 0.8444ms | 1.1843 KOps/s | 1.2023 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.6218ms | 0.4640ms | 2.1553 KOps/s | 2.2391 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.7263ms | 2.8039ms | 356.6493 Ops/s | 372.9809 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1488ms | 40.0100μs | 24.9938 KOps/s | 25.9530 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0200ms | 33.3702μs | 29.9669 KOps/s | 30.6405 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 97.0720μs | 31.6563μs | 31.5893 KOps/s | 32.2200 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1084ms | 23.5255μs | 42.5071 KOps/s | 43.3916 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1130ms | 33.4641μs | 29.8827 KOps/s | 30.5832 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 63.4890μs | 22.7677μs | 43.9218 KOps/s | 43.1932 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1162ms | 54.2298μs | 18.4401 KOps/s | 18.8890 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4004ms | 20.5951μs | 48.5553 KOps/s | 48.8054 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1062ms | 47.3615μs | 21.1142 KOps/s | 21.6377 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 81.4840μs | 18.5614μs | 53.8753 KOps/s | 53.6727 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1439ms | 47.5095μs | 21.0484 KOps/s | 21.3784 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 87.1930μs | 18.9727μs | 52.7074 KOps/s | 50.6956 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1394ms | 56.3316μs | 17.7520 KOps/s | 18.3793 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1436ms | 19.4814μs | 51.3311 KOps/s | 49.7575 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1335ms | 48.2112μs | 20.7421 KOps/s | 21.0795 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 64.6720μs | 18.5929μs | 53.7838 KOps/s | 53.1513 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1126ms | 46.9638μs | 21.2930 KOps/s | 21.2631 KOps/s | |
test_compile_indexing[int-pytree-eager] | 89.6080μs | 19.1920μs | 52.1050 KOps/s | 53.6402 KOps/s | |
test_mod_add[eager] | 99.6370μs | 37.2548μs | 26.8421 KOps/s | 27.6797 KOps/s | |
test_mod_add[compile] | 0.1486ms | 66.8963μs | 14.9485 KOps/s | 15.6244 KOps/s | |
test_mod_add[compile-overhead] | 0.1537ms | 64.2478μs | 15.5647 KOps/s | 15.5918 KOps/s | |
test_mod_wrap[eager] | 0.3455ms | 0.2211ms | 4.5222 KOps/s | 4.4092 KOps/s | |
test_mod_wrap[compile] | 2.1238ms | 0.2264ms | 4.4177 KOps/s | 4.3001 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3817ms | 0.2303ms | 4.3420 KOps/s | 4.2971 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.9063ms | 14.1339ms | 70.7521 Ops/s | 78.7150 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.4171ms | 12.6860ms | 78.8271 Ops/s | 71.6513 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 16.2723ms | 12.6032ms | 79.3452 Ops/s | 78.1036 Ops/s | |
test_seq_add[eager] | 0.2415ms | 0.1179ms | 8.4827 KOps/s | 8.0832 KOps/s | |
test_seq_add[compile] | 0.1346ms | 76.9590μs | 12.9939 KOps/s | 12.4177 KOps/s | |
test_seq_add[compile-overhead] | 0.1819ms | 75.5170μs | 13.2420 KOps/s | 12.9095 KOps/s | |
test_seq_wrap[eager] | 0.7683ms | 0.4591ms | 2.1784 KOps/s | 2.0808 KOps/s | |
test_seq_wrap[compile] | 0.4380ms | 0.2395ms | 4.1751 KOps/s | 3.9246 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4598ms | 0.2401ms | 4.1654 KOps/s | 3.7452 KOps/s | |
test_func_call_runtime[False-eager] | 0.6861ms | 0.5443ms | 1.8371 KOps/s | 1.7735 KOps/s | |
test_func_call_runtime[False-compile] | 0.6100ms | 0.4515ms | 2.2148 KOps/s | 2.1605 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6515ms | 0.4469ms | 2.2378 KOps/s | 2.1823 KOps/s | |
test_func_call_runtime[True-eager] | 1.0973ms | 0.7599ms | 1.3159 KOps/s | 1.2804 KOps/s | |
test_func_call_runtime[True-compile] | 0.6659ms | 0.4756ms | 2.1025 KOps/s | 2.0906 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7267ms | 0.4780ms | 2.0920 KOps/s | 2.0125 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9398ms | 0.5504ms | 1.8167 KOps/s | 1.8068 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6174ms | 0.4493ms | 2.2255 KOps/s | 2.1720 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7333ms | 0.4534ms | 2.2056 KOps/s | 2.1541 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4935ms | 0.9111ms | 1.0976 KOps/s | 1.0734 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2639ms | 0.7961ms | 1.2561 KOps/s | 1.2169 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4830ms | 0.8148ms | 1.2273 KOps/s | 1.2284 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6346ms | 1.9694ms | 507.7778 Ops/s | 505.5961 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0489ms | 0.5518ms | 1.8122 KOps/s | 1.7577 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9299ms | 0.5535ms | 1.8068 KOps/s | 1.7677 KOps/s | |
test_distributed | 0.8577ms | 0.1248ms | 8.0100 KOps/s | 7.6582 KOps/s | |
test_tdmodule | 52.0670μs | 28.1916μs | 35.4716 KOps/s | 33.5914 KOps/s | |
test_tdmodule_dispatch | 93.8670μs | 51.3967μs | 19.4565 KOps/s | 19.4296 KOps/s | |
test_tdseq | 83.2380μs | 30.8061μs | 32.4611 KOps/s | 33.8466 KOps/s | |
test_tdseq_dispatch | 83.8280μs | 57.8281μs | 17.2926 KOps/s | 18.1625 KOps/s | |
test_instantiation_functorch | 2.4054ms | 1.5589ms | 641.4612 Ops/s | 646.2565 Ops/s | |
test_exec_functorch | 0.3126ms | 0.1796ms | 5.5677 KOps/s | 5.6112 KOps/s | |
test_exec_functional_call | 0.3200ms | 0.1721ms | 5.8092 KOps/s | 5.6824 KOps/s | |
test_exec_td_decorator | 0.5982ms | 0.2310ms | 4.3299 KOps/s | 4.1077 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9894ms | 0.6748ms | 1.4819 KOps/s | 1.4763 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9048ms | 0.6701ms | 1.4922 KOps/s | 1.4866 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9416ms | 0.5357ms | 1.8668 KOps/s | 1.8350 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8870ms | 0.5328ms | 1.8770 KOps/s | 1.8370 KOps/s | |
test_to_module_speed[True] | 1.8245ms | 1.3735ms | 728.0706 Ops/s | 739.2391 Ops/s | |
test_to_module_speed[False] | 2.4649ms | 1.3763ms | 726.5743 Ops/s | 746.1946 Ops/s | |
test_tc_init | 81.2630μs | 48.5851μs | 20.5824 KOps/s | 21.3893 KOps/s | |
test_tc_init_nested | 0.1979ms | 97.6803μs | 10.2375 KOps/s | 10.6106 KOps/s | |
test_tc_first_layer_tensor | 15.2390μs | 1.5888μs | 629.3875 KOps/s | 655.1568 KOps/s | |
test_tc_first_layer_nontensor | 27.7720μs | 4.8879μs | 204.5880 KOps/s | 207.0078 KOps/s | |
test_tc_second_layer_tensor | 22.5830μs | 2.9323μs | 341.0338 KOps/s | 340.0552 KOps/s | |
test_tc_second_layer_nontensor | 53.1300μs | 6.2203μs | 160.7627 KOps/s | 164.2973 KOps/s | |
test_unbind | 0.2934s | 16.0596ms | 62.2680 Ops/s | 57.6449 Ops/s | |
test_full_like | 15.5259ms | 11.7458ms | 85.1368 Ops/s | 71.4591 Ops/s | |
test_zeros_like | 6.2247ms | 4.5161ms | 221.4286 Ops/s | 188.0807 Ops/s | |
test_ones_like | 5.9427ms | 4.3910ms | 227.7361 Ops/s | 135.6839 Ops/s | |
test_clone | 11.5144ms | 8.5893ms | 116.4241 Ops/s | 81.9101 Ops/s | |
test_squeeze | 65.6130μs | 13.0823μs | 76.4392 KOps/s | 81.2915 KOps/s | |
test_unsqueeze | 0.3335ms | 99.0908μs | 10.0918 KOps/s | 10.4715 KOps/s | |
test_split | 0.3534ms | 0.1971ms | 5.0733 KOps/s | 5.0658 KOps/s | |
test_permute | 0.4394ms | 0.2134ms | 4.6850 KOps/s | 4.9171 KOps/s | |
test_stack | 47.7760ms | 36.0412ms | 27.7460 Ops/s | 27.0646 Ops/s | |
test_cat | 42.7895ms | 34.6013ms | 28.9007 Ops/s | 27.6392 Ops/s |
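
The Change column is empty in the table above, so the difference between this PR and repo HEAD has to be read off the two Ops columns by hand. A minimal sketch of that arithmetic follows. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling, and the three rows are copied from the table. Each figure comes from a single CI run, so small differences are likely noise.

```python
# Multipliers for the throughput units that appear in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.5169 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Throughput change of the PR relative to repo HEAD, in percent.

    Positive means the PR run was faster than HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_plain_set_nested": ("47.5169 KOps/s", "49.3762 KOps/s"),
    "test_membership": ("1.0800 MOps/s", "1.1573 MOps/s"),
    "test_ones_like": ("227.7361 Ops/s", "135.6839 Ops/s"),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1f}%")
```

Normalizing to operations per second first matters because the table mixes Ops/s, KOps/s and MOps/s, so raw numbers from different rows are not directly comparable.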
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Stack from ghstack (oldest at bottom):