[Bugfix] Allow non Module in TensorDictModule when method is passed #1242
Merged: mikaylagawarecki merged 4 commits into gh/mikaylagawarecki/1/base from gh/mikaylagawarecki/1/head on Feb 28, 2025
[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request on Feb 28, 2025
ghstack-source-id: c73d86f924dd54060e9dfcf5b344a296deb1f6bf Pull Request resolved: #1242
…is passed" relax this assert as [LLM](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/llm.py#L53) in vLLM does not subclass nn.Module [ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request on Feb 28, 2025
ghstack-source-id: 797b6ef132ef2481c7addb968b3eef8f9b3e49bd Pull Request resolved: #1242
…is passed" relax this check as [LLM](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/llm.py#L53) in vLLM does not subclass nn.Module [ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request on Feb 28, 2025
ghstack-source-id: 9788c6b55172fc24bde2315c32d54ef694d44b5c Pull Request resolved: #1242
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 39.4840μs | 20.6973μs | 48.3155 KOps/s | 48.3898 KOps/s | |
test_plain_set_stack_nested | 84.2470μs | 20.9794μs | 47.6658 KOps/s | 47.8180 KOps/s | |
test_plain_set_nested_inplace | 47.7490μs | 22.6997μs | 44.0535 KOps/s | 43.4201 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1307ms | 22.7573μs | 43.9420 KOps/s | 43.4835 KOps/s | |
test_items | 0.1576ms | 4.3877μs | 227.9123 KOps/s | 242.7472 KOps/s | |
test_items_nested | 0.8176ms | 0.4072ms | 2.4559 KOps/s | 2.4012 KOps/s | |
test_items_nested_locked | 0.5772ms | 0.4091ms | 2.4442 KOps/s | 2.3891 KOps/s | |
test_items_nested_leaf | 0.1420ms | 77.1848μs | 12.9559 KOps/s | 12.7215 KOps/s | |
test_items_stack_nested | 0.5766ms | 0.4084ms | 2.4485 KOps/s | 2.3434 KOps/s | |
test_items_stack_nested_leaf | 0.1530ms | 77.4556μs | 12.9106 KOps/s | 12.7196 KOps/s | |
test_items_stack_nested_locked | 0.9809ms | 0.4112ms | 2.4319 KOps/s | 2.4175 KOps/s | |
test_keys | 38.8220μs | 3.5892μs | 278.6145 KOps/s | 288.6873 KOps/s | |
test_keys_nested | 0.3254ms | 0.1642ms | 6.0913 KOps/s | 6.0212 KOps/s | |
test_keys_nested_locked | 1.7546ms | 0.1713ms | 5.8393 KOps/s | 5.8068 KOps/s | |
test_keys_nested_leaf | 0.2445ms | 0.1435ms | 6.9682 KOps/s | 6.9240 KOps/s | |
test_keys_stack_nested | 0.2491ms | 0.1648ms | 6.0674 KOps/s | 6.0954 KOps/s | |
test_keys_stack_nested_leaf | 0.2318ms | 0.1436ms | 6.9617 KOps/s | 6.9199 KOps/s | |
test_keys_stack_nested_locked | 0.2603ms | 0.1712ms | 5.8428 KOps/s | 5.8157 KOps/s | |
test_values | 10.9284μs | 1.0413μs | 960.2971 KOps/s | 952.8237 KOps/s | |
test_values_nested | 0.1127ms | 63.0468μs | 15.8612 KOps/s | 16.0291 KOps/s | |
test_values_nested_locked | 0.1425ms | 63.1960μs | 15.8238 KOps/s | 15.5449 KOps/s | |
test_values_nested_leaf | 0.1508ms | 72.9249μs | 13.7127 KOps/s | 13.9434 KOps/s | |
test_values_stack_nested | 0.1233ms | 63.4180μs | 15.7684 KOps/s | 15.9661 KOps/s | |
test_values_stack_nested_leaf | 0.1304ms | 73.1507μs | 13.6704 KOps/s | 13.9139 KOps/s | |
test_values_stack_nested_locked | 0.1134ms | 62.7849μs | 15.9274 KOps/s | 15.9026 KOps/s | |
test_membership | 40.9560μs | 0.8914μs | 1.1218 MOps/s | 1.4324 MOps/s | |
test_membership_nested | 44.7740μs | 2.9346μs | 340.7579 KOps/s | 341.1142 KOps/s | |
test_membership_nested_leaf | 58.3190μs | 2.9577μs | 338.0983 KOps/s | 351.1714 KOps/s | |
test_membership_stacked_nested | 27.0000μs | 2.9253μs | 341.8453 KOps/s | 348.3559 KOps/s | |
test_membership_stacked_nested_leaf | 46.4070μs | 2.9619μs | 337.6219 KOps/s | 347.7462 KOps/s | |
test_membership_nested_last | 35.1960μs | 4.4635μs | 224.0414 KOps/s | 230.9398 KOps/s | |
test_membership_nested_leaf_last | 45.5550μs | 4.5109μs | 221.6860 KOps/s | 230.6347 KOps/s | |
test_membership_stacked_nested_last | 17.4320μs | 4.4704μs | 223.6940 KOps/s | 228.0520 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.8490μs | 4.4412μs | 225.1630 KOps/s | 231.0488 KOps/s | |
test_nested_getleaf | 32.5710μs | 10.8662μs | 92.0283 KOps/s | 92.1369 KOps/s | |
test_nested_get | 49.5920μs | 10.3061μs | 97.0296 KOps/s | 97.3916 KOps/s | |
test_stacked_getleaf | 50.6250μs | 10.8141μs | 92.4720 KOps/s | 93.5993 KOps/s | |
test_stacked_get | 50.0630μs | 10.0608μs | 99.3953 KOps/s | 96.9249 KOps/s | |
test_nested_getitemleaf | 52.8590μs | 11.4836μs | 87.0811 KOps/s | 88.2054 KOps/s | |
test_nested_getitem | 39.9140μs | 10.9520μs | 91.3078 KOps/s | 92.1901 KOps/s | |
test_stacked_getitemleaf | 47.5580μs | 11.5341μs | 86.6996 KOps/s | 87.8900 KOps/s | |
test_stacked_getitem | 51.8770μs | 10.9609μs | 91.2334 KOps/s | 91.4731 KOps/s | |
test_lock_nested | 0.7369ms | 0.4159ms | 2.4047 KOps/s | 2.4040 KOps/s | |
test_lock_stack_nested | 0.8525ms | 0.4249ms | 2.3537 KOps/s | 2.2971 KOps/s | |
test_unlock_nested | 0.4793ms | 0.3406ms | 2.9359 KOps/s | 2.9128 KOps/s | |
test_unlock_stack_nested | 0.7019ms | 0.3454ms | 2.8948 KOps/s | 2.8575 KOps/s | |
test_flatten_speed | 0.1912ms | 0.1014ms | 9.8608 KOps/s | 9.8251 KOps/s | |
test_unflatten_speed | 0.6263ms | 0.5302ms | 1.8860 KOps/s | 1.9183 KOps/s | |
test_common_ops | 0.9965ms | 0.8233ms | 1.2147 KOps/s | 1.1931 KOps/s | |
test_creation | 23.4840μs | 2.4995μs | 400.0761 KOps/s | 407.0284 KOps/s | |
test_creation_empty | 30.5770μs | 12.5170μs | 79.8912 KOps/s | 82.1345 KOps/s | |
test_creation_nested_1 | 48.8910μs | 15.6902μs | 63.7339 KOps/s | 65.3342 KOps/s | |
test_creation_nested_2 | 66.5740μs | 20.3030μs | 49.2538 KOps/s | 49.7384 KOps/s | |
test_clone | 85.4190μs | 13.8601μs | 72.1497 KOps/s | 69.1864 KOps/s | |
test_getitem[int] | 0.8512ms | 12.9701μs | 77.1002 KOps/s | 73.1661 KOps/s | |
test_getitem[slice_int] | 0.1304ms | 24.8971μs | 40.1653 KOps/s | 40.6564 KOps/s | |
test_getitem[range] | 0.1668ms | 51.2457μs | 19.5138 KOps/s | 19.3747 KOps/s | |
test_getitem[tuple] | 0.1238ms | 20.3794μs | 49.0691 KOps/s | 48.2062 KOps/s | |
test_getitem[list] | 0.1587ms | 46.9183μs | 21.3137 KOps/s | 21.2680 KOps/s | |
test_setitem_dim[int] | 80.2490μs | 25.6508μs | 38.9851 KOps/s | 38.5098 KOps/s | |
test_setitem_dim[slice_int] | 0.1130ms | 51.9713μs | 19.2414 KOps/s | 19.4859 KOps/s | |
test_setitem_dim[range] | 0.1395ms | 76.2449μs | 13.1156 KOps/s | 13.0028 KOps/s | |
test_setitem_dim[tuple] | 81.1910μs | 40.7243μs | 24.5553 KOps/s | 24.0970 KOps/s | |
test_setitem | 62.8470μs | 21.2282μs | 47.1072 KOps/s | 45.3539 KOps/s | |
test_set | 0.1033ms | 20.8298μs | 48.0082 KOps/s | 46.7752 KOps/s | |
test_set_shared | 0.3366ms | 0.1820ms | 5.4945 KOps/s | 5.4134 KOps/s | |
test_update | 0.1449ms | 26.7177μs | 37.4284 KOps/s | 36.2844 KOps/s | |
test_update_nested | 0.1033ms | 42.4913μs | 23.5343 KOps/s | 23.2048 KOps/s | |
test_update__nested | 0.4115ms | 34.2956μs | 29.1583 KOps/s | 28.5224 KOps/s | |
test_set_nested | 0.1007ms | 24.4684μs | 40.8691 KOps/s | 42.8729 KOps/s | |
test_set_nested_new | 84.9890μs | 27.9326μs | 35.8005 KOps/s | 35.0435 KOps/s | |
test_select | 94.0050μs | 43.7451μs | 22.8597 KOps/s | 22.5247 KOps/s | |
test_select_nested | 0.1216ms | 62.4368μs | 16.0162 KOps/s | 15.7986 KOps/s | |
test_exclude_nested | 0.1733ms | 81.6915μs | 12.2412 KOps/s | 12.2859 KOps/s | |
test_empty[True] | 0.4731ms | 0.4095ms | 2.4421 KOps/s | 2.4243 KOps/s | |
test_empty[False] | 7.1583μs | 1.3940μs | 717.3662 KOps/s | 745.5153 KOps/s | |
test_unbind_speed | 0.3934ms | 0.2697ms | 3.7083 KOps/s | 3.6217 KOps/s | |
test_unbind_speed_stack0 | 0.6065ms | 0.2664ms | 3.7538 KOps/s | 3.6632 KOps/s | |
test_unbind_speed_stack1 | 97.0589ms | 0.7210ms | 1.3870 KOps/s | 1.2291 KOps/s | |
test_split | 93.6195ms | 1.7502ms | 571.3726 Ops/s | 562.0086 Ops/s | |
test_chunk | 0.1064s | 1.7673ms | 565.8459 Ops/s | 621.5714 Ops/s | |
test_consolidate_njt[False-None] | 8.7011ms | 8.2903ms | 120.6222 Ops/s | 110.9778 Ops/s | |
test_creation[device0] | 3.9442ms | 93.7792μs | 10.6634 KOps/s | 10.5895 KOps/s | |
test_creation_from_tensor | 0.2213ms | 93.6355μs | 10.6797 KOps/s | 10.4082 KOps/s | |
test_add_one[memmap_tensor0] | 76.3130μs | 4.7787μs | 209.2637 KOps/s | 180.9273 KOps/s | |
test_contiguous[memmap_tensor0] | 15.4290μs | 0.5174μs | 1.9329 MOps/s | 1.7870 MOps/s | |
test_stack[memmap_tensor0] | 25.8790μs | 3.3650μs | 297.1770 KOps/s | 275.8783 KOps/s | |
test_memmaptd_index | 0.2985ms | 0.2260ms | 4.4246 KOps/s | 4.2762 KOps/s | |
test_memmaptd_index_astensor | 1.0591ms | 0.3143ms | 3.1817 KOps/s | 3.1230 KOps/s | |
test_memmaptd_index_op | 0.7934ms | 0.5899ms | 1.6951 KOps/s | 1.6138 KOps/s | |
test_serialize_model | 0.2068s | 0.1291s | 7.7489 Ops/s | 8.5096 Ops/s | |
test_serialize_model_pickle | 0.5048s | 0.4081s | 2.4504 Ops/s | 2.5525 Ops/s | |
test_serialize_weights | 0.1191s | 0.1133s | 8.8291 Ops/s | 8.8466 Ops/s | |
test_serialize_weights_returnearly | 0.1688s | 0.1594s | 6.2727 Ops/s | 6.3932 Ops/s | |
test_serialize_weights_pickle | 0.5715s | 0.4963s | 2.0150 Ops/s | 2.4433 Ops/s | |
test_serialize_weights_filesystem | 0.2458s | 0.1549s | 6.4567 Ops/s | 7.0547 Ops/s | |
test_serialize_model_filesystem | 0.1539s | 0.1450s | 6.8951 Ops/s | 6.7217 Ops/s | |
test_reshape_pytree | 67.3250μs | 26.3084μs | 38.0106 KOps/s | 32.5787 KOps/s | |
test_reshape_td | 66.4040μs | 32.9388μs | 30.3593 KOps/s | 29.0219 KOps/s | |
test_view_pytree | 82.1630μs | 26.4064μs | 37.8696 KOps/s | 37.8985 KOps/s | |
test_view_td | 99.5060μs | 40.7157μs | 24.5605 KOps/s | 24.4248 KOps/s | |
test_unbind_pytree | 77.7950μs | 29.9039μs | 33.4404 KOps/s | 34.2540 KOps/s | |
test_unbind_td | 0.3349ms | 40.2389μs | 24.8516 KOps/s | 24.8179 KOps/s | |
test_split_pytree | 63.9900μs | 29.3108μs | 34.1171 KOps/s | 34.9267 KOps/s | |
test_split_td | 0.1962ms | 46.5531μs | 21.4809 KOps/s | 21.6642 KOps/s | |
test_add_pytree | 77.9750μs | 35.3741μs | 28.2693 KOps/s | 26.6212 KOps/s | |
test_add_td | 0.1611ms | 57.2387μs | 17.4707 KOps/s | 16.4747 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1465ms | 67.8773μs | 14.7325 KOps/s | 14.2155 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.4163ms | 0.1713ms | 5.8378 KOps/s | 5.7265 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1029ms | 46.4360μs | 21.5350 KOps/s | 20.6754 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2588ms | 0.1183ms | 8.4531 KOps/s | 8.1517 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 67.2060μs | 28.7234μs | 34.8149 KOps/s | 35.6957 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1343ms | 58.3376μs | 17.1416 KOps/s | 16.6393 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1642ms | 79.7209μs | 12.5438 KOps/s | 12.5549 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1399ms | 67.3793μs | 14.8413 KOps/s | 15.0756 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2000ms | 0.1081ms | 9.2474 KOps/s | 9.0558 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4624ms | 0.2183ms | 4.5818 KOps/s | 4.6286 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1035ms | 48.2357μs | 20.7315 KOps/s | 20.2880 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1224ms | 67.1462μs | 14.8929 KOps/s | 14.4788 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2163ms | 0.1019ms | 9.8172 KOps/s | 9.5475 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3851ms | 0.2015ms | 4.9623 KOps/s | 4.7970 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4646ms | 0.2329ms | 4.2940 KOps/s | 4.1815 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2303ms | 0.1148ms | 8.7091 KOps/s | 8.9820 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1471ms | 64.0086μs | 15.6229 KOps/s | 15.1084 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1109ms | 49.7522μs | 20.0996 KOps/s | 19.5674 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3602ms | 0.1578ms | 6.3379 KOps/s | 6.1936 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2145ms | 0.1039ms | 9.6260 KOps/s | 9.5530 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 64.3900μs | 21.7545μs | 45.9676 KOps/s | 45.0968 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1477ms | 67.4215μs | 14.8321 KOps/s | 14.7614 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1591ms | 84.6391μs | 11.8149 KOps/s | 12.3174 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1353ms | 68.2653μs | 14.6487 KOps/s | 15.0353 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2960ms | 0.2151ms | 4.6496 KOps/s | 4.5702 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6497ms | 1.3798ms | 724.7549 Ops/s | 710.8960 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3000ms | 0.2122ms | 4.7125 KOps/s | 4.6589 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0480ms | 0.8181ms | 1.2223 KOps/s | 1.1641 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8134ms | 0.4552ms | 2.1968 KOps/s | 2.1717 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.7102ms | 2.7643ms | 361.7542 Ops/s | 359.3660 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1526ms | 41.3284μs | 24.1964 KOps/s | 24.7275 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5828ms | 34.0300μs | 29.3859 KOps/s | 28.2949 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 89.8870μs | 31.4376μs | 31.8090 KOps/s | 30.6849 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 55.8140μs | 23.6015μs | 42.3703 KOps/s | 42.2404 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1180ms | 32.6886μs | 30.5917 KOps/s | 30.0996 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 80.8110μs | 23.6001μs | 42.3728 KOps/s | 41.8338 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1095ms | 54.4487μs | 18.3659 KOps/s | 18.5874 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3758ms | 20.4365μs | 48.9322 KOps/s | 48.5152 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1262ms | 47.8358μs | 20.9048 KOps/s | 21.0717 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 54.6420μs | 18.6170μs | 53.7142 KOps/s | 52.5382 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1090ms | 48.3288μs | 20.6916 KOps/s | 20.7158 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 53.2300μs | 18.7249μs | 53.4048 KOps/s | 53.5705 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1359ms | 55.2799μs | 18.0898 KOps/s | 18.1405 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8709ms | 19.9942μs | 50.0145 KOps/s | 49.3710 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 99.9870μs | 48.2106μs | 20.7423 KOps/s | 20.7412 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 59.7310μs | 18.8827μs | 52.9586 KOps/s | 52.1518 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1299ms | 48.0431μs | 20.8147 KOps/s | 20.4647 KOps/s | |
test_compile_indexing[int-pytree-eager] | 73.0160μs | 18.8340μs | 53.0955 KOps/s | 53.4751 KOps/s | |
test_mod_add[eager] | 0.1118ms | 37.7161μs | 26.5139 KOps/s | 27.1829 KOps/s | |
test_mod_add[compile] | 0.1500ms | 69.7019μs | 14.3468 KOps/s | 14.4690 KOps/s | |
test_mod_add[compile-overhead] | 0.1484ms | 67.2378μs | 14.8726 KOps/s | 14.6809 KOps/s | |
test_mod_wrap[eager] | 0.3678ms | 0.2308ms | 4.3328 KOps/s | 4.2516 KOps/s | |
test_mod_wrap[compile] | 2.0931ms | 0.2320ms | 4.3111 KOps/s | 4.2481 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3615ms | 0.2268ms | 4.4087 KOps/s | 4.3405 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.3540ms | 13.4866ms | 74.1475 Ops/s | 92.2778 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.2829ms | 13.4535ms | 74.3303 Ops/s | 88.5496 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.9116ms | 13.4699ms | 74.2397 Ops/s | 87.3015 Ops/s | |
test_seq_add[eager] | 0.2263ms | 0.1257ms | 7.9560 KOps/s | 8.1632 KOps/s | |
test_seq_add[compile] | 0.2176ms | 81.9649μs | 12.2003 KOps/s | 12.5210 KOps/s | |
test_seq_add[compile-overhead] | 0.1503ms | 78.9578μs | 12.6650 KOps/s | 12.9575 KOps/s | |
test_seq_wrap[eager] | 0.7526ms | 0.4593ms | 2.1771 KOps/s | 2.1511 KOps/s | |
test_seq_wrap[compile] | 0.4609ms | 0.2602ms | 3.8436 KOps/s | 3.9842 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3850ms | 0.2493ms | 4.0108 KOps/s | 3.9735 KOps/s | |
test_func_call_runtime[False-eager] | 0.9011ms | 0.5538ms | 1.8059 KOps/s | 1.7577 KOps/s | |
test_func_call_runtime[False-compile] | 0.6564ms | 0.4510ms | 2.2171 KOps/s | 2.1964 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8464ms | 0.4468ms | 2.2380 KOps/s | 2.2121 KOps/s | |
test_func_call_runtime[True-eager] | 0.9961ms | 0.7537ms | 1.3268 KOps/s | 1.2961 KOps/s | |
test_func_call_runtime[True-compile] | 0.5565ms | 0.4615ms | 2.1667 KOps/s | 2.1153 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8385ms | 0.4653ms | 2.1491 KOps/s | 2.0908 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7563ms | 0.5374ms | 1.8609 KOps/s | 1.7782 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8125ms | 0.4412ms | 2.2668 KOps/s | 2.2075 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.8153ms | 0.4448ms | 2.2484 KOps/s | 2.2091 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0939ms | 0.9124ms | 1.0960 KOps/s | 1.0948 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.5814ms | 0.8091ms | 1.2360 KOps/s | 1.2252 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0274ms | 0.8175ms | 1.2232 KOps/s | 1.1988 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7035ms | 1.9164ms | 521.8015 Ops/s | 518.1890 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6633ms | 0.5429ms | 1.8419 KOps/s | 1.8158 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7538ms | 0.5471ms | 1.8280 KOps/s | 1.8286 KOps/s | |
test_distributed | 0.2820ms | 0.1267ms | 7.8935 KOps/s | 7.7582 KOps/s | |
test_tdmodule | 54.9520μs | 28.3256μs | 35.3037 KOps/s | 35.2134 KOps/s | |
test_tdmodule_dispatch | 81.6330μs | 51.8200μs | 19.2976 KOps/s | 18.9323 KOps/s | |
test_tdseq | 61.7450μs | 30.5692μs | 32.7127 KOps/s | 33.1453 KOps/s | |
test_tdseq_dispatch | 91.4810μs | 56.6603μs | 17.6490 KOps/s | 17.8856 KOps/s | |
test_instantiation_functorch | 1.9173ms | 1.5725ms | 635.9247 Ops/s | 630.8434 Ops/s | |
test_exec_functorch | 0.4119ms | 0.1833ms | 5.4559 KOps/s | 5.1523 KOps/s | |
test_exec_functional_call | 0.4564ms | 0.1760ms | 5.6810 KOps/s | 5.5829 KOps/s | |
test_exec_td_decorator | 0.5478ms | 0.2404ms | 4.1604 KOps/s | 4.1515 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0277ms | 0.6573ms | 1.5213 KOps/s | 1.5011 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1779ms | 0.6711ms | 1.4902 KOps/s | 1.5076 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9259ms | 0.5285ms | 1.8920 KOps/s | 1.8599 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0169ms | 0.5266ms | 1.8989 KOps/s | 1.8561 KOps/s | |
test_to_module_speed[True] | 1.9293ms | 1.3464ms | 742.7464 Ops/s | 743.2460 Ops/s | |
test_to_module_speed[False] | 1.8546ms | 1.3201ms | 757.5334 Ops/s | 761.7389 Ops/s | |
test_tc_init | 97.0210μs | 48.4018μs | 20.6604 KOps/s | 21.5577 KOps/s | |
test_tc_init_nested | 0.2053ms | 94.7989μs | 10.5486 KOps/s | 10.9372 KOps/s | |
test_tc_first_layer_tensor | 38.7820μs | 1.5296μs | 653.7659 KOps/s | 652.8444 KOps/s | |
test_tc_first_layer_nontensor | 22.1110μs | 4.8322μs | 206.9457 KOps/s | 217.2047 KOps/s | |
test_tc_second_layer_tensor | 23.8650μs | 2.8924μs | 345.7374 KOps/s | 345.9703 KOps/s | |
test_tc_second_layer_nontensor | 25.0270μs | 6.1432μs | 162.7807 KOps/s | 165.2734 KOps/s | |
test_unbind | 0.2339s | 14.2058ms | 70.3939 Ops/s | 64.1689 Ops/s | |
test_full_like | 9.4121ms | 8.5529ms | 116.9191 Ops/s | 144.5011 Ops/s | |
test_zeros_like | 5.4909ms | 2.8696ms | 348.4862 Ops/s | 352.1488 Ops/s | |
test_ones_like | 6.1902ms | 3.2324ms | 309.3684 Ops/s | 314.2033 Ops/s | |
test_clone | 8.6230ms | 6.4564ms | 154.8856 Ops/s | 200.8634 Ops/s | |
test_squeeze | 61.3250μs | 13.3919μs | 74.6718 KOps/s | 75.7175 KOps/s | |
test_unsqueeze | 0.3158ms | 98.9033μs | 10.1109 KOps/s | 10.4323 KOps/s | |
test_split | 0.4916ms | 0.2042ms | 4.8966 KOps/s | 4.9851 KOps/s | |
test_permute | 0.3234ms | 0.2092ms | 4.7795 KOps/s | 4.8791 KOps/s | |
test_stack | 29.3939ms | 24.8833ms | 40.1877 Ops/s | 38.9019 Ops/s | |
test_cat | 28.1939ms | 24.7981ms | 40.3257 Ops/s | 39.4280 Ops/s |
…is passed" relax this check as [LLM](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/llm.py#L53) in vLLM does not subclass nn.Module [ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request on Feb 28, 2025
ghstack-source-id: 5ab78a97494ffd2fedd8537184ef3b2ff20fd2ae Pull Request resolved: #1242
vmoens approved these changes on Feb 28, 2025
LGTM thanks!
Merged commit 0c22b9a into gh/mikaylagawarecki/1/base. 48 of 49 checks passed.
Labels
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Relax this check: `LLM` in vLLM does not subclass `nn.Module`, and wrapping such an object should be allowed when `method` is a callable.
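As a rough illustration of the idea, here is a minimal, dependency-free sketch of a wrapper that only insists on a callable object when no method is given. `MethodWrapper` and `FakeLLM` are hypothetical names invented for this example; this is not the actual `TensorDictModule` code, and the real check in the library may be written differently.

```python
class FakeLLM:
    """Hypothetical stand-in for an object such as vLLM's LLM:
    it is neither an nn.Module subclass nor callable itself."""

    def generate(self, prompt):
        return f"echo: {prompt}"


class MethodWrapper:
    """Hypothetical wrapper (not TensorDictModule) showing a relaxed check."""

    def __init__(self, module, method=None):
        if method is None:
            # No method name given: the wrapped object itself must be callable.
            if not callable(module):
                raise TypeError("module must be callable when no method is given")
            self._target = module
        else:
            # A method name is given: only that attribute needs to be callable,
            # so the wrapped object does not have to be a Module subclass.
            target = getattr(module, method, None)
            if not callable(target):
                raise AttributeError(
                    f"{type(module).__name__} has no callable {method!r}"
                )
            self._target = target

    def __call__(self, *args, **kwargs):
        return self._target(*args, **kwargs)


wrapped = MethodWrapper(FakeLLM(), method="generate")
result = wrapped("hello")
```

In this sketch, `MethodWrapper(FakeLLM())` without a method raises `TypeError`, while passing `method="generate"` is accepted because only the named attribute is validated.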
Stack from ghstack (oldest at bottom):