[Bugfix] Allow non Module in TensorDictModule when method is passed #1242

Merged


@mikaylagawarecki mikaylagawarecki commented Feb 28, 2025

Relax this check: `LLM` in vLLM does not subclass `nn.Module`, and a non-`nn.Module` object should be allowed when `method` is a callable.
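To make the intent concrete, here is a minimal sketch of what a relaxed check could look like. It is not the actual tensordict code: `Module` is a stand-in for `torch.nn.Module` so the snippet runs without PyTorch, and `FakeLLM`, `Doubler`, and `resolve_target` are hypothetical names introduced only for illustration. It also assumes `method` may be given either as a method name or as a callable.

```python
class Module:
    """Stand-in for torch.nn.Module so the sketch runs without PyTorch."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class Doubler(Module):
    """A regular module-like object with a forward method."""

    def forward(self, x):
        return 2 * x


class FakeLLM:
    """Hypothetical engine object that does NOT subclass Module."""

    def generate(self, prompt):
        return f"generated: {prompt}"


def resolve_target(module, method=None):
    """Return the callable a wrapper would invoke (hypothetical helper)."""
    if method is None:
        # Strict path: with no method given, only Module instances are accepted.
        if not isinstance(module, Module):
            raise TypeError(
                f"{type(module).__name__} is not a Module; pass `method` to wrap it"
            )
        return module
    if isinstance(method, str):
        # Relaxed path: look the method up by name on any object.
        return getattr(module, method)
    if callable(method):
        # Relaxed path: a callable is used as given.
        return method
    raise TypeError("method must be a str or a callable")


llm = FakeLLM()
by_name = resolve_target(llm, method="generate")
by_callable = resolve_target(llm, method=llm.generate)
```

With this shape, a plain `Module` still works without `method`, while a non-`Module` object is rejected only when no `method` is supplied.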

Stack from ghstack (oldest at bottom):

mikaylagawarecki added a commit that referenced this pull request Feb 28, 2025
ghstack-source-id: c73d86f924dd54060e9dfcf5b344a296deb1f6bf
Pull Request resolved: #1242
@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 28, 2025
mikaylagawarecki added a commit that referenced this pull request Feb 28, 2025
ghstack-source-id: 797b6ef132ef2481c7addb968b3eef8f9b3e49bd
Pull Request resolved: #1242
mikaylagawarecki added a commit that referenced this pull request Feb 28, 2025
ghstack-source-id: 9788c6b55172fc24bde2315c32d54ef694d44b5c
Pull Request resolved: #1242

github-actions bot commented Feb 28, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}11$.

Expand to view detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 39.4840μs | 20.6973μs | 48.3155 KOps/s | 48.3898 KOps/s | $\color{#d91a1a}-0.15\%$ |
| test_plain_set_stack_nested | 84.2470μs | 20.9794μs | 47.6658 KOps/s | 47.8180 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_plain_set_nested_inplace | 47.7490μs | 22.6997μs | 44.0535 KOps/s | 43.4201 KOps/s | $\color{#35bf28}+1.46\%$ |
| test_plain_set_stack_nested_inplace | 0.1307ms | 22.7573μs | 43.9420 KOps/s | 43.4835 KOps/s | $\color{#35bf28}+1.05\%$ |
| test_items | 0.1576ms | 4.3877μs | 227.9123 KOps/s | 242.7472 KOps/s | $\textbf{\color{#d91a1a}-6.11\%}$ |
| test_items_nested | 0.8176ms | 0.4072ms | 2.4559 KOps/s | 2.4012 KOps/s | $\color{#35bf28}+2.28\%$ |
| test_items_nested_locked | 0.5772ms | 0.4091ms | 2.4442 KOps/s | 2.3891 KOps/s | $\color{#35bf28}+2.31\%$ |
| test_items_nested_leaf | 0.1420ms | 77.1848μs | 12.9559 KOps/s | 12.7215 KOps/s | $\color{#35bf28}+1.84\%$ |
| test_items_stack_nested | 0.5766ms | 0.4084ms | 2.4485 KOps/s | 2.3434 KOps/s | $\color{#35bf28}+4.49\%$ |
| test_items_stack_nested_leaf | 0.1530ms | 77.4556μs | 12.9106 KOps/s | 12.7196 KOps/s | $\color{#35bf28}+1.50\%$ |
| test_items_stack_nested_locked | 0.9809ms | 0.4112ms | 2.4319 KOps/s | 2.4175 KOps/s | $\color{#35bf28}+0.59\%$ |
| test_keys | 38.8220μs | 3.5892μs | 278.6145 KOps/s | 288.6873 KOps/s | $\color{#d91a1a}-3.49\%$ |
| test_keys_nested | 0.3254ms | 0.1642ms | 6.0913 KOps/s | 6.0212 KOps/s | $\color{#35bf28}+1.16\%$ |
| test_keys_nested_locked | 1.7546ms | 0.1713ms | 5.8393 KOps/s | 5.8068 KOps/s | $\color{#35bf28}+0.56\%$ |
| test_keys_nested_leaf | 0.2445ms | 0.1435ms | 6.9682 KOps/s | 6.9240 KOps/s | $\color{#35bf28}+0.64\%$ |
| test_keys_stack_nested | 0.2491ms | 0.1648ms | 6.0674 KOps/s | 6.0954 KOps/s | $\color{#d91a1a}-0.46\%$ |
| test_keys_stack_nested_leaf | 0.2318ms | 0.1436ms | 6.9617 KOps/s | 6.9199 KOps/s | $\color{#35bf28}+0.60\%$ |
| test_keys_stack_nested_locked | 0.2603ms | 0.1712ms | 5.8428 KOps/s | 5.8157 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_values | 10.9284μs | 1.0413μs | 960.2971 KOps/s | 952.8237 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_values_nested | 0.1127ms | 63.0468μs | 15.8612 KOps/s | 16.0291 KOps/s | $\color{#d91a1a}-1.05\%$ |
| test_values_nested_locked | 0.1425ms | 63.1960μs | 15.8238 KOps/s | 15.5449 KOps/s | $\color{#35bf28}+1.79\%$ |
| test_values_nested_leaf | 0.1508ms | 72.9249μs | 13.7127 KOps/s | 13.9434 KOps/s | $\color{#d91a1a}-1.65\%$ |
| test_values_stack_nested | 0.1233ms | 63.4180μs | 15.7684 KOps/s | 15.9661 KOps/s | $\color{#d91a1a}-1.24\%$ |
| test_values_stack_nested_leaf | 0.1304ms | 73.1507μs | 13.6704 KOps/s | 13.9139 KOps/s | $\color{#d91a1a}-1.75\%$ |
| test_values_stack_nested_locked | 0.1134ms | 62.7849μs | 15.9274 KOps/s | 15.9026 KOps/s | $\color{#35bf28}+0.16\%$ |
| test_membership | 40.9560μs | 0.8914μs | 1.1218 MOps/s | 1.4324 MOps/s | $\textbf{\color{#d91a1a}-21.68\%}$ |
| test_membership_nested | 44.7740μs | 2.9346μs | 340.7579 KOps/s | 341.1142 KOps/s | $\color{#d91a1a}-0.10\%$ |
| test_membership_nested_leaf | 58.3190μs | 2.9577μs | 338.0983 KOps/s | 351.1714 KOps/s | $\color{#d91a1a}-3.72\%$ |
| test_membership_stacked_nested | 27.0000μs | 2.9253μs | 341.8453 KOps/s | 348.3559 KOps/s | $\color{#d91a1a}-1.87\%$ |
| test_membership_stacked_nested_leaf | 46.4070μs | 2.9619μs | 337.6219 KOps/s | 347.7462 KOps/s | $\color{#d91a1a}-2.91\%$ |
| test_membership_nested_last | 35.1960μs | 4.4635μs | 224.0414 KOps/s | 230.9398 KOps/s | $\color{#d91a1a}-2.99\%$ |
| test_membership_nested_leaf_last | 45.5550μs | 4.5109μs | 221.6860 KOps/s | 230.6347 KOps/s | $\color{#d91a1a}-3.88\%$ |
| test_membership_stacked_nested_last | 17.4320μs | 4.4704μs | 223.6940 KOps/s | 228.0520 KOps/s | $\color{#d91a1a}-1.91\%$ |
| test_membership_stacked_nested_leaf_last | 31.8490μs | 4.4412μs | 225.1630 KOps/s | 231.0488 KOps/s | $\color{#d91a1a}-2.55\%$ |
| test_nested_getleaf | 32.5710μs | 10.8662μs | 92.0283 KOps/s | 92.1369 KOps/s | $\color{#d91a1a}-0.12\%$ |
| test_nested_get | 49.5920μs | 10.3061μs | 97.0296 KOps/s | 97.3916 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_stacked_getleaf | 50.6250μs | 10.8141μs | 92.4720 KOps/s | 93.5993 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_stacked_get | 50.0630μs | 10.0608μs | 99.3953 KOps/s | 96.9249 KOps/s | $\color{#35bf28}+2.55\%$ |
| test_nested_getitemleaf | 52.8590μs | 11.4836μs | 87.0811 KOps/s | 88.2054 KOps/s | $\color{#d91a1a}-1.27\%$ |
| test_nested_getitem | 39.9140μs | 10.9520μs | 91.3078 KOps/s | 92.1901 KOps/s | $\color{#d91a1a}-0.96\%$ |
| test_stacked_getitemleaf | 47.5580μs | 11.5341μs | 86.6996 KOps/s | 87.8900 KOps/s | $\color{#d91a1a}-1.35\%$ |
| test_stacked_getitem | 51.8770μs | 10.9609μs | 91.2334 KOps/s | 91.4731 KOps/s | $\color{#d91a1a}-0.26\%$ |
| test_lock_nested | 0.7369ms | 0.4159ms | 2.4047 KOps/s | 2.4040 KOps/s | $\color{#35bf28}+0.03\%$ |
| test_lock_stack_nested | 0.8525ms | 0.4249ms | 2.3537 KOps/s | 2.2971 KOps/s | $\color{#35bf28}+2.47\%$ |
| test_unlock_nested | 0.4793ms | 0.3406ms | 2.9359 KOps/s | 2.9128 KOps/s | $\color{#35bf28}+0.79\%$ |
| test_unlock_stack_nested | 0.7019ms | 0.3454ms | 2.8948 KOps/s | 2.8575 KOps/s | $\color{#35bf28}+1.31\%$ |
| test_flatten_speed | 0.1912ms | 0.1014ms | 9.8608 KOps/s | 9.8251 KOps/s | $\color{#35bf28}+0.36\%$ |
| test_unflatten_speed | 0.6263ms | 0.5302ms | 1.8860 KOps/s | 1.9183 KOps/s | $\color{#d91a1a}-1.68\%$ |
| test_common_ops | 0.9965ms | 0.8233ms | 1.2147 KOps/s | 1.1931 KOps/s | $\color{#35bf28}+1.81\%$ |
| test_creation | 23.4840μs | 2.4995μs | 400.0761 KOps/s | 407.0284 KOps/s | $\color{#d91a1a}-1.71\%$ |
| test_creation_empty | 30.5770μs | 12.5170μs | 79.8912 KOps/s | 82.1345 KOps/s | $\color{#d91a1a}-2.73\%$ |
| test_creation_nested_1 | 48.8910μs | 15.6902μs | 63.7339 KOps/s | 65.3342 KOps/s | $\color{#d91a1a}-2.45\%$ |
| test_creation_nested_2 | 66.5740μs | 20.3030μs | 49.2538 KOps/s | 49.7384 KOps/s | $\color{#d91a1a}-0.97\%$ |
| test_clone | 85.4190μs | 13.8601μs | 72.1497 KOps/s | 69.1864 KOps/s | $\color{#35bf28}+4.28\%$ |
| test_getitem[int] | 0.8512ms | 12.9701μs | 77.1002 KOps/s | 73.1661 KOps/s | $\textbf{\color{#35bf28}+5.38\%}$ |
| test_getitem[slice_int] | 0.1304ms | 24.8971μs | 40.1653 KOps/s | 40.6564 KOps/s | $\color{#d91a1a}-1.21\%$ |
| test_getitem[range] | 0.1668ms | 51.2457μs | 19.5138 KOps/s | 19.3747 KOps/s | $\color{#35bf28}+0.72\%$ |
| test_getitem[tuple] | 0.1238ms | 20.3794μs | 49.0691 KOps/s | 48.2062 KOps/s | $\color{#35bf28}+1.79\%$ |
| test_getitem[list] | 0.1587ms | 46.9183μs | 21.3137 KOps/s | 21.2680 KOps/s | $\color{#35bf28}+0.21\%$ |
| test_setitem_dim[int] | 80.2490μs | 25.6508μs | 38.9851 KOps/s | 38.5098 KOps/s | $\color{#35bf28}+1.23\%$ |
| test_setitem_dim[slice_int] | 0.1130ms | 51.9713μs | 19.2414 KOps/s | 19.4859 KOps/s | $\color{#d91a1a}-1.26\%$ |
| test_setitem_dim[range] | 0.1395ms | 76.2449μs | 13.1156 KOps/s | 13.0028 KOps/s | $\color{#35bf28}+0.87\%$ |
| test_setitem_dim[tuple] | 81.1910μs | 40.7243μs | 24.5553 KOps/s | 24.0970 KOps/s | $\color{#35bf28}+1.90\%$ |
| test_setitem | 62.8470μs | 21.2282μs | 47.1072 KOps/s | 45.3539 KOps/s | $\color{#35bf28}+3.87\%$ |
| test_set | 0.1033ms | 20.8298μs | 48.0082 KOps/s | 46.7752 KOps/s | $\color{#35bf28}+2.64\%$ |
| test_set_shared | 0.3366ms | 0.1820ms | 5.4945 KOps/s | 5.4134 KOps/s | $\color{#35bf28}+1.50\%$ |
| test_update | 0.1449ms | 26.7177μs | 37.4284 KOps/s | 36.2844 KOps/s | $\color{#35bf28}+3.15\%$ |
| test_update_nested | 0.1033ms | 42.4913μs | 23.5343 KOps/s | 23.2048 KOps/s | $\color{#35bf28}+1.42\%$ |
| test_update__nested | 0.4115ms | 34.2956μs | 29.1583 KOps/s | 28.5224 KOps/s | $\color{#35bf28}+2.23\%$ |
| test_set_nested | 0.1007ms | 24.4684μs | 40.8691 KOps/s | 42.8729 KOps/s | $\color{#d91a1a}-4.67\%$ |
| test_set_nested_new | 84.9890μs | 27.9326μs | 35.8005 KOps/s | 35.0435 KOps/s | $\color{#35bf28}+2.16\%$ |
| test_select | 94.0050μs | 43.7451μs | 22.8597 KOps/s | 22.5247 KOps/s | $\color{#35bf28}+1.49\%$ |
| test_select_nested | 0.1216ms | 62.4368μs | 16.0162 KOps/s | 15.7986 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_exclude_nested | 0.1733ms | 81.6915μs | 12.2412 KOps/s | 12.2859 KOps/s | $\color{#d91a1a}-0.36\%$ |
| test_empty[True] | 0.4731ms | 0.4095ms | 2.4421 KOps/s | 2.4243 KOps/s | $\color{#35bf28}+0.74\%$ |
| test_empty[False] | 7.1583μs | 1.3940μs | 717.3662 KOps/s | 745.5153 KOps/s | $\color{#d91a1a}-3.78\%$ |
| test_unbind_speed | 0.3934ms | 0.2697ms | 3.7083 KOps/s | 3.6217 KOps/s | $\color{#35bf28}+2.39\%$ |
| test_unbind_speed_stack0 | 0.6065ms | 0.2664ms | 3.7538 KOps/s | 3.6632 KOps/s | $\color{#35bf28}+2.47\%$ |
| test_unbind_speed_stack1 | 97.0589ms | 0.7210ms | 1.3870 KOps/s | 1.2291 KOps/s | $\textbf{\color{#35bf28}+12.85\%}$ |
| test_split | 93.6195ms | 1.7502ms | 571.3726 Ops/s | 562.0086 Ops/s | $\color{#35bf28}+1.67\%$ |
| test_chunk | 0.1064s | 1.7673ms | 565.8459 Ops/s | 621.5714 Ops/s | $\textbf{\color{#d91a1a}-8.97\%}$ |
| test_consolidate_njt[False-None] | 8.7011ms | 8.2903ms | 120.6222 Ops/s | 110.9778 Ops/s | $\textbf{\color{#35bf28}+8.69\%}$ |
| test_creation[device0] | 3.9442ms | 93.7792μs | 10.6634 KOps/s | 10.5895 KOps/s | $\color{#35bf28}+0.70\%$ |
| test_creation_from_tensor | 0.2213ms | 93.6355μs | 10.6797 KOps/s | 10.4082 KOps/s | $\color{#35bf28}+2.61\%$ |
| test_add_one[memmap_tensor0] | 76.3130μs | 4.7787μs | 209.2637 KOps/s | 180.9273 KOps/s | $\textbf{\color{#35bf28}+15.66\%}$ |
| test_contiguous[memmap_tensor0] | 15.4290μs | 0.5174μs | 1.9329 MOps/s | 1.7870 MOps/s | $\textbf{\color{#35bf28}+8.16\%}$ |
| test_stack[memmap_tensor0] | 25.8790μs | 3.3650μs | 297.1770 KOps/s | 275.8783 KOps/s | $\textbf{\color{#35bf28}+7.72\%}$ |
| test_memmaptd_index | 0.2985ms | 0.2260ms | 4.4246 KOps/s | 4.2762 KOps/s | $\color{#35bf28}+3.47\%$ |
| test_memmaptd_index_astensor | 1.0591ms | 0.3143ms | 3.1817 KOps/s | 3.1230 KOps/s | $\color{#35bf28}+1.88\%$ |
| test_memmaptd_index_op | 0.7934ms | 0.5899ms | 1.6951 KOps/s | 1.6138 KOps/s | $\textbf{\color{#35bf28}+5.04\%}$ |
| test_serialize_model | 0.2068s | 0.1291s | 7.7489 Ops/s | 8.5096 Ops/s | $\textbf{\color{#d91a1a}-8.94\%}$ |
| test_serialize_model_pickle | 0.5048s | 0.4081s | 2.4504 Ops/s | 2.5525 Ops/s | $\color{#d91a1a}-4.00\%$ |
| test_serialize_weights | 0.1191s | 0.1133s | 8.8291 Ops/s | 8.8466 Ops/s | $\color{#d91a1a}-0.20\%$ |
| test_serialize_weights_returnearly | 0.1688s | 0.1594s | 6.2727 Ops/s | 6.3932 Ops/s | $\color{#d91a1a}-1.88\%$ |
| test_serialize_weights_pickle | 0.5715s | 0.4963s | 2.0150 Ops/s | 2.4433 Ops/s | $\textbf{\color{#d91a1a}-17.53\%}$ |
| test_serialize_weights_filesystem | 0.2458s | 0.1549s | 6.4567 Ops/s | 7.0547 Ops/s | $\textbf{\color{#d91a1a}-8.48\%}$ |
| test_serialize_model_filesystem | 0.1539s | 0.1450s | 6.8951 Ops/s | 6.7217 Ops/s | $\color{#35bf28}+2.58\%$ |
| test_reshape_pytree | 67.3250μs | 26.3084μs | 38.0106 KOps/s | 32.5787 KOps/s | $\textbf{\color{#35bf28}+16.67\%}$ |
| test_reshape_td | 66.4040μs | 32.9388μs | 30.3593 KOps/s | 29.0219 KOps/s | $\color{#35bf28}+4.61\%$ |
| test_view_pytree | 82.1630μs | 26.4064μs | 37.8696 KOps/s | 37.8985 KOps/s | $\color{#d91a1a}-0.08\%$ |
| test_view_td | 99.5060μs | 40.7157μs | 24.5605 KOps/s | 24.4248 KOps/s | $\color{#35bf28}+0.56\%$ |
| test_unbind_pytree | 77.7950μs | 29.9039μs | 33.4404 KOps/s | 34.2540 KOps/s | $\color{#d91a1a}-2.38\%$ |
| test_unbind_td | 0.3349ms | 40.2389μs | 24.8516 KOps/s | 24.8179 KOps/s | $\color{#35bf28}+0.14\%$ |
| test_split_pytree | 63.9900μs | 29.3108μs | 34.1171 KOps/s | 34.9267 KOps/s | $\color{#d91a1a}-2.32\%$ |
| test_split_td | 0.1962ms | 46.5531μs | 21.4809 KOps/s | 21.6642 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_add_pytree | 77.9750μs | 35.3741μs | 28.2693 KOps/s | 26.6212 KOps/s | $\textbf{\color{#35bf28}+6.19\%}$ |
| test_add_td | 0.1611ms | 57.2387μs | 17.4707 KOps/s | 16.4747 KOps/s | $\textbf{\color{#35bf28}+6.05\%}$ |
| test_compile_add_one_nested[tensordict-compile] | 0.1465ms | 67.8773μs | 14.7325 KOps/s | 14.2155 KOps/s | $\color{#35bf28}+3.64\%$ |
| test_compile_add_one_nested[tensordict-eager] | 1.4163ms | 0.1713ms | 5.8378 KOps/s | 5.7265 KOps/s | $\color{#35bf28}+1.94\%$ |
| test_compile_add_one_nested[pytree-compile] | 0.1029ms | 46.4360μs | 21.5350 KOps/s | 20.6754 KOps/s | $\color{#35bf28}+4.16\%$ |
| test_compile_add_one_nested[pytree-eager] | 0.2588ms | 0.1183ms | 8.4531 KOps/s | 8.1517 KOps/s | $\color{#35bf28}+3.70\%$ |
| test_compile_copy_nested[tensordict-compile] | 67.2060μs | 28.7234μs | 34.8149 KOps/s | 35.6957 KOps/s | $\color{#d91a1a}-2.47\%$ |
| test_compile_copy_nested[tensordict-eager] | 0.1343ms | 58.3376μs | 17.1416 KOps/s | 16.6393 KOps/s | $\color{#35bf28}+3.02\%$ |
| test_compile_copy_nested[pytree-compile] | 0.1642ms | 79.7209μs | 12.5438 KOps/s | 12.5549 KOps/s | $\color{#d91a1a}-0.09\%$ |
| test_compile_copy_nested[pytree-eager] | 0.1399ms | 67.3793μs | 14.8413 KOps/s | 15.0756 KOps/s | $\color{#d91a1a}-1.55\%$ |
| test_compile_add_one_flat[tensordict-compile] | 0.2000ms | 0.1081ms | 9.2474 KOps/s | 9.0558 KOps/s | $\color{#35bf28}+2.12\%$ |
| test_compile_add_one_flat[tensordict-eager] | 0.4624ms | 0.2183ms | 4.5818 KOps/s | 4.6286 KOps/s | $\color{#d91a1a}-1.01\%$ |
| test_compile_add_one_flat[tensorclass-compile] | 0.1035ms | 48.2357μs | 20.7315 KOps/s | 20.2880 KOps/s | $\color{#35bf28}+2.19\%$ |
| test_compile_add_one_flat[tensorclass-eager] | 0.1224ms | 67.1462μs | 14.8929 KOps/s | 14.4788 KOps/s | $\color{#35bf28}+2.86\%$ |
| test_compile_add_one_flat[pytree-compile] | 0.2163ms | 0.1019ms | 9.8172 KOps/s | 9.5475 KOps/s | $\color{#35bf28}+2.82\%$ |
| test_compile_add_one_flat[pytree-eager] | 0.3851ms | 0.2015ms | 4.9623 KOps/s | 4.7970 KOps/s | $\color{#35bf28}+3.45\%$ |
| test_compile_add_self_flat[tensordict-eager] | 0.4646ms | 0.2329ms | 4.2940 KOps/s | 4.1815 KOps/s | $\color{#35bf28}+2.69\%$ |
| test_compile_add_self_flat[tensordict-compile] | 0.2303ms | 0.1148ms | 8.7091 KOps/s | 8.9820 KOps/s | $\color{#d91a1a}-3.04\%$ |
| test_compile_add_self_flat[tensorclass-eager] | 0.1471ms | 64.0086μs | 15.6229 KOps/s | 15.1084 KOps/s | $\color{#35bf28}+3.41\%$ |
| test_compile_add_self_flat[tensorclass-compile] | 0.1109ms | 49.7522μs | 20.0996 KOps/s | 19.5674 KOps/s | $\color{#35bf28}+2.72\%$ |
| test_compile_add_self_flat[pytree-eager] | 0.3602ms | 0.1578ms | 6.3379 KOps/s | 6.1936 KOps/s | $\color{#35bf28}+2.33\%$ |
| test_compile_add_self_flat[pytree-compile] | 0.2145ms | 0.1039ms | 9.6260 KOps/s | 9.5530 KOps/s | $\color{#35bf28}+0.76\%$ |
| test_compile_copy_flat[tensordict-compile] | 64.3900μs | 21.7545μs | 45.9676 KOps/s | 45.0968 KOps/s | $\color{#35bf28}+1.93\%$ |
| test_compile_copy_flat[tensordict-eager] | 0.1477ms | 67.4215μs | 14.8321 KOps/s | 14.7614 KOps/s | $\color{#35bf28}+0.48\%$ |
| test_compile_copy_flat[pytree-compile] | 0.1591ms | 84.6391μs | 11.8149 KOps/s | 12.3174 KOps/s | $\color{#d91a1a}-4.08\%$ |
| test_compile_copy_flat[pytree-eager] | 0.1353ms | 68.2653μs | 14.6487 KOps/s | 15.0353 KOps/s | $\color{#d91a1a}-2.57\%$ |
| test_compile_assign_and_add[tensordict-compile] | 0.2960ms | 0.2151ms | 4.6496 KOps/s | 4.5702 KOps/s | $\color{#35bf28}+1.74\%$ |
| test_compile_assign_and_add[tensordict-eager] | 2.6497ms | 1.3798ms | 724.7549 Ops/s | 710.8960 Ops/s | $\color{#35bf28}+1.95\%$ |
| test_compile_assign_and_add[pytree-compile] | 0.3000ms | 0.2122ms | 4.7125 KOps/s | 4.6589 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_compile_assign_and_add[pytree-eager] | 1.0480ms | 0.8181ms | 1.2223 KOps/s | 1.1641 KOps/s | $\color{#35bf28}+5.00\%$ |
| test_compile_assign_and_add_stack[compile] | 0.8134ms | 0.4552ms | 2.1968 KOps/s | 2.1717 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_compile_assign_and_add_stack[eager] | 5.7102ms | 2.7643ms | 361.7542 Ops/s | 359.3660 Ops/s | $\color{#35bf28}+0.66\%$ |
| test_compile_indexing[tensor-tensordict-compile] | 0.1526ms | 41.3284μs | 24.1964 KOps/s | 24.7275 KOps/s | $\color{#d91a1a}-2.15\%$ |
| test_compile_indexing[tensor-tensordict-eager] | 0.5828ms | 34.0300μs | 29.3859 KOps/s | 28.2949 KOps/s | $\color{#35bf28}+3.86\%$ |
| test_compile_indexing[tensor-tensorclass-compile] | 89.8870μs | 31.4376μs | 31.8090 KOps/s | 30.6849 KOps/s | $\color{#35bf28}+3.66\%$ |
| test_compile_indexing[tensor-tensorclass-eager] | 55.8140μs | 23.6015μs | 42.3703 KOps/s | 42.2404 KOps/s | $\color{#35bf28}+0.31\%$ |
| test_compile_indexing[tensor-pytree-compile] | 0.1180ms | 32.6886μs | 30.5917 KOps/s | 30.0996 KOps/s | $\color{#35bf28}+1.63\%$ |
| test_compile_indexing[tensor-pytree-eager] | 80.8110μs | 23.6001μs | 42.3728 KOps/s | 41.8338 KOps/s | $\color{#35bf28}+1.29\%$ |
| test_compile_indexing[slice-tensordict-compile] | 0.1095ms | 54.4487μs | 18.3659 KOps/s | 18.5874 KOps/s | $\color{#d91a1a}-1.19\%$ |
| test_compile_indexing[slice-tensordict-eager] | 0.3758ms | 20.4365μs | 48.9322 KOps/s | 48.5152 KOps/s | $\color{#35bf28}+0.86\%$ |
| test_compile_indexing[slice-tensorclass-compile] | 0.1262ms | 47.8358μs | 20.9048 KOps/s | 21.0717 KOps/s | $\color{#d91a1a}-0.79\%$ |
| test_compile_indexing[slice-tensorclass-eager] | 54.6420μs | 18.6170μs | 53.7142 KOps/s | 52.5382 KOps/s | $\color{#35bf28}+2.24\%$ |
| test_compile_indexing[slice-pytree-compile] | 0.1090ms | 48.3288μs | 20.6916 KOps/s | 20.7158 KOps/s | $\color{#d91a1a}-0.12\%$ |
| test_compile_indexing[slice-pytree-eager] | 53.2300μs | 18.7249μs | 53.4048 KOps/s | 53.5705 KOps/s | $\color{#d91a1a}-0.31\%$ |
| test_compile_indexing[int-tensordict-compile] | 0.1359ms | 55.2799μs | 18.0898 KOps/s | 18.1405 KOps/s | $\color{#d91a1a}-0.28\%$ |
| test_compile_indexing[int-tensordict-eager] | 0.8709ms | 19.9942μs | 50.0145 KOps/s | 49.3710 KOps/s | $\color{#35bf28}+1.30\%$ |
| test_compile_indexing[int-tensorclass-compile] | 99.9870μs | 48.2106μs | 20.7423 KOps/s | 20.7412 KOps/s | $+0.01\%$ |
| test_compile_indexing[int-tensorclass-eager] | 59.7310μs | 18.8827μs | 52.9586 KOps/s | 52.1518 KOps/s | $\color{#35bf28}+1.55\%$ |
| test_compile_indexing[int-pytree-compile] | 0.1299ms | 48.0431μs | 20.8147 KOps/s | 20.4647 KOps/s | $\color{#35bf28}+1.71\%$ |
| test_compile_indexing[int-pytree-eager] | 73.0160μs | 18.8340μs | 53.0955 KOps/s | 53.4751 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_mod_add[eager] | 0.1118ms | 37.7161μs | 26.5139 KOps/s | 27.1829 KOps/s | $\color{#d91a1a}-2.46\%$ |
| test_mod_add[compile] | 0.1500ms | 69.7019μs | 14.3468 KOps/s | 14.4690 KOps/s | $\color{#d91a1a}-0.84\%$ |
| test_mod_add[compile-overhead] | 0.1484ms | 67.2378μs | 14.8726 KOps/s | 14.6809 KOps/s | $\color{#35bf28}+1.31\%$ |
| test_mod_wrap[eager] | 0.3678ms | 0.2308ms | 4.3328 KOps/s | 4.2516 KOps/s | $\color{#35bf28}+1.91\%$ |
| test_mod_wrap[compile] | 2.0931ms | 0.2320ms | 4.3111 KOps/s | 4.2481 KOps/s | $\color{#35bf28}+1.48\%$ |
| test_mod_wrap[compile-overhead] | 0.3615ms | 0.2268ms | 4.4087 KOps/s | 4.3405 KOps/s | $\color{#35bf28}+1.57\%$ |
| test_mod_wrap_and_backward[eager] | 16.3540ms | 13.4866ms | 74.1475 Ops/s | 92.2778 Ops/s | $\textbf{\color{#d91a1a}-19.65\%}$ |
| test_mod_wrap_and_backward[compile] | 16.2829ms | 13.4535ms | 74.3303 Ops/s | 88.5496 Ops/s | $\textbf{\color{#d91a1a}-16.06\%}$ |
| test_mod_wrap_and_backward[compile-overhead] | 17.9116ms | 13.4699ms | 74.2397 Ops/s | 87.3015 Ops/s | $\textbf{\color{#d91a1a}-14.96\%}$ |
| test_seq_add[eager] | 0.2263ms | 0.1257ms | 7.9560 KOps/s | 8.1632 KOps/s | $\color{#d91a1a}-2.54\%$ |
| test_seq_add[compile] | 0.2176ms | 81.9649μs | 12.2003 KOps/s | 12.5210 KOps/s | $\color{#d91a1a}-2.56\%$ |
| test_seq_add[compile-overhead] | 0.1503ms | 78.9578μs | 12.6650 KOps/s | 12.9575 KOps/s | $\color{#d91a1a}-2.26\%$ |
| test_seq_wrap[eager] | 0.7526ms | 0.4593ms | 2.1771 KOps/s | 2.1511 KOps/s | $\color{#35bf28}+1.21\%$ |
| test_seq_wrap[compile] | 0.4609ms | 0.2602ms | 3.8436 KOps/s | 3.9842 KOps/s | $\color{#d91a1a}-3.53\%$ |
| test_seq_wrap[compile-overhead] | 0.3850ms | 0.2493ms | 4.0108 KOps/s | 3.9735 KOps/s | $\color{#35bf28}+0.94\%$ |
| test_func_call_runtime[False-eager] | 0.9011ms | 0.5538ms | 1.8059 KOps/s | 1.7577 KOps/s | $\color{#35bf28}+2.74\%$ |
| test_func_call_runtime[False-compile] | 0.6564ms | 0.4510ms | 2.2171 KOps/s | 2.1964 KOps/s | $\color{#35bf28}+0.94\%$ |
| test_func_call_runtime[False-compile-overhead] | 0.8464ms | 0.4468ms | 2.2380 KOps/s | 2.2121 KOps/s | $\color{#35bf28}+1.17\%$ |
| test_func_call_runtime[True-eager] | 0.9961ms | 0.7537ms | 1.3268 KOps/s | 1.2961 KOps/s | $\color{#35bf28}+2.37\%$ |
| test_func_call_runtime[True-compile] | 0.5565ms | 0.4615ms | 2.1667 KOps/s | 2.1153 KOps/s | $\color{#35bf28}+2.43\%$ |
| test_func_call_runtime[True-compile-overhead] | 0.8385ms | 0.4653ms | 2.1491 KOps/s | 2.0908 KOps/s | $\color{#35bf28}+2.79\%$ |
| test_func_call_cm_runtime[False-eager] | 0.7563ms | 0.5374ms | 1.8609 KOps/s | 1.7782 KOps/s | $\color{#35bf28}+4.65\%$ |
| test_func_call_cm_runtime[False-compile] | 0.8125ms | 0.4412ms | 2.2668 KOps/s | 2.2075 KOps/s | $\color{#35bf28}+2.68\%$ |
| test_func_call_cm_runtime[False-compile-overhead] | 0.8153ms | 0.4448ms | 2.2484 KOps/s | 2.2091 KOps/s | $\color{#35bf28}+1.78\%$ |
| test_func_call_cm_runtime[True-eager] | 1.0939ms | 0.9124ms | 1.0960 KOps/s | 1.0948 KOps/s | $\color{#35bf28}+0.11\%$ |
| test_func_call_cm_runtime[True-compile] | 1.5814ms | 0.8091ms | 1.2360 KOps/s | 1.2252 KOps/s | $\color{#35bf28}+0.88\%$ |
| test_func_call_cm_runtime[True-compile-overhead] | 1.0274ms | 0.8175ms | 1.2232 KOps/s | 1.1988 KOps/s | $\color{#35bf28}+2.03\%$ |
| test_vmap_func_call_cm_runtime[eager] | 2.7035ms | 1.9164ms | 521.8015 Ops/s | 518.1890 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_vmap_func_call_cm_runtime[compile] | 0.6633ms | 0.5429ms | 1.8419 KOps/s | 1.8158 KOps/s | $\color{#35bf28}+1.44\%$ |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.7538ms | 0.5471ms | 1.8280 KOps/s | 1.8286 KOps/s | $\color{#d91a1a}-0.03\%$ |
| test_distributed | 0.2820ms | 0.1267ms | 7.8935 KOps/s | 7.7582 KOps/s | $\color{#35bf28}+1.74\%$ |
| test_tdmodule | 54.9520μs | 28.3256μs | 35.3037 KOps/s | 35.2134 KOps/s | $\color{#35bf28}+0.26\%$ |
| test_tdmodule_dispatch | 81.6330μs | 51.8200μs | 19.2976 KOps/s | 18.9323 KOps/s | $\color{#35bf28}+1.93\%$ |
| test_tdseq | 61.7450μs | 30.5692μs | 32.7127 KOps/s | 33.1453 KOps/s | $\color{#d91a1a}-1.31\%$ |
| test_tdseq_dispatch | 91.4810μs | 56.6603μs | 17.6490 KOps/s | 17.8856 KOps/s | $\color{#d91a1a}-1.32\%$ |
| test_instantiation_functorch | 1.9173ms | 1.5725ms | 635.9247 Ops/s | 630.8434 Ops/s | $\color{#35bf28}+0.81\%$ |
| test_exec_functorch | 0.4119ms | 0.1833ms | 5.4559 KOps/s | 5.1523 KOps/s | $\textbf{\color{#35bf28}+5.89\%}$ |
| test_exec_functional_call | 0.4564ms | 0.1760ms | 5.6810 KOps/s | 5.5829 KOps/s | $\color{#35bf28}+1.76\%$ |
| test_exec_td_decorator | 0.5478ms | 0.2404ms | 4.1604 KOps/s | 4.1515 KOps/s | $\color{#35bf28}+0.21\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 1.0277ms | 0.6573ms | 1.5213 KOps/s | 1.5011 KOps/s | $\color{#35bf28}+1.34\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 1.1779ms | 0.6711ms | 1.4902 KOps/s | 1.5076 KOps/s | $\color{#d91a1a}-1.16\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.9259ms | 0.5285ms | 1.8920 KOps/s | 1.8599 KOps/s | $\color{#35bf28}+1.72\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 1.0169ms | 0.5266ms | 1.8989 KOps/s | 1.8561 KOps/s | $\color{#35bf28}+2.31\%$ |
| test_to_module_speed[True] | 1.9293ms | 1.3464ms | 742.7464 Ops/s | 743.2460 Ops/s | $\color{#d91a1a}-0.07\%$ |
| test_to_module_speed[False] | 1.8546ms | 1.3201ms | 757.5334 Ops/s | 761.7389 Ops/s | $\color{#d91a1a}-0.55\%$ |
| test_tc_init | 97.0210μs | 48.4018μs | 20.6604 KOps/s | 21.5577 KOps/s | $\color{#d91a1a}-4.16\%$ |
| test_tc_init_nested | 0.2053ms | 94.7989μs | 10.5486 KOps/s | 10.9372 KOps/s | $\color{#d91a1a}-3.55\%$ |
| test_tc_first_layer_tensor | 38.7820μs | 1.5296μs | 653.7659 KOps/s | 652.8444 KOps/s | $\color{#35bf28}+0.14\%$ |
| test_tc_first_layer_nontensor | 22.1110μs | 4.8322μs | 206.9457 KOps/s | 217.2047 KOps/s | $\color{#d91a1a}-4.72\%$ |
| test_tc_second_layer_tensor | 23.8650μs | 2.8924μs | 345.7374 KOps/s | 345.9703 KOps/s | $\color{#d91a1a}-0.07\%$ |
| test_tc_second_layer_nontensor | 25.0270μs | 6.1432μs | 162.7807 KOps/s | 165.2734 KOps/s | $\color{#d91a1a}-1.51\%$ |
| test_unbind | 0.2339s | 14.2058ms | 70.3939 Ops/s | 64.1689 Ops/s | $\textbf{\color{#35bf28}+9.70\%}$ |
| test_full_like | 9.4121ms | 8.5529ms | 116.9191 Ops/s | 144.5011 Ops/s | $\textbf{\color{#d91a1a}-19.09\%}$ |
| test_zeros_like | 5.4909ms | 2.8696ms | 348.4862 Ops/s | 352.1488 Ops/s | $\color{#d91a1a}-1.04\%$ |
| test_ones_like | 6.1902ms | 3.2324ms | 309.3684 Ops/s | 314.2033 Ops/s | $\color{#d91a1a}-1.54\%$ |
| test_clone | 8.6230ms | 6.4564ms | 154.8856 Ops/s | 200.8634 Ops/s | $\textbf{\color{#d91a1a}-22.89\%}$ |
| test_squeeze | 61.3250μs | 13.3919μs | 74.6718 KOps/s | 75.7175 KOps/s | $\color{#d91a1a}-1.38\%$ |
| test_unsqueeze | 0.3158ms | 98.9033μs | 10.1109 KOps/s | 10.4323 KOps/s | $\color{#d91a1a}-3.08\%$ |
| test_split | 0.4916ms | 0.2042ms | 4.8966 KOps/s | 4.9851 KOps/s | $\color{#d91a1a}-1.78\%$ |
| test_permute | 0.3234ms | 0.2092ms | 4.7795 KOps/s | 4.8791 KOps/s | $\color{#d91a1a}-2.04\%$ |
| test_stack | 29.3939ms | 24.8833ms | 40.1877 Ops/s | 38.9019 Ops/s | $\color{#35bf28}+3.31\%$ |
| test_cat | 28.1939ms | 24.7981ms | 40.3257 Ops/s | 39.4280 Ops/s | $\color{#35bf28}+2.28\%$ |
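
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bolded entries all exceed 5% in magnitude. The sketch below reproduces a few rows under those assumptions. The function names, the `**` bold marker, and the 5% threshold are inferred from the rows for illustration and are not taken from the benchmark bot's source. Both arguments must be expressed in the same unit.

```python
def relative_change(ops, ops_head):
    """Percent change of `ops` relative to `ops_head` (same unit for both)."""
    return (ops - ops_head) / ops_head * 100.0


def format_change(ops, ops_head, bold_threshold=5.0):
    """Format the change with a sign and two decimals; bold large changes.

    The 5% threshold is an assumption inferred from the table rows.
    """
    pct = relative_change(ops, ops_head)
    text = f"{pct:+.2f}%"
    return f"**{text}**" if abs(pct) > bold_threshold else text


# Rows taken from the table: (name, Ops, Ops on Repo HEAD)
rows = [
    ("test_plain_set_nested", 48.3155, 48.3898),         # KOps/s
    ("test_membership", 1.1218, 1.4324),                 # MOps/s
    ("test_add_one[memmap_tensor0]", 209.2637, 180.9273) # KOps/s
]
for name, ops, ops_head in rows:
    print(name, format_change(ops, ops_head))
```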

mikaylagawarecki added a commit that referenced this pull request Feb 28, 2025
ghstack-source-id: 5ab78a97494ffd2fedd8537184ef3b2ff20fd2ae
Pull Request resolved: #1242
Contributor

@vmoens vmoens left a comment

LGTM thanks!

@vmoens vmoens added the bug label Feb 28, 2025
@mikaylagawarecki mikaylagawarecki marked this pull request as ready for review February 28, 2025 19:35
@mikaylagawarecki mikaylagawarecki merged commit 0c22b9a into gh/mikaylagawarecki/1/base Feb 28, 2025
48 of 49 checks passed
Labels
- bug: Something isn't working
- CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

3 participants