[BugFix] Remove monkey patching of uninit params #684

Merged: vmoens merged 4 commits into main from remove-monkey-patch on Feb 21, 2024

@vmoens (Contributor) commented Feb 21, 2024

No description provided.

@facebook-github-bot added the "CLA Signed" label Feb 21, 2024
@vmoens added the "bug" label Feb 21, 2024
@vmoens merged commit 50f4577 into main Feb 21, 2024
33 of 34 checks passed
@vmoens deleted the remove-monkey-patch branch February 21, 2024 00:14

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 126. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}12$.

Detailed results (columns, in order: Name, Max, Mean, Ops, Ops on Repo HEAD, Change)
test_plain_set_nested 42.8000μs 17.3152μs 57.7529 KOps/s 57.7273 KOps/s $\color{#35bf28}+0.04\%$
test_plain_set_stack_nested 60.6530μs 17.6028μs 56.8091 KOps/s 56.7151 KOps/s $\color{#35bf28}+0.17\%$
test_plain_set_nested_inplace 68.4970μs 19.7842μs 50.5453 KOps/s 50.1912 KOps/s $\color{#35bf28}+0.71\%$
test_plain_set_stack_nested_inplace 47.7190μs 19.6359μs 50.9270 KOps/s 50.5845 KOps/s $\color{#35bf28}+0.68\%$
test_items 14.5270μs 2.4347μs 410.7284 KOps/s 410.5515 KOps/s $\color{#35bf28}+0.04\%$
test_items_nested 1.2382ms 0.2754ms 3.6315 KOps/s 3.6411 KOps/s $\color{#d91a1a}-0.26\%$
test_items_nested_locked 0.4670ms 0.2746ms 3.6413 KOps/s 3.6612 KOps/s $\color{#d91a1a}-0.54\%$
test_items_nested_leaf 1.1243ms 0.1709ms 5.8501 KOps/s 5.9175 KOps/s $\color{#d91a1a}-1.14\%$
test_items_stack_nested 0.4523ms 0.2759ms 3.6245 KOps/s 3.6567 KOps/s $\color{#d91a1a}-0.88\%$
test_items_stack_nested_leaf 0.2952ms 0.1691ms 5.9129 KOps/s 5.8928 KOps/s $\color{#35bf28}+0.34\%$
test_items_stack_nested_locked 0.8532ms 0.2764ms 3.6175 KOps/s 3.5891 KOps/s $\color{#35bf28}+0.79\%$
test_keys 29.8860μs 4.0368μs 247.7236 KOps/s 259.1081 KOps/s $\color{#d91a1a}-4.39\%$
test_keys_nested 1.6915ms 0.1550ms 6.4535 KOps/s 6.7774 KOps/s $\color{#d91a1a}-4.78\%$
test_keys_nested_locked 3.8762ms 0.1597ms 6.2627 KOps/s 6.5674 KOps/s $\color{#d91a1a}-4.64\%$
test_keys_nested_leaf 38.5709ms 0.1394ms 7.1714 KOps/s 7.8202 KOps/s $\textbf{\color{#d91a1a}-8.30\%}$
test_keys_stack_nested 0.2501ms 0.1587ms 6.2995 KOps/s 6.7205 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_keys_stack_nested_leaf 0.2321ms 0.1386ms 7.2142 KOps/s 7.7709 KOps/s $\textbf{\color{#d91a1a}-7.16\%}$
test_keys_stack_nested_locked 0.2883ms 0.1624ms 6.1594 KOps/s 6.4762 KOps/s $\color{#d91a1a}-4.89\%$
test_values 5.1570μs 1.0360μs 965.2530 KOps/s 851.7632 KOps/s $\textbf{\color{#35bf28}+13.32\%}$
test_values_nested 0.1129ms 53.9426μs 18.5382 KOps/s 19.4708 KOps/s $\color{#d91a1a}-4.79\%$
test_values_nested_locked 98.5340μs 53.4074μs 18.7240 KOps/s 19.4998 KOps/s $\color{#d91a1a}-3.98\%$
test_values_nested_leaf 90.5690μs 48.1183μs 20.7821 KOps/s 21.5374 KOps/s $\color{#d91a1a}-3.51\%$
test_values_stack_nested 0.1057ms 54.6096μs 18.3118 KOps/s 19.2636 KOps/s $\color{#d91a1a}-4.94\%$
test_values_stack_nested_leaf 93.6340μs 47.9512μs 20.8545 KOps/s 21.7288 KOps/s $\color{#d91a1a}-4.02\%$
test_values_stack_nested_locked 0.1087ms 53.8482μs 18.5707 KOps/s 19.2117 KOps/s $\color{#d91a1a}-3.34\%$
test_membership 10.9810μs 1.3838μs 722.6731 KOps/s 756.2097 KOps/s $\color{#d91a1a}-4.43\%$
test_membership_nested 20.6580μs 3.4763μs 287.6601 KOps/s 292.9387 KOps/s $\color{#d91a1a}-1.80\%$
test_membership_nested_leaf 42.2790μs 3.4705μs 288.1423 KOps/s 293.0737 KOps/s $\color{#d91a1a}-1.68\%$
test_membership_stacked_nested 38.8130μs 3.4876μs 286.7329 KOps/s 294.9196 KOps/s $\color{#d91a1a}-2.78\%$
test_membership_stacked_nested_leaf 22.6220μs 3.4891μs 286.6086 KOps/s 293.2369 KOps/s $\color{#d91a1a}-2.26\%$
test_membership_nested_last 46.2560μs 6.6773μs 149.7607 KOps/s 152.5255 KOps/s $\color{#d91a1a}-1.81\%$
test_membership_nested_leaf_last 31.1380μs 6.7600μs 147.9284 KOps/s 153.3157 KOps/s $\color{#d91a1a}-3.51\%$
test_membership_stacked_nested_last 49.5720μs 7.1609μs 139.6467 KOps/s 156.0387 KOps/s $\textbf{\color{#d91a1a}-10.51\%}$
test_membership_stacked_nested_leaf_last 24.9070μs 7.1909μs 139.0643 KOps/s 153.4773 KOps/s $\textbf{\color{#d91a1a}-9.39\%}$
test_nested_getleaf 52.9590μs 10.8032μs 92.5650 KOps/s 93.1963 KOps/s $\color{#d91a1a}-0.68\%$
test_nested_get 42.9700μs 10.2479μs 97.5809 KOps/s 97.7088 KOps/s $\color{#d91a1a}-0.13\%$
test_stacked_getleaf 32.3600μs 10.8073μs 92.5298 KOps/s 92.2778 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_get 47.5490μs 10.1247μs 98.7680 KOps/s 96.4497 KOps/s $\color{#35bf28}+2.40\%$
test_nested_getitemleaf 50.9850μs 12.3183μs 81.1797 KOps/s 81.9273 KOps/s $\color{#d91a1a}-0.91\%$
test_nested_getitem 35.0650μs 12.1526μs 82.2867 KOps/s 86.1737 KOps/s $\color{#d91a1a}-4.51\%$
test_stacked_getitemleaf 54.7720μs 12.4700μs 80.1926 KOps/s 80.9045 KOps/s $\color{#d91a1a}-0.88\%$
test_stacked_getitem 35.3860μs 12.0081μs 83.2771 KOps/s 85.0626 KOps/s $\color{#d91a1a}-2.10\%$
test_lock_nested 0.6608ms 0.3319ms 3.0128 KOps/s 2.9845 KOps/s $\color{#35bf28}+0.95\%$
test_lock_stack_nested 0.4097ms 0.2941ms 3.4007 KOps/s 3.3386 KOps/s $\color{#35bf28}+1.86\%$
test_unlock_nested 76.7808ms 0.4129ms 2.4221 KOps/s 2.3781 KOps/s $\color{#35bf28}+1.85\%$
test_unlock_stack_nested 0.4827ms 0.3026ms 3.3045 KOps/s 3.2384 KOps/s $\color{#35bf28}+2.04\%$
test_flatten_speed 0.6987ms 0.3850ms 2.5973 KOps/s 2.7187 KOps/s $\color{#d91a1a}-4.47\%$
test_unflatten_speed 0.7694ms 0.4651ms 2.1501 KOps/s 2.1720 KOps/s $\color{#d91a1a}-1.01\%$
test_common_ops 1.1766ms 0.6918ms 1.4455 KOps/s 1.4399 KOps/s $\color{#35bf28}+0.39\%$
test_creation 48.0000μs 1.8765μs 532.9145 KOps/s 545.9559 KOps/s $\color{#d91a1a}-2.39\%$
test_creation_empty 27.0010μs 10.7522μs 93.0040 KOps/s 90.8884 KOps/s $\color{#35bf28}+2.33\%$
test_creation_nested_1 32.2510μs 13.2823μs 75.2881 KOps/s 73.6754 KOps/s $\color{#35bf28}+2.19\%$
test_creation_nested_2 62.8870μs 16.6733μs 59.9762 KOps/s 59.1413 KOps/s $\color{#35bf28}+1.41\%$
test_clone 61.6850μs 13.4561μs 74.3157 KOps/s 77.5658 KOps/s $\color{#d91a1a}-4.19\%$
test_getitem[int] 27.9220μs 11.5271μs 86.7522 KOps/s 91.9753 KOps/s $\textbf{\color{#d91a1a}-5.68\%}$
test_getitem[slice_int] 56.2560μs 22.9637μs 43.5470 KOps/s 46.6156 KOps/s $\textbf{\color{#d91a1a}-6.58\%}$
test_getitem[range] 0.1127ms 42.2725μs 23.6561 KOps/s 24.3098 KOps/s $\color{#d91a1a}-2.69\%$
test_getitem[tuple] 50.6950μs 18.6904μs 53.5035 KOps/s 56.2397 KOps/s $\color{#d91a1a}-4.87\%$
test_getitem[list] 0.1547ms 37.1209μs 26.9390 KOps/s 27.4144 KOps/s $\color{#d91a1a}-1.73\%$
test_setitem_dim[int] 58.2980μs 31.3347μs 31.9135 KOps/s 32.3243 KOps/s $\color{#d91a1a}-1.27\%$
test_setitem_dim[slice_int] 96.2790μs 58.3348μs 17.1424 KOps/s 17.2184 KOps/s $\color{#d91a1a}-0.44\%$
test_setitem_dim[range] 0.1473ms 77.1472μs 12.9622 KOps/s 13.0595 KOps/s $\color{#d91a1a}-0.75\%$
test_setitem_dim[tuple] 82.4140μs 45.7735μs 21.8467 KOps/s 21.9951 KOps/s $\color{#d91a1a}-0.67\%$
test_setitem 65.8530μs 20.3970μs 49.0268 KOps/s 49.8065 KOps/s $\color{#d91a1a}-1.57\%$
test_set 65.8130μs 19.6417μs 50.9120 KOps/s 51.1907 KOps/s $\color{#d91a1a}-0.54\%$
test_set_shared 1.9600ms 0.1408ms 7.1011 KOps/s 7.2792 KOps/s $\color{#d91a1a}-2.45\%$
test_update 87.7950μs 22.2982μs 44.8467 KOps/s 44.4211 KOps/s $\color{#35bf28}+0.96\%$
test_update_nested 0.1047ms 30.4435μs 32.8477 KOps/s 32.8384 KOps/s $\color{#35bf28}+0.03\%$
test_set_nested 75.1000μs 21.7665μs 45.9421 KOps/s 47.4423 KOps/s $\color{#d91a1a}-3.16\%$
test_set_nested_new 85.1090μs 25.2176μs 39.6548 KOps/s 39.5417 KOps/s $\color{#35bf28}+0.29\%$
test_select 0.1082ms 38.8551μs 25.7366 KOps/s 26.0195 KOps/s $\color{#d91a1a}-1.09\%$
test_select_nested 0.1901ms 59.1607μs 16.9031 KOps/s 17.2111 KOps/s $\color{#d91a1a}-1.79\%$
test_exclude_nested 0.2205ms 0.1191ms 8.3980 KOps/s 8.6600 KOps/s $\color{#d91a1a}-3.03\%$
test_empty[True] 0.5817ms 0.4110ms 2.4330 KOps/s 2.5388 KOps/s $\color{#d91a1a}-4.16\%$
test_empty[False] 8.2012μs 1.0495μs 952.8349 KOps/s 970.8608 KOps/s $\color{#d91a1a}-1.86\%$
test_unbind_speed 0.4595ms 0.2507ms 3.9886 KOps/s 4.1173 KOps/s $\color{#d91a1a}-3.12\%$
test_unbind_speed_stack0 0.4200ms 0.2354ms 4.2488 KOps/s 4.1078 KOps/s $\color{#35bf28}+3.43\%$
test_unbind_speed_stack1 0.1271s 0.6608ms 1.5132 KOps/s 1.4763 KOps/s $\color{#35bf28}+2.50\%$
test_split 0.1122s 1.6580ms 603.1318 Ops/s 617.3734 Ops/s $\color{#d91a1a}-2.31\%$
test_chunk 2.3846ms 1.4936ms 669.5141 Ops/s 695.3519 Ops/s $\color{#d91a1a}-3.72\%$
test_creation[device0] 3.8743ms 0.1057ms 9.4596 KOps/s 9.6761 KOps/s $\color{#d91a1a}-2.24\%$
test_creation_from_tensor 0.1988ms 81.1912μs 12.3166 KOps/s 12.4380 KOps/s $\color{#d91a1a}-0.98\%$
test_add_one[memmap_tensor0] 99.2350μs 5.4866μs 182.2608 KOps/s 182.0651 KOps/s $\color{#35bf28}+0.11\%$
test_contiguous[memmap_tensor0] 16.4300μs 0.6630μs 1.5082 MOps/s 1.6082 MOps/s $\textbf{\color{#d91a1a}-6.22\%}$
test_stack[memmap_tensor0] 37.1400μs 3.7469μs 266.8908 KOps/s 282.6912 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_memmaptd_index 1.1629ms 0.2486ms 4.0231 KOps/s 4.2732 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_memmaptd_index_astensor 0.6509ms 0.3116ms 3.2089 KOps/s 3.3428 KOps/s $\color{#d91a1a}-4.00\%$
test_memmaptd_index_op 1.1214ms 0.6080ms 1.6448 KOps/s 1.6735 KOps/s $\color{#d91a1a}-1.71\%$
test_serialize_model 0.2298s 0.1173s 8.5287 Ops/s 8.4973 Ops/s $\color{#35bf28}+0.37\%$
test_serialize_model_pickle 0.4617s 0.3834s 2.6082 Ops/s 2.6188 Ops/s $\color{#d91a1a}-0.40\%$
test_serialize_weights 0.1076s 0.1001s 9.9913 Ops/s 9.8581 Ops/s $\color{#35bf28}+1.35\%$
test_serialize_weights_returnearly 0.2286s 0.1333s 7.5017 Ops/s 7.9818 Ops/s $\textbf{\color{#d91a1a}-6.01\%}$
test_serialize_weights_pickle 1.0544s 0.6115s 1.6353 Ops/s 2.2859 Ops/s $\textbf{\color{#d91a1a}-28.46\%}$
test_serialize_weights_filesystem 0.1001s 91.9654ms 10.8736 Ops/s 10.2736 Ops/s $\textbf{\color{#35bf28}+5.84\%}$
test_serialize_model_filesystem 0.1174s 94.7706ms 10.5518 Ops/s 10.7024 Ops/s $\color{#d91a1a}-1.41\%$
test_reshape_pytree 56.1250μs 21.2340μs 47.0943 KOps/s 47.7694 KOps/s $\color{#d91a1a}-1.41\%$
test_reshape_td 74.8600μs 31.6179μs 31.6277 KOps/s 32.2498 KOps/s $\color{#d91a1a}-1.93\%$
test_view_pytree 61.6640μs 21.1841μs 47.2053 KOps/s 48.4042 KOps/s $\color{#d91a1a}-2.48\%$
test_view_td 0.1204s 61.7335μs 16.1987 KOps/s 16.3941 KOps/s $\color{#d91a1a}-1.19\%$
test_unbind_pytree 58.2180μs 24.8867μs 40.1821 KOps/s 41.3000 KOps/s $\color{#d91a1a}-2.71\%$
test_unbind_td 0.4672ms 36.1827μs 27.6375 KOps/s 27.6596 KOps/s $\color{#d91a1a}-0.08\%$
test_split_pytree 56.3660μs 24.6766μs 40.5242 KOps/s 41.8950 KOps/s $\color{#d91a1a}-3.27\%$
test_split_td 0.1433ms 40.6566μs 24.5963 KOps/s 25.4526 KOps/s $\color{#d91a1a}-3.36\%$
test_add_pytree 67.6160μs 29.5749μs 33.8124 KOps/s 33.3442 KOps/s $\color{#35bf28}+1.40\%$
test_add_td 0.1191ms 53.1285μs 18.8223 KOps/s 18.3127 KOps/s $\color{#35bf28}+2.78\%$
test_distributed 0.2204ms 99.4341μs 10.0569 KOps/s 10.0404 KOps/s $\color{#35bf28}+0.16\%$
test_tdmodule 0.3339ms 22.6368μs 44.1758 KOps/s 43.1490 KOps/s $\color{#35bf28}+2.38\%$
test_tdmodule_dispatch 0.1866ms 44.0339μs 22.7098 KOps/s 22.5167 KOps/s $\color{#35bf28}+0.86\%$
test_tdseq 0.3567ms 26.3462μs 37.9561 KOps/s 39.1069 KOps/s $\color{#d91a1a}-2.94\%$
test_tdseq_dispatch 0.1404ms 47.9902μs 20.8376 KOps/s 21.1723 KOps/s $\color{#d91a1a}-1.58\%$
test_instantiation_functorch 1.5868ms 1.3205ms 757.2880 Ops/s 760.6295 Ops/s $\color{#d91a1a}-0.44\%$
test_instantiation_td 1.5167ms 1.0097ms 990.3553 Ops/s 994.8980 Ops/s $\color{#d91a1a}-0.46\%$
test_exec_functorch 0.2521ms 0.1607ms 6.2228 KOps/s 6.1933 KOps/s $\color{#35bf28}+0.48\%$
test_exec_functional_call 0.2720ms 0.1516ms 6.5976 KOps/s 6.7015 KOps/s $\color{#d91a1a}-1.55\%$
test_exec_td 0.2336ms 0.1489ms 6.7172 KOps/s 6.7914 KOps/s $\color{#d91a1a}-1.09\%$
test_exec_td_decorator 0.9461ms 0.2012ms 4.9690 KOps/s 5.0831 KOps/s $\color{#d91a1a}-2.25\%$
test_vmap_mlp_speed[True-True] 0.6158ms 0.4798ms 2.0842 KOps/s 2.0950 KOps/s $\color{#d91a1a}-0.52\%$
test_vmap_mlp_speed[True-False] 0.6803ms 0.4753ms 2.1041 KOps/s 2.1023 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_mlp_speed[False-True] 0.5933ms 0.3864ms 2.5880 KOps/s 2.5663 KOps/s $\color{#35bf28}+0.85\%$
test_vmap_mlp_speed[False-False] 0.7598ms 0.3875ms 2.5804 KOps/s 2.5732 KOps/s $\color{#35bf28}+0.28\%$
test_vmap_mlp_speed_decorator[True-True] 1.1591ms 0.5250ms 1.9048 KOps/s 1.8777 KOps/s $\color{#35bf28}+1.44\%$
test_vmap_mlp_speed_decorator[True-False] 0.8639ms 0.5275ms 1.8957 KOps/s 1.8845 KOps/s $\color{#35bf28}+0.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.6853ms 0.4051ms 2.4683 KOps/s 2.4709 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_mlp_speed_decorator[False-False] 0.7020ms 0.4053ms 2.4672 KOps/s 2.4660 KOps/s $\color{#35bf28}+0.05\%$
test_to_module_speed[True] 1.8907ms 1.3800ms 724.6474 Ops/s 726.7626 Ops/s $\color{#d91a1a}-0.29\%$
test_to_module_speed[False] 2.0258ms 1.3687ms 730.5983 Ops/s 727.6083 Ops/s $\color{#35bf28}+0.41\%$
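
The Change column looks like the relative difference between Ops on this branch and Ops on Repo HEAD, and the bold entries look like the ones counted as Improved or Worsened once they move by more than 5%. Both points are inferred from the rows above, not taken from the benchmark bot's source. A minimal sketch under those assumptions (`relative_change`, `classify`, and the 5% threshold are hypothetical names and values used only for illustration):

```python
def relative_change(ops: float, head_ops: float) -> float:
    """Percent change in throughput relative to repo HEAD (assumed formula)."""
    return (ops / head_ops - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% cutoff is inferred from the table, not documented."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unflagged"


# Rows copied from the CPU table: (name, Ops, Ops on Repo HEAD), same unit per row.
rows = [
    ("test_values", 965.2530, 851.7632),
    ("test_keys", 247.7236, 259.1081),
    ("test_serialize_weights_pickle", 1.6353, 2.2859),
]

for name, ops, head_ops in rows:
    change = relative_change(ops, head_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the inputs are the rounded values shown in the table, the recomputed percentages can differ from the reported ones in the last digit.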


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 134. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}33$.

Detailed results (columns, in order: Name, Max, Mean, Ops, Ops on Repo HEAD, Change)
test_plain_set_nested 0.6553ms 14.1939μs 70.4529 KOps/s 76.3231 KOps/s $\textbf{\color{#d91a1a}-7.69\%}$
test_plain_set_stack_nested 34.9010μs 14.6382μs 68.3144 KOps/s 76.3102 KOps/s $\textbf{\color{#d91a1a}-10.48\%}$
test_plain_set_nested_inplace 40.7410μs 15.6287μs 63.9850 KOps/s 69.1690 KOps/s $\textbf{\color{#d91a1a}-7.49\%}$
test_plain_set_stack_nested_inplace 44.5010μs 15.8665μs 63.0258 KOps/s 69.2145 KOps/s $\textbf{\color{#d91a1a}-8.94\%}$
test_items 21.8400μs 4.7499μs 210.5302 KOps/s 212.8520 KOps/s $\color{#d91a1a}-1.09\%$
test_items_nested 0.4234ms 0.3519ms 2.8419 KOps/s 2.9012 KOps/s $\color{#d91a1a}-2.04\%$
test_items_nested_locked 0.4809ms 0.3503ms 2.8545 KOps/s 2.8848 KOps/s $\color{#d91a1a}-1.05\%$
test_items_nested_leaf 0.2724ms 0.2088ms 4.7900 KOps/s 4.8958 KOps/s $\color{#d91a1a}-2.16\%$
test_items_stack_nested 0.4383ms 0.3615ms 2.7663 KOps/s 2.9000 KOps/s $\color{#d91a1a}-4.61\%$
test_items_stack_nested_leaf 0.2656ms 0.2037ms 4.9096 KOps/s 4.9468 KOps/s $\color{#d91a1a}-0.75\%$
test_items_stack_nested_locked 0.4067ms 0.3491ms 2.8647 KOps/s 2.8783 KOps/s $\color{#d91a1a}-0.47\%$
test_keys 27.6510μs 4.5908μs 217.8251 KOps/s 209.3248 KOps/s $\color{#35bf28}+4.06\%$
test_keys_nested 44.2333ms 0.1042ms 9.5976 KOps/s 10.4645 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_keys_nested_locked 0.1471ms 99.1359μs 10.0872 KOps/s 10.1071 KOps/s $\color{#d91a1a}-0.20\%$
test_keys_nested_leaf 0.1130ms 78.0950μs 12.8049 KOps/s 12.7908 KOps/s $\color{#35bf28}+0.11\%$
test_keys_stack_nested 0.1455ms 95.1244μs 10.5126 KOps/s 10.6169 KOps/s $\color{#d91a1a}-0.98\%$
test_keys_stack_nested_leaf 0.1226ms 78.6323μs 12.7174 KOps/s 12.8694 KOps/s $\color{#d91a1a}-1.18\%$
test_keys_stack_nested_locked 0.1433ms 99.4605μs 10.0542 KOps/s 10.1420 KOps/s $\color{#d91a1a}-0.87\%$
test_values 6.5400μs 1.9010μs 526.0506 KOps/s 530.2916 KOps/s $\color{#d91a1a}-0.80\%$
test_values_nested 64.2510μs 45.9411μs 21.7670 KOps/s 21.9230 KOps/s $\color{#d91a1a}-0.71\%$
test_values_nested_locked 70.8810μs 48.2654μs 20.7188 KOps/s 20.9232 KOps/s $\color{#d91a1a}-0.98\%$
test_values_nested_leaf 56.0710μs 40.1863μs 24.8841 KOps/s 25.0773 KOps/s $\color{#d91a1a}-0.77\%$
test_values_stack_nested 66.5710μs 46.7821μs 21.3757 KOps/s 21.6533 KOps/s $\color{#d91a1a}-1.28\%$
test_values_stack_nested_leaf 54.2310μs 39.9248μs 25.0471 KOps/s 25.1990 KOps/s $\color{#d91a1a}-0.60\%$
test_values_stack_nested_locked 66.6310μs 48.3797μs 20.6698 KOps/s 20.7344 KOps/s $\color{#d91a1a}-0.31\%$
test_membership 5.1300μs 0.9466μs 1.0565 MOps/s 1.0588 MOps/s $\color{#d91a1a}-0.22\%$
test_membership_nested 18.4510μs 2.8687μs 348.5939 KOps/s 347.7352 KOps/s $\color{#35bf28}+0.25\%$
test_membership_nested_leaf 20.7500μs 2.8520μs 350.6310 KOps/s 343.1965 KOps/s $\color{#35bf28}+2.17\%$
test_membership_stacked_nested 25.7510μs 2.8921μs 345.7721 KOps/s 344.0497 KOps/s $\color{#35bf28}+0.50\%$
test_membership_stacked_nested_leaf 20.6500μs 2.8559μs 350.1490 KOps/s 344.2107 KOps/s $\color{#35bf28}+1.73\%$
test_membership_nested_last 23.6700μs 5.2877μs 189.1199 KOps/s 187.4423 KOps/s $\color{#35bf28}+0.89\%$
test_membership_nested_leaf_last 22.4310μs 5.2701μs 189.7514 KOps/s 187.1375 KOps/s $\color{#35bf28}+1.40\%$
test_membership_stacked_nested_last 40.1010μs 12.7033μs 78.7196 KOps/s 79.1644 KOps/s $\color{#d91a1a}-0.56\%$
test_membership_stacked_nested_leaf_last 30.2410μs 12.5865μs 79.4502 KOps/s 78.5654 KOps/s $\color{#35bf28}+1.13\%$
test_nested_getleaf 23.7000μs 8.6144μs 116.0846 KOps/s 118.4875 KOps/s $\color{#d91a1a}-2.03\%$
test_nested_get 26.6110μs 8.1346μs 122.9322 KOps/s 125.4204 KOps/s $\color{#d91a1a}-1.98\%$
test_stacked_getleaf 24.8600μs 8.6350μs 115.8083 KOps/s 118.2180 KOps/s $\color{#d91a1a}-2.04\%$
test_stacked_get 28.7610μs 8.1503μs 122.6947 KOps/s 125.0066 KOps/s $\color{#d91a1a}-1.85\%$
test_nested_getitemleaf 32.4210μs 9.9559μs 100.4431 KOps/s 101.7536 KOps/s $\color{#d91a1a}-1.29\%$
test_nested_getitem 23.6300μs 9.4775μs 105.5134 KOps/s 106.3590 KOps/s $\color{#d91a1a}-0.80\%$
test_stacked_getitemleaf 33.5100μs 9.9731μs 100.2693 KOps/s 101.2669 KOps/s $\color{#d91a1a}-0.99\%$
test_stacked_getitem 26.3400μs 9.5587μs 104.6170 KOps/s 106.8602 KOps/s $\color{#d91a1a}-2.10\%$
test_lock_nested 2.0115ms 0.3625ms 2.7589 KOps/s 2.7701 KOps/s $\color{#d91a1a}-0.40\%$
test_lock_stack_nested 0.3746ms 0.3079ms 3.2483 KOps/s 3.2871 KOps/s $\color{#d91a1a}-1.18\%$
test_unlock_nested 0.7419ms 0.3611ms 2.7692 KOps/s 2.8031 KOps/s $\color{#d91a1a}-1.21\%$
test_unlock_stack_nested 0.3819ms 0.3177ms 3.1474 KOps/s 3.1735 KOps/s $\color{#d91a1a}-0.82\%$
test_flatten_speed 0.4775ms 0.2625ms 3.8102 KOps/s 3.8304 KOps/s $\color{#d91a1a}-0.53\%$
test_unflatten_speed 0.4270ms 0.3585ms 2.7896 KOps/s 2.7850 KOps/s $\color{#35bf28}+0.17\%$
test_common_ops 1.1209ms 0.6438ms 1.5533 KOps/s 1.6926 KOps/s $\textbf{\color{#d91a1a}-8.23\%}$
test_creation 14.6100μs 1.5385μs 649.9656 KOps/s 641.1243 KOps/s $\color{#35bf28}+1.38\%$
test_creation_empty 22.5600μs 9.2101μs 108.5768 KOps/s 135.9560 KOps/s $\textbf{\color{#d91a1a}-20.14\%}$
test_creation_nested_1 38.4900μs 11.0941μs 90.1379 KOps/s 111.3348 KOps/s $\textbf{\color{#d91a1a}-19.04\%}$
test_creation_nested_2 65.5910μs 13.6586μs 73.2139 KOps/s 87.7004 KOps/s $\textbf{\color{#d91a1a}-16.52\%}$
test_clone 65.7810μs 14.9710μs 66.7960 KOps/s 71.9174 KOps/s $\textbf{\color{#d91a1a}-7.12\%}$
test_getitem[int] 26.2400μs 10.9559μs 91.2747 KOps/s 90.8775 KOps/s $\color{#35bf28}+0.44\%$
test_getitem[slice_int] 39.7210μs 21.5345μs 46.4372 KOps/s 46.5072 KOps/s $\color{#d91a1a}-0.15\%$
test_getitem[range] 69.1210μs 51.1650μs 19.5446 KOps/s 19.4141 KOps/s $\color{#35bf28}+0.67\%$
test_getitem[tuple] 44.6110μs 19.1297μs 52.2747 KOps/s 53.2520 KOps/s $\color{#d91a1a}-1.84\%$
test_getitem[list] 0.1452ms 37.8250μs 26.4375 KOps/s 26.8116 KOps/s $\color{#d91a1a}-1.40\%$
test_setitem_dim[int] 47.2210μs 31.3420μs 31.9061 KOps/s 36.7552 KOps/s $\textbf{\color{#d91a1a}-13.19\%}$
test_setitem_dim[slice_int] 72.1620μs 52.4146μs 19.0787 KOps/s 20.6757 KOps/s $\textbf{\color{#d91a1a}-7.72\%}$
test_setitem_dim[range] 0.1136ms 73.9429μs 13.5239 KOps/s 14.5952 KOps/s $\textbf{\color{#d91a1a}-7.34\%}$
test_setitem_dim[tuple] 64.3420μs 45.3926μs 22.0300 KOps/s 24.6567 KOps/s $\textbf{\color{#d91a1a}-10.65\%}$
test_setitem 57.4310μs 20.5239μs 48.7236 KOps/s 54.9663 KOps/s $\textbf{\color{#d91a1a}-11.36\%}$
test_set 60.6910μs 19.6500μs 50.8906 KOps/s 55.3277 KOps/s $\textbf{\color{#d91a1a}-8.02\%}$
test_set_shared 0.1252s 0.1345ms 7.4365 KOps/s 9.6146 KOps/s $\textbf{\color{#d91a1a}-22.65\%}$
test_update 61.3510μs 22.6464μs 44.1572 KOps/s 50.1851 KOps/s $\textbf{\color{#d91a1a}-12.01\%}$
test_update_nested 87.7810μs 29.3345μs 34.0896 KOps/s 38.2057 KOps/s $\textbf{\color{#d91a1a}-10.77\%}$
test_set_nested 68.4110μs 21.2993μs 46.9498 KOps/s 53.4966 KOps/s $\textbf{\color{#d91a1a}-12.24\%}$
test_set_nested_new 64.1010μs 23.8011μs 42.0148 KOps/s 46.9895 KOps/s $\textbf{\color{#d91a1a}-10.59\%}$
test_select 80.5010μs 36.6708μs 27.2697 KOps/s 28.7067 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_select_nested 87.7710μs 54.0544μs 18.4999 KOps/s 18.9105 KOps/s $\color{#d91a1a}-2.17\%$
test_exclude_nested 0.6317ms 0.1169ms 8.5518 KOps/s 8.4554 KOps/s $\color{#35bf28}+1.14\%$
test_empty[True] 0.4684ms 0.3961ms 2.5244 KOps/s 2.5584 KOps/s $\color{#d91a1a}-1.33\%$
test_empty[False] 2.5481μs 0.8731μs 1.1454 MOps/s 1.1231 MOps/s $\color{#35bf28}+1.99\%$
test_to 77.6520μs 60.5912μs 16.5040 KOps/s 18.0190 KOps/s $\textbf{\color{#d91a1a}-8.41\%}$
test_to_nonblocking 64.4510μs 37.3201μs 26.7952 KOps/s 27.7225 KOps/s $\color{#d91a1a}-3.34\%$
test_unbind_speed 0.3281ms 0.2759ms 3.6248 KOps/s 3.7003 KOps/s $\color{#d91a1a}-2.04\%$
test_unbind_speed_stack0 0.3200ms 0.2670ms 3.7447 KOps/s 3.7862 KOps/s $\color{#d91a1a}-1.09\%$
test_unbind_speed_stack1 0.1265s 0.7681ms 1.3019 KOps/s 1.3111 KOps/s $\color{#d91a1a}-0.70\%$
test_split 1.6552ms 1.5563ms 642.5522 Ops/s 641.3605 Ops/s $\color{#35bf28}+0.19\%$
test_chunk 1.6527ms 1.5459ms 646.8650 Ops/s 641.3741 Ops/s $\color{#35bf28}+0.86\%$
test_creation[device0] 0.1319ms 74.5455μs 13.4146 KOps/s 13.6061 KOps/s $\color{#d91a1a}-1.41\%$
test_creation_from_tensor 0.1317ms 55.1116μs 18.1450 KOps/s 17.2929 KOps/s $\color{#35bf28}+4.93\%$
test_add_one[memmap_tensor0] 69.6310μs 7.6224μs 131.1927 KOps/s 140.2692 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_contiguous[memmap_tensor0] 25.2410μs 0.6533μs 1.5307 MOps/s 1.5263 MOps/s $\color{#35bf28}+0.29\%$
test_stack[memmap_tensor0] 29.1210μs 4.6946μs 213.0116 KOps/s 215.0761 KOps/s $\color{#d91a1a}-0.96\%$
test_memmaptd_index 1.0569ms 0.2712ms 3.6872 KOps/s 3.7805 KOps/s $\color{#d91a1a}-2.47\%$
test_memmaptd_index_astensor 0.5865ms 0.3285ms 3.0442 KOps/s 3.0938 KOps/s $\color{#d91a1a}-1.61\%$
test_memmaptd_index_op 0.9973ms 0.6603ms 1.5145 KOps/s 1.6549 KOps/s $\textbf{\color{#d91a1a}-8.49\%}$
test_serialize_model 0.2236s 0.1037s 9.6454 Ops/s 9.0441 Ops/s $\textbf{\color{#35bf28}+6.65\%}$
test_serialize_model_pickle 1.3492s 1.2356s 0.8093 Ops/s 0.8084 Ops/s $\color{#35bf28}+0.11\%$
test_serialize_weights 0.2197s 0.1015s 9.8552 Ops/s 10.8553 Ops/s $\textbf{\color{#d91a1a}-9.21\%}$
test_serialize_weights_returnearly 0.3573s 80.2534ms 12.4605 Ops/s 11.3166 Ops/s $\textbf{\color{#35bf28}+10.11\%}$
test_serialize_weights_pickle 1.3478s 1.2553s 0.7966 Ops/s 0.8093 Ops/s $\color{#d91a1a}-1.56\%$
test_reshape_pytree 0.1556ms 26.1151μs 38.2920 KOps/s 39.1960 KOps/s $\color{#d91a1a}-2.31\%$
test_reshape_td 55.9710μs 32.4259μs 30.8395 KOps/s 31.5306 KOps/s $\color{#d91a1a}-2.19\%$
test_view_pytree 59.9410μs 25.6026μs 39.0586 KOps/s 40.0114 KOps/s $\color{#d91a1a}-2.38\%$
test_view_td 0.1401s 59.7133μs 16.7467 KOps/s 21.3164 KOps/s $\textbf{\color{#d91a1a}-21.44\%}$
test_unbind_pytree 53.9510μs 31.1615μs 32.0909 KOps/s 32.5694 KOps/s $\color{#d91a1a}-1.47\%$
test_unbind_td 0.1153ms 41.2892μs 24.2194 KOps/s 24.6270 KOps/s $\color{#d91a1a}-1.66\%$
test_split_pytree 79.7410μs 29.8628μs 33.4865 KOps/s 34.8434 KOps/s $\color{#d91a1a}-3.89\%$
test_split_td 0.1097ms 40.0424μs 24.9735 KOps/s 25.1130 KOps/s $\color{#d91a1a}-0.56\%$
test_add_pytree 71.7410μs 37.9991μs 26.3164 KOps/s 27.9820 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_add_td 85.0610μs 57.6949μs 17.3326 KOps/s 19.7383 KOps/s $\textbf{\color{#d91a1a}-12.19\%}$
test_distributed 2.1785ms 82.2643μs 12.1559 KOps/s 11.0182 KOps/s $\textbf{\color{#35bf28}+10.33\%}$
test_tdmodule 82.4420μs 18.9151μs 52.8679 KOps/s 57.3429 KOps/s $\textbf{\color{#d91a1a}-7.80\%}$
test_tdmodule_dispatch 0.1538ms 38.7580μs 25.8011 KOps/s 27.5440 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_tdseq 44.4310μs 21.8095μs 45.8515 KOps/s 48.5931 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_tdseq_dispatch 58.0810μs 41.5137μs 24.0884 KOps/s 25.9578 KOps/s $\textbf{\color{#d91a1a}-7.20\%}$
test_instantiation_functorch 2.0741ms 1.6874ms 592.6363 Ops/s 589.6565 Ops/s $\color{#35bf28}+0.51\%$
test_instantiation_td 1.7042ms 1.1730ms 852.4796 Ops/s 854.9683 Ops/s $\color{#d91a1a}-0.29\%$
test_exec_functorch 0.2199ms 0.1647ms 6.0734 KOps/s 6.2025 KOps/s $\color{#d91a1a}-2.08\%$
test_exec_functional_call 0.2255ms 0.1645ms 6.0792 KOps/s 6.2023 KOps/s $\color{#d91a1a}-1.98\%$
test_exec_td 0.2012ms 0.1556ms 6.4261 KOps/s 6.5546 KOps/s $\color{#d91a1a}-1.96\%$
test_exec_td_decorator 0.3273ms 0.2034ms 4.9153 KOps/s 5.0968 KOps/s $\color{#d91a1a}-3.56\%$
test_vmap_mlp_speed[True-True] 0.7426ms 0.6255ms 1.5986 KOps/s 1.6269 KOps/s $\color{#d91a1a}-1.74\%$
test_vmap_mlp_speed[True-False] 0.6822ms 0.6218ms 1.6082 KOps/s 1.6328 KOps/s $\color{#d91a1a}-1.50\%$
test_vmap_mlp_speed[False-True] 0.6293ms 0.5472ms 1.8274 KOps/s 1.8193 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_mlp_speed[False-False] 0.6214ms 0.5496ms 1.8194 KOps/s 1.8436 KOps/s $\color{#d91a1a}-1.31\%$
test_vmap_mlp_speed_decorator[True-True] 0.7459ms 0.6688ms 1.4953 KOps/s 1.5250 KOps/s $\color{#d91a1a}-1.94\%$
test_vmap_mlp_speed_decorator[True-False] 0.9387ms 0.6674ms 1.4982 KOps/s 1.5301 KOps/s $\color{#d91a1a}-2.08\%$
test_vmap_mlp_speed_decorator[False-True] 0.6792ms 0.5659ms 1.7672 KOps/s 1.7898 KOps/s $\color{#d91a1a}-1.26\%$
test_vmap_mlp_speed_decorator[False-False] 0.9002ms 0.5654ms 1.7685 KOps/s 1.7872 KOps/s $\color{#d91a1a}-1.04\%$
test_vmap_transformer_speed[True-True] 8.6791ms 8.3646ms 119.5509 Ops/s 121.1235 Ops/s $\color{#d91a1a}-1.30\%$
test_vmap_transformer_speed[True-False] 8.5335ms 8.3436ms 119.8525 Ops/s 121.6141 Ops/s $\color{#d91a1a}-1.45\%$
test_vmap_transformer_speed[False-True] 8.4196ms 8.2770ms 120.8167 Ops/s 122.0625 Ops/s $\color{#d91a1a}-1.02\%$
test_vmap_transformer_speed[False-False] 8.4901ms 8.2711ms 120.9034 Ops/s 122.4093 Ops/s $\color{#d91a1a}-1.23\%$
test_vmap_transformer_speed_decorator[True-True] 20.3616ms 19.9377ms 50.1561 Ops/s 50.7013 Ops/s $\color{#d91a1a}-1.08\%$
test_vmap_transformer_speed_decorator[True-False] 20.5618ms 19.9409ms 50.1481 Ops/s 50.8442 Ops/s $\color{#d91a1a}-1.37\%$
test_vmap_transformer_speed_decorator[False-True] 20.0106ms 19.5863ms 51.0561 Ops/s 51.6736 Ops/s $\color{#d91a1a}-1.20\%$
test_vmap_transformer_speed_decorator[False-False] 19.8766ms 19.4672ms 51.3684 Ops/s 51.7922 Ops/s $\color{#d91a1a}-0.82\%$
test_to_module_speed[True] 2.9275ms 1.2783ms 782.2791 Ops/s 787.8098 Ops/s $\color{#d91a1a}-0.70\%$
test_to_module_speed[False] 1.3711ms 1.2292ms 813.5366 Ops/s 799.0761 Ops/s $\color{#35bf28}+1.81\%$
