[BugFix] Fix sliced PRB when only traj is provided #2228

Merged on Jun 14, 2024 (4 commits)

vmoens (Contributor) commented Jun 14, 2024

As noted in #2208, we register the cursor when computing the start/stop signals from the "done" signals, but not when trajectories are provided. This PR fixes that.
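
To illustrate why the cursor matters, here is a minimal pure-Python sketch. It is not TorchRL's implementation: the helper `trajectory_stops` and the parameter `last_written` (the index of the most recently written element of a full circular buffer) are hypothetical names chosen for illustration. Trajectory ends are derived from changes in the trajectory id, with wrap-around, and the cursor optionally forces one extra end so that a slice cannot run from the newest element into the oldest one.

```python
from typing import List, Optional, Sequence


def trajectory_stops(
    traj_ids: Sequence[int], last_written: Optional[int] = None
) -> List[int]:
    """Return the indices where a trajectory ends in a full circular buffer.

    An element ends a trajectory if the next element (wrapping around)
    carries a different trajectory id. `last_written` is a hypothetical
    stand-in for the write cursor: when given, that index is always
    treated as an end, even if the ids on both sides match.
    """
    n = len(traj_ids)
    is_end = [traj_ids[i] != traj_ids[(i + 1) % n] for i in range(n)]
    if last_written is not None:
        is_end[last_written] = True
    return [i for i, flag in enumerate(is_end) if flag]


# Same trajectory ids as in the test script posted in this thread.
traj_ids = [3, 0, 1, 1, 1, 2, 2, 2, 3, 3]

stops_without_cursor = trajectory_stops(traj_ids)
stops_with_cursor = trajectory_stops(traj_ids, last_written=9)

print(stops_without_cursor)  # [0, 1, 4, 7]
print(stops_with_cursor)     # [0, 1, 4, 7, 9]
```

In this sketch, index 9 is only reported as a stop when the cursor is taken into account, because the ids at indices 9 and 0 happen to be equal. Without it, a slice covering indices 8, 9, 0 would look like a single trajectory.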

pytorch-bot bot commented Jun 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2228

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 3b32f20 with merge base ce92e35:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jun 14, 2024
@vmoens linked an issue Jun 14, 2024 that may be closed by this pull request

vmoens (Contributor, Author) commented Jun 14, 2024

@wertyuilife2 you can test these changes with this script:

import torch
from tensordict import TensorDict
from torchrl.data.replay_buffers import LazyTensorStorage, ReplayBuffer
from torchrl.data.replay_buffers.samplers import PrioritizedSliceSampler


def test_sampler():
    torch.manual_seed(0)

    # Only traj_key is given (end_key stays commented out), which is the
    # code path this PR fixes.
    sampler = PrioritizedSliceSampler(
        max_capacity=10,
        num_slices=2,
        traj_key="trajectory",
        # end_key="done",
        strict_length=True,
        alpha=1.0,
        beta=1.0,
    )
    trajectory = torch.tensor([3, 0, 1, 1, 1, 2, 2, 2, 3, 3])
    done = torch.tensor([True, True, False, False, True, False, False, True, False, True])
    td = TensorDict({"trajectory": trajectory, "steps": torch.arange(10), "done": done}, [10])

    rb = ReplayBuffer(
        sampler=sampler,
        storage=LazyTensorStorage(10, device=torch.device("cpu")),
        batch_size=6,
    )

    rb.extend(td)
    for i in range(10):
        # preceding_stop_idx in sample(): [1 2 3 5 6 8 9]
        s, info = rb.sample(return_info=True)
        traj = s["trajectory"]
        print(f"[loop {i}] sampled trajectory: {traj}")
        print(f"[loop {i}] index {info['index']}")
        # Two slices per batch, so at most two distinct trajectories.
        assert len(traj.unique()) <= 2


test_sampler()
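
The assertion in the script only bounds the number of distinct trajectories in the whole batch. A stricter check, sketched below in plain Python, verifies that each individual slice stays within one trajectory. It assumes the sampled batch is laid out as `num_slices` contiguous slices of equal length; the helper names are hypothetical and not part of TorchRL.

```python
from typing import List, Sequence


def split_into_slices(values: Sequence[int], num_slices: int) -> List[List[int]]:
    """Cut a flat batch into `num_slices` contiguous, equally sized slices."""
    if len(values) % num_slices != 0:
        raise ValueError("batch size must be divisible by num_slices")
    step = len(values) // num_slices
    return [list(values[i * step:(i + 1) * step]) for i in range(num_slices)]


def each_slice_has_one_trajectory(traj_ids: Sequence[int], num_slices: int) -> bool:
    """True if no slice mixes elements from different trajectories."""
    slices = split_into_slices(traj_ids, num_slices)
    return all(len(set(s)) == 1 for s in slices)


# Two slices of length 3, each from a single trajectory.
good_batch = [1, 1, 1, 2, 2, 2]
# Only two distinct ids overall, but the first slice crosses a boundary.
bad_batch = [1, 1, 2, 2, 2, 2]

good_result = each_slice_has_one_trajectory(good_batch, num_slices=2)
bad_result = each_slice_has_one_trajectory(bad_batch, num_slices=2)
print(good_result)  # True
print(bad_result)   # False
```

Inside the loop of the script, the same check could be applied with `each_slice_has_one_trajectory(s["trajectory"].tolist(), 2)`, under the layout assumption stated above.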

@vmoens added the bug label Jun 14, 2024

github-actions bot commented Jun 14, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 4. Worsened: 5.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1214s 60.1703ms 16.6195 Ops/s 17.3589 Ops/s $\color{#d91a1a}-4.26\%$
test_sync 39.6126ms 31.0648ms 32.1908 Ops/s 29.8519 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_async 59.3218ms 29.5056ms 33.8919 Ops/s 35.8444 Ops/s $\textbf{\color{#d91a1a}-5.45\%}$
test_simple 0.3832s 0.3818s 2.6189 Ops/s 2.5735 Ops/s $\color{#35bf28}+1.76\%$
test_transformed 0.5430s 0.5411s 1.8481 Ops/s 1.8154 Ops/s $\color{#35bf28}+1.80\%$
test_serial 1.3583s 1.2940s 0.7728 Ops/s 0.7673 Ops/s $\color{#35bf28}+0.72\%$
test_parallel 1.1462s 1.0983s 0.9105 Ops/s 0.9113 Ops/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-True-True-True-True] 0.2733ms 21.4307μs 46.6620 KOps/s 47.1329 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[True-True-True-True-False] 56.9750μs 13.0565μs 76.5901 KOps/s 77.9062 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[True-True-True-False-True] 55.6440μs 12.7408μs 78.4881 KOps/s 79.5509 KOps/s $\color{#d91a1a}-1.34\%$
test_step_mdp_speed[True-True-True-False-False] 41.9290μs 7.7467μs 129.0875 KOps/s 129.1237 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[True-True-False-True-True] 82.5240μs 22.7808μs 43.8965 KOps/s 44.5425 KOps/s $\color{#d91a1a}-1.45\%$
test_step_mdp_speed[True-True-False-True-False] 51.7070μs 14.3741μs 69.5696 KOps/s 71.0258 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[True-True-False-False-True] 56.1460μs 13.9968μs 71.4449 KOps/s 72.2856 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-True-False-False-False] 31.8990μs 8.9148μs 112.1729 KOps/s 113.8844 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[True-False-True-True-True] 50.4850μs 24.2808μs 41.1848 KOps/s 41.8604 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[True-False-True-True-False] 52.7980μs 15.5681μs 64.2341 KOps/s 65.1378 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[True-False-True-False-True] 67.4960μs 13.8787μs 72.0531 KOps/s 72.8382 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-False-True-False-False] 29.2140μs 8.9234μs 112.0655 KOps/s 112.8096 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[True-False-False-True-True] 70.5920μs 25.2647μs 39.5809 KOps/s 37.2062 KOps/s $\textbf{\color{#35bf28}+6.38\%}$
test_step_mdp_speed[True-False-False-True-False] 59.5610μs 16.8319μs 59.4110 KOps/s 57.4500 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[True-False-False-False-True] 44.3030μs 15.2154μs 65.7229 KOps/s 67.4107 KOps/s $\color{#d91a1a}-2.50\%$
test_step_mdp_speed[True-False-False-False-False] 57.3360μs 10.1357μs 98.6610 KOps/s 100.7049 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-True-True-True-True] 66.3240μs 24.2766μs 41.1919 KOps/s 42.4095 KOps/s $\color{#d91a1a}-2.87\%$
test_step_mdp_speed[False-True-True-True-False] 42.8710μs 15.8155μs 63.2291 KOps/s 65.2822 KOps/s $\color{#d91a1a}-3.15\%$
test_step_mdp_speed[False-True-True-False-True] 64.0300μs 16.1909μs 61.7631 KOps/s 62.0785 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-True-False-False] 37.7700μs 10.2336μs 97.7170 KOps/s 99.4528 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[False-True-False-True-True] 62.6380μs 25.1998μs 39.6828 KOps/s 39.8787 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-True-False-True-False] 55.7160μs 16.8851μs 59.2240 KOps/s 60.3062 KOps/s $\color{#d91a1a}-1.79\%$
test_step_mdp_speed[False-True-False-False-True] 64.3390μs 17.2049μs 58.1230 KOps/s 57.8335 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[False-True-False-False-False] 36.8190μs 11.3013μs 88.4850 KOps/s 89.4542 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[False-False-True-True-True] 75.2610μs 26.7418μs 37.3946 KOps/s 38.2959 KOps/s $\color{#d91a1a}-2.35\%$
test_step_mdp_speed[False-False-True-True-False] 50.2140μs 18.0340μs 55.4509 KOps/s 56.4028 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[False-False-True-False-True] 62.8480μs 17.3412μs 57.6661 KOps/s 57.9856 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[False-False-True-False-False] 50.5340μs 11.3665μs 87.9775 KOps/s 89.3968 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[False-False-False-True-True] 39.6840μs 28.0075μs 35.7048 KOps/s 36.5903 KOps/s $\color{#d91a1a}-2.42\%$
test_step_mdp_speed[False-False-False-True-False] 78.4370μs 19.0344μs 52.5365 KOps/s 53.2302 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-False-False-True] 43.6820μs 18.3166μs 54.5954 KOps/s 55.0786 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[False-False-False-False-False] 69.7310μs 12.3979μs 80.6590 KOps/s 81.9575 KOps/s $\color{#d91a1a}-1.58\%$
test_values[generalized_advantage_estimate-True-True] 9.7801ms 9.5291ms 104.9418 Ops/s 102.4539 Ops/s $\color{#35bf28}+2.43\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.4097ms 33.3641ms 29.9723 Ops/s 30.0245 Ops/s $\color{#d91a1a}-0.17\%$
test_values[td0_return_estimate-False-False] 0.2569ms 0.1863ms 5.3677 KOps/s 5.4358 KOps/s $\color{#d91a1a}-1.25\%$
test_values[td1_return_estimate-False-False] 26.8024ms 23.9430ms 41.7658 Ops/s 41.8903 Ops/s $\color{#d91a1a}-0.30\%$
test_values[vec_td1_return_estimate-False-False] 46.8845ms 33.9116ms 29.4884 Ops/s 30.0337 Ops/s $\color{#d91a1a}-1.82\%$
test_values[td_lambda_return_estimate-True-False] 37.2361ms 34.0836ms 29.3396 Ops/s 29.0249 Ops/s $\color{#35bf28}+1.08\%$
test_values[vec_td_lambda_return_estimate-True-False] 34.7714ms 33.4362ms 29.9077 Ops/s 29.9880 Ops/s $\color{#d91a1a}-0.27\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.4910ms 8.3291ms 120.0607 Ops/s 120.5390 Ops/s $\color{#d91a1a}-0.40\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3474ms 1.9809ms 504.8132 Ops/s 533.3520 Ops/s $\textbf{\color{#d91a1a}-5.35\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5541ms 0.3613ms 2.7679 KOps/s 2.7459 KOps/s $\color{#35bf28}+0.80\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 39.7421ms 38.8931ms 25.7115 Ops/s 22.5060 Ops/s $\textbf{\color{#35bf28}+14.24\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7255ms 3.0690ms 325.8385 Ops/s 324.6579 Ops/s $\color{#35bf28}+0.36\%$
test_dqn_speed 6.6366ms 1.3689ms 730.5007 Ops/s 740.9467 Ops/s $\color{#d91a1a}-1.41\%$
test_ddpg_speed 3.6172ms 2.9151ms 343.0432 Ops/s 345.4041 Ops/s $\color{#d91a1a}-0.68\%$
test_sac_speed 9.3698ms 8.6501ms 115.6058 Ops/s 112.2146 Ops/s $\color{#35bf28}+3.02\%$
test_redq_speed 14.9834ms 13.5921ms 73.5719 Ops/s 72.2271 Ops/s $\color{#35bf28}+1.86\%$
test_redq_deprec_speed 15.1957ms 14.0639ms 71.1041 Ops/s 68.3921 Ops/s $\color{#35bf28}+3.97\%$
test_td3_speed 17.0084ms 8.6266ms 115.9201 Ops/s 115.5335 Ops/s $\color{#35bf28}+0.33\%$
test_cql_speed 38.7641ms 37.5071ms 26.6616 Ops/s 27.0348 Ops/s $\color{#d91a1a}-1.38\%$
test_a2c_speed 8.1469ms 7.7367ms 129.2549 Ops/s 129.3456 Ops/s $\color{#d91a1a}-0.07\%$
test_ppo_speed 9.3057ms 7.9441ms 125.8802 Ops/s 124.0545 Ops/s $\color{#35bf28}+1.47\%$
test_reinforce_speed 7.7909ms 7.0035ms 142.7863 Ops/s 147.1564 Ops/s $\color{#d91a1a}-2.97\%$
test_iql_speed 35.4054ms 33.7357ms 29.6421 Ops/s 29.5901 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.6637ms 3.7823ms 264.3887 Ops/s 271.6780 Ops/s $\color{#d91a1a}-2.68\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9064ms 0.5138ms 1.9464 KOps/s 1.9562 KOps/s $\color{#d91a1a}-0.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7444ms 0.4873ms 2.0523 KOps/s 2.0608 KOps/s $\color{#d91a1a}-0.41\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.1686ms 3.7536ms 266.4115 Ops/s 267.1684 Ops/s $\color{#d91a1a}-0.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9906ms 0.5082ms 1.9679 KOps/s 1.9777 KOps/s $\color{#d91a1a}-0.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6759ms 0.4763ms 2.0996 KOps/s 2.0819 KOps/s $\color{#35bf28}+0.85\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3629ms 1.7604ms 568.0411 Ops/s 576.1651 Ops/s $\color{#d91a1a}-1.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1521ms 1.6647ms 600.7131 Ops/s 608.2130 Ops/s $\color{#d91a1a}-1.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5577ms 3.8168ms 261.9984 Ops/s 262.6189 Ops/s $\color{#d91a1a}-0.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1145s 0.7329ms 1.3645 KOps/s 1.5788 KOps/s $\textbf{\color{#d91a1a}-13.57\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8566ms 0.6055ms 1.6516 KOps/s 1.6649 KOps/s $\color{#d91a1a}-0.80\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.9171ms 3.7161ms 269.0972 Ops/s 267.4291 Ops/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0032ms 0.5179ms 1.9308 KOps/s 1.9602 KOps/s $\color{#d91a1a}-1.50\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6378ms 0.4919ms 2.0327 KOps/s 2.0835 KOps/s $\color{#d91a1a}-2.44\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.0866ms 3.6994ms 270.3138 Ops/s 274.2737 Ops/s $\color{#d91a1a}-1.44\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6812ms 0.5132ms 1.9485 KOps/s 1.9646 KOps/s $\color{#d91a1a}-0.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.7976ms 0.4939ms 2.0247 KOps/s 2.0609 KOps/s $\color{#d91a1a}-1.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0041ms 3.8775ms 257.9013 Ops/s 263.9663 Ops/s $\color{#d91a1a}-2.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2192ms 0.6315ms 1.5834 KOps/s 1.5818 KOps/s $\color{#35bf28}+0.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7742ms 0.6063ms 1.6493 KOps/s 1.6675 KOps/s $\color{#d91a1a}-1.09\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1289s 6.2492ms 160.0207 Ops/s 123.4765 Ops/s $\textbf{\color{#35bf28}+29.60\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 16.5438ms 12.6264ms 79.1990 Ops/s 79.3471 Ops/s $\color{#d91a1a}-0.19\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0895ms 1.0439ms 957.9876 Ops/s 952.4178 Ops/s $\color{#35bf28}+0.58\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1086s 7.9028ms 126.5377 Ops/s 169.8698 Ops/s $\textbf{\color{#d91a1a}-25.51\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 15.4336ms 12.6453ms 79.0808 Ops/s 79.5932 Ops/s $\color{#d91a1a}-0.64\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.9120ms 1.1137ms 897.9343 Ops/s 939.9233 Ops/s $\color{#d91a1a}-4.47\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1095s 6.0116ms 166.3453 Ops/s 160.2534 Ops/s $\color{#35bf28}+3.80\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.1122s 14.7891ms 67.6175 Ops/s 78.4570 Ops/s $\textbf{\color{#d91a1a}-13.82\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.5547ms 1.2495ms 800.3129 Ops/s 796.2070 Ops/s $\color{#35bf28}+0.52\%$
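
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD. This reading is inferred from the numbers in the table, not from the benchmark tool's documentation, and the function name below is hypothetical. A small sketch that recomputes two rows:

```python
def relative_change_percent(ops: float, ops_on_head: float) -> float:
    """Relative change of `ops` versus `ops_on_head`, in percent."""
    return (ops - ops_on_head) / ops_on_head * 100.0


# Values copied from the test_single row of the CPU table.
single = relative_change_percent(16.6195, 17.3589)
# Values copied from the first test_rb_populate row of the CPU table.
populate = relative_change_percent(160.0207, 123.4765)

print(f"{single:+.2f}%")    # -4.26%
print(f"{populate:+.2f}%")  # +29.60%
```

Because the table shows rounded throughput values, a recomputed figure may differ from the reported one in the last digit for some rows.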

github-actions bot commented Jun 14, 2024

✔ OK: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 2. Worsened: 0.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1199s 0.1194s 8.3747 Ops/s 8.4426 Ops/s $\color{#d91a1a}-0.80\%$
test_sync 0.1032s 0.1017s 9.8347 Ops/s 9.7769 Ops/s $\color{#35bf28}+0.59\%$
test_async 0.2009s 0.1009s 9.9069 Ops/s 9.9781 Ops/s $\color{#d91a1a}-0.71\%$
test_single_pixels 0.1312s 0.1301s 7.6848 Ops/s 7.8406 Ops/s $\color{#d91a1a}-1.99\%$
test_sync_pixels 89.1839ms 84.2915ms 11.8636 Ops/s 11.9472 Ops/s $\color{#d91a1a}-0.70\%$
test_async_pixels 0.1623s 68.8892ms 14.5161 Ops/s 14.4501 Ops/s $\color{#35bf28}+0.46\%$
test_simple 0.8243s 0.8201s 1.2194 Ops/s 1.2213 Ops/s $\color{#d91a1a}-0.15\%$
test_transformed 1.1097s 1.0891s 0.9182 Ops/s 0.9291 Ops/s $\color{#d91a1a}-1.17\%$
test_serial 2.5894s 2.5451s 0.3929 Ops/s 0.4017 Ops/s $\color{#d91a1a}-2.19\%$
test_parallel 2.4453s 2.3773s 0.4207 Ops/s 0.4266 Ops/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[True-True-True-True-True] 0.1616ms 33.5922μs 29.7688 KOps/s 29.0237 KOps/s $\color{#35bf28}+2.57\%$
test_step_mdp_speed[True-True-True-True-False] 0.1748ms 20.3224μs 49.2068 KOps/s 49.1555 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-True-True-False-True] 0.1068ms 19.6354μs 50.9283 KOps/s 50.9592 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-True-True-False-False] 32.1410μs 11.5546μs 86.5457 KOps/s 84.9201 KOps/s $\color{#35bf28}+1.91\%$
test_step_mdp_speed[True-True-False-True-True] 66.3520μs 35.4947μs 28.1733 KOps/s 27.6624 KOps/s $\color{#35bf28}+1.85\%$
test_step_mdp_speed[True-True-False-True-False] 0.1205ms 22.1964μs 45.0523 KOps/s 44.8868 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-True-False-False-True] 40.0010μs 20.9495μs 47.7339 KOps/s 45.8065 KOps/s $\color{#35bf28}+4.21\%$
test_step_mdp_speed[True-True-False-False-False] 32.8110μs 13.5915μs 73.5755 KOps/s 73.6018 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-False-True-True-True] 64.2920μs 38.1674μs 26.2004 KOps/s 26.5415 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[True-False-True-True-False] 48.9110μs 24.1265μs 41.4483 KOps/s 41.1267 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-False-True-False-True] 39.1410μs 21.0832μs 47.4311 KOps/s 47.4089 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-False-True-False-False] 32.1710μs 13.5326μs 73.8954 KOps/s 74.2783 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[True-False-False-True-True] 62.2620μs 39.2938μs 25.4493 KOps/s 25.2311 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[True-False-False-True-False] 0.2151ms 25.6889μs 38.9274 KOps/s 38.0998 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[True-False-False-False-True] 0.2199ms 22.5629μs 44.3205 KOps/s 43.1382 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[True-False-False-False-False] 40.6610μs 15.2147μs 65.7260 KOps/s 64.6233 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-True-True-True-True] 0.2307ms 37.2672μs 26.8333 KOps/s 26.2911 KOps/s $\color{#35bf28}+2.06\%$
test_step_mdp_speed[False-True-True-True-False] 46.0710μs 23.9244μs 41.7983 KOps/s 41.0972 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-True-True-False-True] 53.7410μs 25.1939μs 39.6922 KOps/s 39.1970 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[False-True-True-False-False] 34.8510μs 15.2108μs 65.7428 KOps/s 65.5207 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-True-False-True-True] 70.7120μs 39.3605μs 25.4062 KOps/s 24.9504 KOps/s $\color{#35bf28}+1.83\%$
test_step_mdp_speed[False-True-False-True-False] 43.6810μs 25.8171μs 38.7341 KOps/s 38.6503 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[False-True-False-False-True] 80.9310μs 26.8404μs 37.2573 KOps/s 36.6742 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[False-True-False-False-False] 72.7110μs 17.4167μs 57.4163 KOps/s 57.4789 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-False-True-True-True] 71.1120μs 41.4778μs 24.1093 KOps/s 23.7003 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[False-False-True-True-False] 63.7110μs 28.2499μs 35.3983 KOps/s 35.7769 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[False-False-True-False-True] 44.6910μs 27.4482μs 36.4322 KOps/s 36.6104 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-False-True-False-False] 72.1320μs 17.0372μs 58.6952 KOps/s 57.4188 KOps/s $\color{#35bf28}+2.22\%$
test_step_mdp_speed[False-False-False-True-True] 73.0720μs 43.8704μs 22.7944 KOps/s 22.1527 KOps/s $\color{#35bf28}+2.90\%$
test_step_mdp_speed[False-False-False-True-False] 54.4920μs 30.0898μs 33.2338 KOps/s 32.9082 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-False-False-False-True] 0.1453ms 28.7833μs 34.7424 KOps/s 34.2274 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[False-False-False-False-False] 46.7310μs 18.9630μs 52.7344 KOps/s 52.2782 KOps/s $\color{#35bf28}+0.87\%$
test_values[generalized_advantage_estimate-True-True] 27.0942ms 25.4240ms 39.3329 Ops/s 40.0109 Ops/s $\color{#d91a1a}-1.69\%$
test_values[vec_generalized_advantage_estimate-True-True] 88.2595ms 2.6923ms 371.4295 Ops/s 375.1558 Ops/s $\color{#d91a1a}-0.99\%$
test_values[td0_return_estimate-False-False] 95.3820μs 67.8657μs 14.7350 KOps/s 15.1084 KOps/s $\color{#d91a1a}-2.47\%$
test_values[td1_return_estimate-False-False] 60.4216ms 57.8869ms 17.2751 Ops/s 17.8175 Ops/s $\color{#d91a1a}-3.04\%$
test_values[vec_td1_return_estimate-False-False] 1.3765ms 1.1008ms 908.3944 Ops/s 917.2321 Ops/s $\color{#d91a1a}-0.96\%$
test_values[td_lambda_return_estimate-True-False] 96.0674ms 92.3443ms 10.8290 Ops/s 11.2457 Ops/s $\color{#d91a1a}-3.71\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4220ms 1.0960ms 912.4062 Ops/s 920.4280 Ops/s $\color{#d91a1a}-0.87\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 27.2022ms 25.7919ms 38.7719 Ops/s 39.6124 Ops/s $\color{#d91a1a}-2.12\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9634ms 0.7291ms 1.3716 KOps/s 1.3725 KOps/s $\color{#d91a1a}-0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8636ms 0.6927ms 1.4436 KOps/s 1.4894 KOps/s $\color{#d91a1a}-3.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6382ms 1.4778ms 676.6649 Ops/s 677.7453 Ops/s $\color{#d91a1a}-0.16\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8385ms 0.6915ms 1.4462 KOps/s 1.4605 KOps/s $\color{#d91a1a}-0.98\%$
test_dqn_speed 1.9024ms 1.5295ms 653.8219 Ops/s 658.0205 Ops/s $\color{#d91a1a}-0.64\%$
test_ddpg_speed 3.4315ms 3.0778ms 324.9111 Ops/s 332.3418 Ops/s $\color{#d91a1a}-2.24\%$
test_sac_speed 8.8642ms 8.6333ms 115.8305 Ops/s 117.6018 Ops/s $\color{#d91a1a}-1.51\%$
test_redq_speed 11.4861ms 10.8462ms 92.1984 Ops/s 84.2957 Ops/s $\textbf{\color{#35bf28}+9.37\%}$
test_redq_deprec_speed 12.1892ms 11.7518ms 85.0932 Ops/s 86.6408 Ops/s $\color{#d91a1a}-1.79\%$
test_td3_speed 17.5170ms 8.6315ms 115.8541 Ops/s 118.7824 Ops/s $\color{#d91a1a}-2.47\%$
test_cql_speed 27.6040ms 26.5243ms 37.7013 Ops/s 38.2478 Ops/s $\color{#d91a1a}-1.43\%$
test_a2c_speed 6.1273ms 5.8559ms 170.7665 Ops/s 174.4293 Ops/s $\color{#d91a1a}-2.10\%$
test_ppo_speed 6.4619ms 6.1800ms 161.8119 Ops/s 163.6054 Ops/s $\color{#d91a1a}-1.10\%$
test_reinforce_speed 5.0938ms 4.7762ms 209.3709 Ops/s 214.4063 Ops/s $\color{#d91a1a}-2.35\%$
test_iql_speed 21.1956ms 20.6385ms 48.4532 Ops/s 49.6213 Ops/s $\color{#d91a1a}-2.35\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9397ms 4.7513ms 210.4695 Ops/s 216.7968 Ops/s $\color{#d91a1a}-2.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8415ms 0.6081ms 1.6446 KOps/s 1.6756 KOps/s $\color{#d91a1a}-1.85\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.5619ms 0.5866ms 1.7047 KOps/s 1.7257 KOps/s $\color{#d91a1a}-1.22\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0618ms 4.7684ms 209.7143 Ops/s 216.6410 Ops/s $\color{#d91a1a}-3.20\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7788ms 0.6000ms 1.6668 KOps/s 1.6913 KOps/s $\color{#d91a1a}-1.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.4383ms 0.5799ms 1.7243 KOps/s 1.7487 KOps/s $\color{#d91a1a}-1.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3729ms 2.1674ms 461.3758 Ops/s 472.6122 Ops/s $\color{#d91a1a}-2.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 5.8075ms 2.0815ms 480.4323 Ops/s 496.2096 Ops/s $\color{#d91a1a}-3.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0888ms 4.8568ms 205.8980 Ops/s 210.2514 Ops/s $\color{#d91a1a}-2.07\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.6942ms 0.7341ms 1.3622 KOps/s 1.1632 KOps/s $\textbf{\color{#35bf28}+17.11\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9139ms 0.7091ms 1.4103 KOps/s 1.4227 KOps/s $\color{#d91a1a}-0.87\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9735ms 4.8105ms 207.8773 Ops/s 215.6630 Ops/s $\color{#d91a1a}-3.61\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.4477ms 0.6066ms 1.6485 KOps/s 1.6617 KOps/s $\color{#d91a1a}-0.80\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8038ms 0.5834ms 1.7141 KOps/s 1.7298 KOps/s $\color{#d91a1a}-0.91\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0518ms 4.7667ms 209.7906 Ops/s 216.7892 Ops/s $\color{#d91a1a}-3.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7889ms 0.5975ms 1.6737 KOps/s 1.6675 KOps/s $\color{#35bf28}+0.37\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.5936ms 0.5781ms 1.7297 KOps/s 1.7175 KOps/s $\color{#35bf28}+0.72\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0581ms 4.8699ms 205.3430 Ops/s 208.3886 Ops/s $\color{#d91a1a}-1.46\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.6304ms 0.7378ms 1.3554 KOps/s 1.3576 KOps/s $\color{#d91a1a}-0.17\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9118ms 0.7115ms 1.4055 KOps/s 1.4143 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1259s 7.4047ms 135.0490 Ops/s 130.2830 Ops/s $\color{#35bf28}+3.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 18.5542ms 15.9468ms 62.7086 Ops/s 63.2119 Ops/s $\color{#d91a1a}-0.80\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.4602ms 1.3092ms 763.8312 Ops/s 768.8196 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1196s 9.5577ms 104.6276 Ops/s 103.9462 Ops/s $\color{#35bf28}+0.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 18.2237ms 15.8541ms 63.0752 Ops/s 63.7838 Ops/s $\color{#d91a1a}-1.11\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.3772ms 1.3259ms 754.2012 Ops/s 753.3219 Ops/s $\color{#35bf28}+0.12\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1189s 7.4314ms 134.5643 Ops/s 132.8742 Ops/s $\color{#35bf28}+1.27\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 18.6546ms 16.1512ms 61.9150 Ops/s 62.9226 Ops/s $\color{#d91a1a}-1.60\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.5901ms 1.5427ms 648.2006 Ops/s 643.6004 Ops/s $\color{#35bf28}+0.71\%$

@vmoens merged commit 35df59e into main Jun 14, 2024
54 of 58 checks passed
@vmoens deleted the fix-prb-traj branch June 14, 2024 14:49

Successfully merging this pull request may close these issues.

[BUG] Unintended Cross-Trajectory Sampling in PrioritizedSliceSampler.sample()