Implementation of the SQIL algorithm #744

Merged 60 commits on Aug 10, 2023

Commits
1935c99
Initial version of the SQIL implementation
RedTachyon Jul 4, 2023
2d4151e
Pin SB3 version to 1.7.0 (#738) (#745)
RedTachyon Jul 4, 2023
993a0d7
Another redundant type warning
RedTachyon Jul 4, 2023
899a5d8
Correctly set the expert rewards to 1
RedTachyon Jul 5, 2023
73064ac
Update typing, add some tests
RedTachyon Jul 6, 2023
b6c9d26
Update sqil.py
RedTachyon Jul 6, 2023
42d5468
Style fixes
RedTachyon Jul 6, 2023
86825d8
Test updates
RedTachyon Jul 6, 2023
95a2661
Add a test to check the buffer
RedTachyon Jul 6, 2023
67662b4
Formatting, docstring
RedTachyon Jul 6, 2023
68f693b
Improve test coverage
RedTachyon Jul 6, 2023
c4b0521
Update branch to master (#749)
RedTachyon Jul 6, 2023
1b5338b
Some documentation updates (not complete)
RedTachyon Jul 6, 2023
3c78336
Add a SQIL tutorial
RedTachyon Jul 6, 2023
c303af1
Reduce tutorial runtime
RedTachyon Jul 6, 2023
bf81940
Add SQIL description in docs, try to add it to the right places
RedTachyon Jul 6, 2023
0f95524
Merge branch 'master' into redtachyon/740-sqil
RedTachyon Jul 6, 2023
5da56f3
Fix docs
RedTachyon Jul 6, 2023
e410c39
Merge remote-tracking branch 'HCAI/redtachyon/740-sqil' into redtachy…
RedTachyon Jul 6, 2023
d8f3c30
Blacken a tutorial
RedTachyon Jul 6, 2023
ae43a75
Reorder things in docs
RedTachyon Jul 7, 2023
5b23f84
Change the SQIL structure to instead subclass the replay buffer, new …
RedTachyon Jul 7, 2023
bc8152b
Add an empty line
RedTachyon Jul 7, 2023
7d56e6a
Simplify the arguments
RedTachyon Jul 7, 2023
4e3f156
Cover another edge case, another test, fixes
RedTachyon Jul 7, 2023
d018cbd
Fix a circular import issue
RedTachyon Jul 7, 2023
29cdbfa
Add a performance test - might be slow?
RedTachyon Jul 7, 2023
551fa7e
Fix coverage
RedTachyon Jul 7, 2023
fcd94b9
Improve input validation
AdamGleave Jul 8, 2023
34ddf82
Bugfix: have set_demonstrations set rather than return
AdamGleave Jul 8, 2023
cf20fbb
Move TransitionMapping from algorithms.base to data.types
AdamGleave Jul 8, 2023
ee16818
Fix typo: expert_buffer->self.expert_buffer
AdamGleave Jul 8, 2023
87876aa
Bugfix: use safe_to_numpy rather than assuming th.Tensor
AdamGleave Jul 8, 2023
12e30b1
Fix lint
AdamGleave Jul 8, 2023
90a3a79
Fix unused imports
AdamGleave Jul 8, 2023
ef0fd26
Refactor tests
AdamGleave Jul 8, 2023
34241b2
Bump # of rollouts to try to fix MacOS flakiness
AdamGleave Jul 9, 2023
ed399d3
Merge branch 'master' into redtachyon/740-sqil
ernestum Jul 18, 2023
c8e9df8
Simplify SQIL example and tutorial by 1. downloading expert trajector…
ernestum Jul 18, 2023
e4e5d9f
Improve docstring of SQILReplayBuffer.
ernestum Jul 18, 2023
b89e5d8
Set the expert_buffer in the constructor.
ernestum Jul 18, 2023
c7723e5
Consistently set expert transition reward to 1 and learner transition…
ernestum Jul 18, 2023
e0bc16d
Fix docstring of SQILReplayBuffer.sample()
ernestum Jul 18, 2023
203c89f
Switch back to the CartPole-v1 environment in the SQIL examples
ernestum Jul 18, 2023
c149385
Only train for 1k steps in the SQIL example so the doctests don't run…
ernestum Jul 18, 2023
18a6622
Fix cell metadata for tutorial notebook.
ernestum Jul 18, 2023
9c5b91c
Notebook formatting fixes.
ernestum Jul 18, 2023
f8584c3
Fix typing error in SQIL implementation.
ernestum Jul 18, 2023
02f3191
Fix isort issue.
ernestum Jul 18, 2023
649de46
Clarify that our variant of the SQIL implementation is not really "so…
ernestum Jul 19, 2023
c72b088
Fix link in experts documentation.
ernestum Jul 19, 2023
8277a5c
Remove support for transition mappings.
ernestum Jul 19, 2023
a0af5c5
Remove data_loader from SQIL test cases.
ernestum Jul 20, 2023
4ccea30
Bump number of demonstrations in SQIL performance test to reduce flak…
ernestum Jul 21, 2023
68cbce8
Adapt hyperparameters in test_sqil_performance to reduce flakiness
jas-ho Aug 8, 2023
2bf467d
Fix seeds for flaky test_sqil_performance
jas-ho Aug 8, 2023
ccda686
Increase coverage in test_sqil.py
jas-ho Aug 8, 2023
91b226a
Pass kwargs to SQIL.train to DQN.learn
jas-ho Aug 9, 2023
5cbb6b2
Pass parameters as kwargs for multi-ary methods in sqil.py
jas-ho Aug 9, 2023
d2124a2
Make test for exceptions raised by SQIL constructor more specific
jas-ho Aug 9, 2023
Bump # of rollouts to try to fix MacOS flakiness
AdamGleave committed Jul 9, 2023
commit 34241b2aabe78b961abc309024c6021c293d2266
4 changes: 2 additions & 2 deletions tests/algorithms/test_sqil.py
@@ -111,7 +111,7 @@ def test_sqil_performance(
     rewards_before, _ = evaluate_policy(
         model.policy,
         cartpole_venv,
-        10,
+        20,
         return_episode_rewards=True,
     )
 
@@ -120,7 +120,7 @@ def test_sqil_performance(
     rewards_after, _ = evaluate_policy(
         model.policy,
         cartpole_venv,
-        10,
+        20,
         return_episode_rewards=True,
     )
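
A note on why raising the episode count from 10 to 20 might reduce flakiness: the mean return over more evaluation episodes is a less noisy estimate, with a spread that shrinks roughly as 1/sqrt(n). The sketch below is only an illustration under stated assumptions. It uses synthetic Gaussian returns (mean 100, standard deviation 30, both made up) rather than real CartPole rollouts, and the helper `spread_of_mean_estimates` is hypothetical and not part of this PR or of `imitation`.

```python
import random
import statistics


def spread_of_mean_estimates(n_episodes, n_trials=2000, seed=0):
    """Std. dev. of the mean return across repeated evaluations of n_episodes each."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # Hypothetical per-episode returns: synthetic noise, not real CartPole data.
        returns = [rng.gauss(100.0, 30.0) for _ in range(n_episodes)]
        means.append(statistics.fmean(returns))
    return statistics.stdev(means)


spread_10 = spread_of_mean_estimates(10)
spread_20 = spread_of_mean_estimates(20)

# Doubling the episode count should shrink the spread by roughly sqrt(2).
ratio = spread_10 / spread_20
print(f"spread with 10 episodes: {spread_10:.2f}")
print(f"spread with 20 episodes: {spread_20:.2f}")
print(f"ratio: {ratio:.2f}")
```

If the before/after comparison in `test_sqil_performance` depends on mean returns, a tighter estimate on both sides makes a spurious failure less likely, at the cost of a longer test run.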
