Download experts from hf inside tutorials and docs #766
Conversation
Codecov Report
@@           Coverage Diff           @@
##           master     #766   +/-   ##
=======================================
  Coverage   96.33%   96.33%
=======================================
  Files          93       93
  Lines        8789     8789
=======================================
  Hits         8467     8467
  Misses        322      322
Thanks, these changes mostly look good, but there's a type error as the code is currently written. The tutorials typically pass in venv=env, where env is a gym.Env created by gym.make, but load_policy expects a vec_env.VecEnv.
Clearly the notebooks work with this as they're passing tests, so one option to fix it would be to loosen the type annotation on load_policy to stable_baselines3.common.type_aliases.GymEnv = Union[gym.Env, vec_env.VecEnv], since this is what BaseAlgorithm.load accepts, although there's a chance this will cause type-checking errors elsewhere (e.g. do any of our other policy loaders strictly require a vec_env.VecEnv?).
Alternatively, we could stick to the vec_env.VecEnv convention we've mostly used in imitation (reasoning that VecEnv is strictly more general than a gym.Env: you can always have a VecEnv wrapping a single gym.Env) and change your envs to venvs, e.g. using the util.make_vec_env function.
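For concreteness, here is a minimal sketch of the two options, assuming simplified signatures (imitation's real load_policy takes more arguments, and util.make_vec_env's exact signature varies across versions):

```python
from typing import Union

import gym
import seals  # noqa: F401 -- registers the seals/* environments
from stable_baselines3.common import vec_env
# SB3 already ships this alias: GymEnv = Union[gym.Env, vec_env.VecEnv]
from stable_baselines3.common.type_aliases import GymEnv


# Option 1: loosen the annotation so both plain and vectorized envs type-check.
# (Hypothetical, simplified signature.)
def load_policy(policy_type: str, venv: GymEnv) -> None:
    ...


# Option 2: keep the VecEnv convention and wrap single envs at the call site.
venv = vec_env.DummyVecEnv([lambda: gym.make("seals/CartPole-v0")])
load_policy("ppo", venv)
```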
Thanks for spotting the inconsistent typing, @AdamGleave.
Neither option fully fixes it, though. This is because the typing of the policy registry is insufficient: after agent_loader = policy_registry.get(policy_type), reveal_type(agent_loader) gives an imprecise type. There is an open issue for that already: #574
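To illustrate the registry problem with a hypothetical, simplified registry (these names are illustrative, not imitation's actual API): because every loader is stored under one loose callable type, the checker cannot recover a per-entry signature.

```python
from typing import Any, Callable, Dict

# All loaders share one deliberately loose type, so per-entry information
# (such as whether a loader wants a gym.Env or a VecEnv) is erased.
PolicyLoaderFn = Callable[..., Any]

policy_registry: Dict[str, PolicyLoaderFn] = {}
policy_registry["ppo"] = lambda venv: None  # dummy loader for illustration


def get(policy_type: str) -> PolicyLoaderFn:
    return policy_registry[policy_type]


agent_loader = get("ppo")
# reveal_type(agent_loader) -> Callable[..., Any]; mypy can neither flag nor
# verify the type of the env argument passed to the loader.
```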
Given how widespread the assumption of working with
@AdamGleave it's ready for a final review.
Thanks for the changes! Looks like we might be able to reduce code duplication by re-using some VecEnv objects; WDYT?
Good catch! I grepped and did not find other instances of redundant env creation in tutorials or docs. So it should be good for final review and merge, @AdamGleave.
LGTM
Description
This updates the following files to download pretrained experts from the HuggingFace Model hub (instead of training them from scratch):
docs/tutorials
docs/algorithms
examples/quickstart.py
This saves computation time and allows users to experiment with competent, fully trained experts.
Note that this required changing the default example environment from "CartPole-v1" to "seals/CartPole-v0" in some places (since https://huggingface.co/HumanCompatibleAI only contains pretrained experts for the latter).
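For illustration, a sketch of downloading one such expert with huggingface_sb3; the repo id and filename below are assumptions based on the HumanCompatibleAI hub organization's naming convention, and the tutorials may use imitation's own loading helpers instead:

```python
import gym
import seals  # noqa: F401 -- registers the seals/* environments
from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO

# Download a checkpoint from the HuggingFace Model hub instead of training.
# Repo id and filename are assumed to follow huggingface_sb3's conventions.
checkpoint = load_from_hub(
    repo_id="HumanCompatibleAI/ppo-seals-CartPole-v0",
    filename="ppo-seals-CartPole-v0.zip",
)
expert = PPO.load(checkpoint)

env = gym.make("seals/CartPole-v0")
obs = env.reset()
action, _ = expert.predict(obs, deterministic=True)
```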
The only remaining calls to expert.train are now in
docs/tutorials/8_train_custom_env.ipynb, where we train on a custom environment (hence, no pretrained expert available)
tests/data/test_rollout.py, where experts are trained for a single timestep (hence, negligible overhead)

Solves #764
Testing
I have run all notebooks and the example script and checked that they still run to completion.