Develop #73

Merged
merged 330 commits into from
Jun 11, 2022

dongminlee94
Owner

PR Description

Related Issues

Checklist

  • Assignees ran the make format command.
  • Assignees ran the make lint command and the code passed the linters.
  • Assignees updated the README, if necessary.
  • Assignees checked the runnability of the code.
  • Reviewers checked the runnability of the code.

Optional section (e.g., code usage, experimental results, TODO)

dongminlee94 and others added 27 commits August 22, 2021 23:20
…_in_buffer

Bugfix/fix number of samples in buffer
…ondition

Feature/add early stopping condition
* test commit

* Create base structure

* add a high-level structure guide for development

* add a high-level structure guide for development

* add a high-level structure guide for development

* sync with pearl by dongmin

* Update MAML code

* Refactored network variable

* Bugfix: import error

* Refactored all MAML codes

* delete unused files

* add pyYAML to requirements

* add meta_train

* define the number of tasks at envs

* change a format of config files

* change directory of files in the util folder

* change agent.train to agent.compute_losses to implement the MAML Hessian structure

* add pylint related version requirement

* modify maml_trainer for yaml configs

* Match some formats with RL^2

* move maml folder into src folder

* add pytest PATH for MAML

* Feature/maml_exp_baseline (#57)

* Refactor buffers, meta_learner, and sampler modules in PEARL

* Refactor RL^2 code to avoid the bug of buffer

* image size test

* image size test

* image size test

* Fix image size

* Modify the name of PPO variables

* Add num_samples config, sampler log, and buffer log

* Remove num_sample_tasks config

* Add abs function to total_run_cost

* Add abs function to total_run_cost

* put the get_action method into PPO.py as a staticmethod

* change hidden layer related codes and configuration

* add meta-test and logging features

* restore added codes for the assumed bug

* test commit

* test commit3

* add meta-test

* change default configurations of MAML

* Combine value function with policy as a set of meta-model

* meta-train and meta-test baseline

* Structure discussion

* Fix repeated tanh when inferring actions from the TanhGaussianPolicy network

* Refactor buffer and sampler

* Add early stopping condition configs to PEARL config files

* Add early stopping condition configs to RL^2 config files

* Fix tanh bug to policy network in PEARL

* Add early stopping condition to meta-learner

* Fix the value to append to dq

* Change configs to what are used in the official repo of MAML

* Fix tanh bug to policy network in PEARL

* Add Linear-feature baseline

* Modify to compute advantage based on newly fitted baseline

* Add separated meta-update based on PPO algorithm

* Add early stopping condition configs to config files

* Update early stopping condition to meta learner

* Add list to range

* Add type annotation to all codes of PEARL

* Change dir name from assets to img

* Refactor PEARL codes

* Fix simple code

* Update README because of changing directory from assets to img

* Separate train tasks and test tasks

* Set configuration based on references

* Delete linear-feature baseline and modify get_log_prob

* Remove static method feature from get_action and append None to log_probs to prevent buffer error

* Add a method to the buffer to update a value function before computing GAE

* Replace linear-feature baseline with value network and add a variable to store old_policy

* Remove redundant code for obtaining adaptation samples and modify the structure to follow the reference while keeping the log format

* Apply PR comment

* Utilize num_tasks

* Modify pylint statements

* Re-arrange the order of methods in the MetaLearner class

* Rename confused methods

* Remove old_policy and change variable & argument name for enhanced intuition

* Simplify log_values

* Separate visualizing method

* Change argument name and add additional comments

* Modify conditional statements of the sampler

* Restore redundant commit of PEARL

* Utilize num_tasks while assigning goals as dictionary type

* Change argument name for logging

* Simplify saving condition of log_prob

* Transpose compute_gae and compute_value to ppo.py

* Disjoin list compression

* Reflect 2nd Review comments of PR57

* Reflect 3rd review comments of PR57

* Remove numpy conversion from cuda tensor

* Add interoperability for CUDA

* Reflect 4th review comments of PR57

* Change inner-optimizer to Adam

* Change configs to match with those of the MAML paper

Co-authored-by: dongminlee94

Co-authored-by: dongminlee94
Co-authored-by: seunghyun lee
* Remove unnecessary variable in envs

* Add checkpoint saving & loading to PEARL algorithm

* Fix log_prob issue to RL^2 algorithm
* replace ppo with trpo

* Add type hints, saving and loading, early stopping

* gaussian policy cuda runnability modification

* remove holdout test tasks and add test interval

* change the number of test tasks to be sampled

* combine train and test batches in dir task

* modify test-batch of dir task to be deterministic

* change dir task config

* restore heldout-test set

* avoid out-of-memory error by reducing the number of adaptations

* modify early stop condition of vel task

* Resolve code reviewer's comments

* Refactoring deterministic condition line

* Resolve missed code reviewer's comments
* Change configurations of each algorithm

* Add saving modules

* Add type annotations
@dongminlee94 dongminlee94 merged commit 8e95eef into main Jun 11, 2022
@dongminlee94 dongminlee94 deleted the develop branch June 11, 2022 17:03
3 participants