Develop #73

dongminlee94 · 2022-06-11T17:00:09Z

PR Description

Related Issues

Checklist

Assignees ran the make format command.
Assignees ran the make lint command and the code passed the linters.
Assignees updated the README, if necessary.
Assignees checked the runnability of the code.
Reviewers checked the runnability of the code.

Optional section (e.g., code usage, experimental results, TODO)

* test commit
* Create base structure
* add a high-level structure guide for development
* add a high-level structure guide for development
* add a high-level structure guide for development
* sync with pearl by dongmin
* Update MAML code
* Refactored network variable
* Bugfix: import error
* Refactored all MAML codes
* delete unused files
* add pyYAML to requirements
* add meta_train
* define the number of tasks at envs
* change a format of config files
* change directory of files in the util folder
* change agent.train to agent.compute_losses to implement MAML hessian structure
* add pylint related version requirement
* modify maml_trainer for yaml configs
* Match some formats with RL^2
* move maml folder into src folder
* add pytest PATH for MAML
* Feature/maml_exp_baseline (#57)
* Refactor buffers, meta_learner, and sampler modules in PEARL
* Refactor RL^2 code to avoid the bug of buffer
* image size test
* image size test
* image size test
* Fix image size
* Modify the name of PPO variables
* Add num_samples config, sampler log, and buffer log
* Remove num_sample_tasks config
* Add abs function to total_run_cost
* Add abs function to total_run_cost
* put the get_action method into the PPO.py as a staticmethod
* change hidden layer related codes and configuration
* add meta-test and logging features
* restore added codes for the assumed bug
* test commit
* test commit3
* add meta-test
* change default configurations of MAML
* Combine value function with policy as a set of meta-model
* meta-train and meta-test baseline
* Structure discussion
* Fix repeated tanh when inferring actions from the TanhGaussianPolicy network
* Refactor buffer and sampler
* Add early stopping condition configs to PEARL config files
* Add early stopping condition configs to RL^2 config files
* Fix tanh bug to policy network in PEARL
* Add early stopping condition to meta-learner
* Fix the value to append to dq
* Change configs to what are used in the official repo of MAML
* Fix tanh bug to policy network in PEARL
* Add Linear-feature baseline
* Modify to compute advantage based on newly fitted baseline
* Add separated meta-update based on PPO algorithm
* Add early stopping condition configs to config files
* Update early stopping condition to meta learner
* Add list to range
* Add type annotation to all codes of PEARL
* Change dir name from assets to img
* Refactor PEARL codes
* Fix simple code
* Update README because of changing directory from assets to img
* Separate train tasks and test tasks
* Set configuration based on references
* Delete linear-feature baseline and modify get_log_prob
* Remove static method feature from get_action and append None to log_probs to prevent buffer error
* Add a method into the buffer to update a value function before computing GAE
* Replace linear-feature baseline with value network and add a variable to store old_policy
* Remove redundant code for obtaining adaptation samples and modify a structure to follow the reference while keeping the log format
* Apply PR comment
* Utilize num_tasks
* Modify pylint statements
* Re-arrange the order of methods in the MetaLearner class
* Rename confused methods
* Remove old_policy and change variable & argument name for enhanced intuition
* Simplify log_values
* Separate visualizing method
* Change argument name and add additional comments
* Modify conditional statements of the sampler
* Restore redundant commit of PEARL
* Utilize num_tasks while assigning goals as dictionary type
* Change argument name for logging
* Simplify saving condition of log_prob
* Transpose compute_gae and compute_value to ppo.py
* Disjoin list compression
* Reflect 2nd review comments of PR57
* Reflect 3rd review comments of PR57
* Remove numpy conversion from cuda tensor
* Add interoperability for CUDA
* Reflect 4th review comments of PR57
* Change inner-optimizer to Adam
* Change configs to match with those of the MAML paper

Co-authored-by: dongminlee94 <[email protected]>
Co-authored-by: dongminlee94 <[email protected]>
Co-authored-by: seunghyun lee <[email protected]>
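Several of the commits above refer to computing GAE (generalized advantage estimation) from a value function, but this page does not show the repository's `compute_gae`. For readers unfamiliar with the term, the following is only a minimal sketch of the standard GAE(λ) recursion for a single trajectory. The function signature, argument names, and default values are assumptions made for illustration and are not taken from this PR.

```python
from typing import List, Sequence


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float = 0.0,
    gamma: float = 0.99,
    lam: float = 0.97,
) -> List[float]:
    """Illustrative GAE(lambda) for one trajectory (hypothetical signature)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value  # value estimate for the state after the last step
    running = 0.0            # discounted sum of TD residuals accumulated so far
    for t in reversed(range(len(rewards))):
        # Do not bootstrap across an episode boundary.
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=1.0` the result reduces to the discounted return minus the value estimate, and with `lam=0.0` it reduces to the one-step TD residual.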

* replace ppo with trpo
* Add type-hint, saving and loading, early stopping
* gaussian policy cuda runnability modification
* remove holdout test tasks and add test interval
* change the number of test tasks to be sampled
* combine train and test batches in dir task
* modify test-batch of dir task to be deterministic
* change dir task config
* restore heldout-test set
* avoid out-of-memory error by reducing the number of adaptations
* modify early stop condition of vel task
* Resolve code reviewer's comments
* Refactoring deterministic condition line
* Resolve missed code reviewer's comments

review-notebook-app · 2022-06-11T17:00:13Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

dongminlee94 added 30 commits May 6, 2021 23:30

Refactored train model function

7148499

Refactored train model function

db72374

Refactored train model function

4b4db5f

Refactored train model function

9736fde

Refactored train model function

910c62e

Refactored train model function

1e32531

Refactored train model function

2255310

Update train model function

70f8f90

Update train model function

7a9c088

Update train model function

5144d64

Update train model function

d4b4049

Update train model function

97d34c0

Update train model function

f5b2274

Update train model function

369ac8c

Update train model function

1b244c4

Update train model function

5f9fb8d

Update train model function

37284c5

Update train model function

5ec6113

Update train model function

90faea0

Update train model function

66cc907

Update train model function

9ae42d4

Update train model function

0f0b7ee

Update train model function

1849af1

Update train model function

7439d4a

Update train model function

cc83647

Update train model function

8e03975

Update train model function

b71e4a9

Update train model function

f92e99b

Update train model function

07ddb65

Update train model function

09d2a0e

dongminlee94 and others added 27 commits August 22, 2021 23:20

Add abs function to total_run_cost

cf31fc9

Add abs function to total_run_cost

421379c

Refactor buffer and sampler

b7ccf7d

Merge pull request #40 from dongminlee94/bugfix/fix_number_of_samples_in_buffer

a619be6

Bugfix/fix number of samples in buffer

Add early stopping condition configs to PEARL config files

d43d4a7

Add early stopping condition configs to RL^2 config files

04aac11

Fix tanh bug to policy network in PEARL

9e07dd3

Add early stopping condition to meta-learner

8ac4a4f

Fix the value to append to dq

bb58d62

Add early stopping condition configs to config files

f2cbd2e

Update early stopping condition to meta learner

036554b

Merge pull request #49 from dongminlee94/feature/add_early_stopping_condition

cdd9c4a

Feature/add early stopping condition
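The commits in this merge add an early stopping condition to the meta-learner and mention appending a value to `dq`, but the page does not include the code itself. The sketch below shows one common way such a condition can be built, assuming `dq` is a fixed-length `collections.deque` of recent evaluation returns and that training stops once every entry reaches a goal. The class name `EarlyStopping` and the parameters `stop_goal` and `num_stop_conditions` are hypothetical and are not taken from the repository.

```python
from collections import deque
from typing import Deque


class EarlyStopping:
    """Illustrative early stopping check (hypothetical, not the PR's code)."""

    def __init__(self, stop_goal: float, num_stop_conditions: int = 3) -> None:
        self.stop_goal = stop_goal
        # Only the most recent `num_stop_conditions` returns are kept;
        # older entries are discarded automatically by the deque.
        self.dq: Deque[float] = deque(maxlen=num_stop_conditions)

    def update(self, test_return: float) -> bool:
        """Record the latest evaluation return; return True if training should stop."""
        self.dq.append(test_return)
        window_full = len(self.dq) == self.dq.maxlen
        return window_full and all(r >= self.stop_goal for r in self.dq)


# Usage: stop once three consecutive evaluation returns reach the goal.
stopper = EarlyStopping(stop_goal=10.0, num_stop_conditions=3)
decisions = [stopper.update(r) for r in [12.0, 11.0, 9.0, 10.0, 15.0, 10.0]]
```

Requiring several consecutive evaluations above the goal, rather than a single one, makes the check less sensitive to one lucky evaluation.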

Add list to range

a7794d4

Add type annotation to all codes of PEARL

73fdec8

Change dir name from assets to img

023f767

Refactor PEARL codes

f0b19d2

Fix simple code

2d2e743

Update README because of changing directory from assets to img

20f0638

Apply PR comment

d824916

Merge pull request #55 from dongminlee94/feature/remove_torch_utils

a4675ae

Feature/remove torch utils

Change env from pybullet to mujoco (#61)

86f61bb

Feature/checkpoint saving and loading (#63)

32c59b3

* Remove unnecessary variable in envs
* Add checkpoint saving & loading to PEARL algorithm
* Fix log_prob issue to RL^2 algorithm

Update PEARL configs (#65)

fa5024a

Feature/refactor rl2 (#71)

e365216

* Change configurations of each algorithm
* Add saving modules
* Add type annotations

add codes for meta supervised learning (#72)

5c00f3e

dongminlee94 merged commit 8e95eef into main Jun 11, 2022

dongminlee94 deleted the develop branch June 11, 2022 17:03
