[rllib] add augmented random search #2714

eugenevinitsky · 2018-08-22T18:59:06Z

What do these changes do?

Augmented random search is added as an algorithm: https://arxiv.org/pdf/1803.07055.pdf
The standard deviation of MeanStdFilter is initialized to 1
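For readers unfamiliar with the method, here is a minimal sketch of one update step in the spirit of the linked paper. It is not the code added in this PR: the function name `ars_step`, its parameters, and the toy reward are all illustrative assumptions, and observation normalization is left out.

```python
import numpy as np


def ars_step(theta, reward_fn, rng, num_directions=8, top_b=4,
             step_size=0.02, noise_std=0.03):
    """One ARS-style update of a flat parameter vector (illustrative only)."""
    # Sample random search directions, one row per direction.
    deltas = rng.standard_normal((num_directions, theta.size))
    # Evaluate the reward at symmetric perturbations of the parameters.
    r_pos = np.array([reward_fn(theta + noise_std * d) for d in deltas])
    r_neg = np.array([reward_fn(theta - noise_std * d) for d in deltas])
    # Keep the top_b directions, ranked by the better of their two rewards.
    top = np.argsort(np.maximum(r_pos, r_neg))[-top_b:]
    # Scale the step by the std of the rewards that enter the update.
    sigma_r = np.concatenate([r_pos[top], r_neg[top]]).std()
    if sigma_r < 1e-8:
        sigma_r = 1.0  # avoid dividing by zero when all rewards are equal
    # Reward-weighted average of the selected directions.
    grad = (r_pos[top] - r_neg[top]) @ deltas[top] / top_b
    return theta + step_size / sigma_r * grad


if __name__ == "__main__":
    # Toy usage: maximize a negative squared distance to a hypothetical target.
    rng = np.random.default_rng(0)
    target = np.array([1.0, -2.0, 0.5])
    theta = np.zeros(3)
    for _ in range(200):
        theta = ars_step(theta, lambda p: -np.sum((p - target) ** 2), rng)
    print(theta)
```

In a real agent `reward_fn` would be a full environment rollout, which is why the evaluations are usually farmed out to parallel workers.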

Related issue number

…itialize to 1

AmplabJenkins · 2018-08-22T20:31:35Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7675/
Test PASSed.

AmplabJenkins · 2018-08-22T20:35:29Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7674/
Test PASSed.

ericl

This looks pretty good, some minor comments.

ericl · 2018-08-23T00:15:40Z

python/ray/rllib/agents/ars/ars.py

@@ -0,0 +1,360 @@
+# Code in this file is copied and adapted from
+# https://github.com/openai/evolution-strategies-starter and from
+# https://github.com/modestyachts/ARS


Could you append https://github.com/modestyachts/ARS/blob/master/LICENSE to ray/LICENSE?

Ooh, good catch, thanks!

ericl · 2018-08-23T00:16:45Z

python/ray/rllib/agents/ars/ars.py

+
+ tlogger.record_tabular("TimeElapsedThisIter", step_tend - step_tstart)
+ tlogger.record_tabular("TimeElapsed", step_tend - self.tstart)
+ tlogger.dump_tabular()


Do we need the tlog stuff still? (I know it's in ES, but it seems unnecessary)

I find some of it useful, but I will definitely remove the duplicates. For example, the std deviation of the weights and the grad norm of the update are useful checks that things are going as planned.

ericl · 2018-08-23T00:17:26Z

python/ray/rllib/agents/ars/ars.py

+ "timesteps_this_iter": noisy_lengths.sum(),
+ "timesteps_so_far": self.timesteps_so_far,
+ "time_elapsed_this_iter": step_tend - step_tstart,
+ "time_elapsed": step_tend - self.tstart


I think we can drop all but the first 3 stats, since they're already calculated by rllib, right?

Can you put this in the info field of the result instead?

ericl · 2018-08-23T00:18:25Z

python/ray/rllib/agents/ars/tabular_logger.py

+DISABLED = 50
+
+
+class TbWriter(object):


It would be great if we could not add this file.

ericl · 2018-08-23T00:19:07Z

python/ray/rllib/tuned_examples/regression_tests/cartpole-ars.yaml

@@ -0,0 +1,17 @@
+# can expect improvement to -140 reward in ~300-500k timesteps


Nice! Could we also add an entry for ARS in test_supported_spaces.py?

ericl · 2018-08-23T00:19:56Z

Lint failed: you should run scripts/yapf.sh to fix.

eugenevinitsky · 2018-08-24T02:01:55Z

I get: ./yapf.sh: line 51: mapfile: command not found
Do you know what this might refer to?

I did run flake8 to check; is that not sufficient?

ericl · 2018-08-24T05:25:04Z

@eugenevinitsky that seems to be a compatibility problem with the script; this should fix it: #2735

Otherwise, I can run it.

eugenevinitsky · 2018-08-24T18:08:19Z

Addressed the other comments; still blocked on the yapf thing.
Pulling in the fix leads to:
From https://github.com/ray-project/ray
 * branch master -> FETCH_HEAD
xargs: yapf: No such file or directory

AmplabJenkins · 2018-08-24T19:13:56Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7734/
Test FAILed.

AmplabJenkins · 2018-08-24T19:54:27Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7735/
Test FAILed.

AmplabJenkins · 2018-08-24T21:59:44Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7739/
Test FAILed.

ericl

Looks good. Did you include the filter change though?

AmplabJenkins · 2018-08-24T23:08:55Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7740/
Test FAILed.

eugenevinitsky · 2018-08-24T23:10:18Z

Yeah, it's changing np.zeros to np.ones in utils.filter

eugenevinitsky · 2018-08-24T23:15:22Z

python/ray/rllib/utils/filter.py

@@ -62,7 +62,7 @@ class RunningStat(object):
     def __init__(self, shape=None):
         self._n = 0
         self._M = np.zeros(shape)
-        self._S = np.zeros(shape)
+        self._S = np.ones(shape)


The relevant filter change, I think.
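To make the effect of that one-line change concrete, here is a small Welford-style running statistic. This is a hypothetical stand-in, not the `RunningStat` class from `filter.py`; the class name `ToyRunningStat` and the `s_init` parameter are assumptions introduced only for illustration.

```python
import numpy as np


class ToyRunningStat:
    """Hypothetical Welford-style running mean/variance (not the RLlib class)."""

    def __init__(self, shape=(), s_init=0.0):
        self.n = 0
        self.mean = np.zeros(shape)
        # Running sum of squared deviations; s_init=1.0 mimics the np.ones change.
        self.s = np.full(shape, s_init, dtype=float)

    def push(self, x):
        # Standard Welford update for the mean and the squared-deviation sum.
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean
        self.mean = self.mean + delta / self.n
        self.s = self.s + delta * (x - self.mean)

    @property
    def var(self):
        # With fewer than two samples, fall back to the raw accumulator.
        return self.s / (self.n - 1) if self.n > 1 else self.s


if __name__ == "__main__":
    for s_init in (0.0, 1.0):
        stat = ToyRunningStat(s_init=s_init)
        for x in (1.0, 2.0, 3.0):
            stat.push(x)
        print(s_init, float(stat.var))
```

In this toy version, starting the accumulator at 1 makes the reported variance equal 1 before any data arrives, but it also shifts every later estimate: for the samples 1, 2, 3 the variance comes out as 1.5 instead of the sample variance 1.0. A shift of this kind could plausibly trip tests that compare against exact statistics, though that is an assumption about the failures rather than something stated in this thread.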

eugenevinitsky · 2018-08-24T23:16:22Z

Thanks for those fixes btw

AmplabJenkins · 2018-08-24T23:49:40Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7741/
Test FAILed.

AmplabJenkins · 2018-08-25T01:20:46Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7743/
Test FAILed.

AmplabJenkins · 2018-08-25T01:30:04Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7744/
Test FAILed.

AmplabJenkins · 2018-08-25T01:41:38Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7745/
Test FAILed.

AmplabJenkins · 2018-08-25T02:47:34Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7750/
Test PASSed.

AmplabJenkins · 2018-08-25T02:51:49Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7749/
Test PASSed.

AmplabJenkins · 2018-08-25T04:14:08Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/7752/
Test PASSed.

ericl · 2018-08-25T05:19:38Z

Going to merge this sans the filter change, since that breaks some of the filter unit tests.

akaniklaus · 2019-01-14T10:53:24Z

Thanks a lot for this. Is there any particular reason why this is not utilizing GPU for computation?

eugenevinitsky added 10 commits July 3, 2018 15:15

added ars

bedbfdf

functioning ars with regression test

c93ad7c

added regression tests for ARs

55fddf6

minor

9307b3d

fixed default config for ARS

4b4c202

ARS code runs, now time to test

f47018e

ARS working and tested, changed std deviation of meanstd filter to in…

86de032

…itialize to 1

ARS working and tested, changed std deviation of meanstd filter to in…

bce8008

…itialize to 1

pep8 fixes

7b4434a

removed unused linear model

6d7a07c

robertnishihara changed the title ~~Ars pull~~ [rllib] add augmented random search Aug 22, 2018

ericl self-assigned this Aug 22, 2018

ericl reviewed Aug 23, 2018

address comments

0c3d8d8

eugenevinitsky added 2 commits August 24, 2018 11:03

Merge branch 'master' of https://github.com/ray-project/ray into ars_…

6b52448

…pull

more fixing comments

672d391

post yapf

593a43d

eugenevinitsky added 2 commits August 24, 2018 13:54

Merge branch 'master' of https://github.com/ray-project/ray into ars_…

b168e77

…pull

fixed support failure

b2b272a

Update LICENSE

44ca77f

ericl approved these changes Aug 24, 2018

Update policies.py

d003d37

eugenevinitsky commented Aug 24, 2018

ericl added 3 commits August 24, 2018 17:19

Update test_supported_spaces.py

0ab1599

Update policies.py

1328a05

Update LICENSE

0d1cea8

ericl added 2 commits August 24, 2018 18:33

Update test_supported_spaces.py

dbffe89

Update policies.py

1b0be6d

Update policies.py

da41a6f

Update filter.py

ccef489

ericl merged commit 6201a6d into ray-project:master Aug 25, 2018
