[rllib] Basic port of baselines/deepq to rllib #709

ericl · 2017-07-05T02:36:40Z

This is a straightforward adaptation of the baselines DQN implementation to conform to the RLlib API. The files to review are rllib/dqn/dqn.py and rllib/dqn/example.py; the rest were mostly copied with linter fixes only.

I also fixed up the licensing here by appending the OpenAI MIT license to the top-level LICENSE file.

I have a couple of ideas on how to parallelize this with Ray in a follow-up PR:

First, we can parallelize rollouts. To preserve algorithm semantics, however, train_freq must be large enough to allow sufficient parallelism between training steps. Increasing train_freq will probably also require an equivalent increase in batch_size.
Second, we can parallelize the optimization step. This also requires increasing batch_size. We might also consider running multiple optimization steps over replay buffer samples, similar to policy gradient.
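
To make the first idea concrete, here is a rough sketch of how rollouts between two training steps might be split across workers. It is not code from this PR: it uses a standard-library thread pool as a stand-in for Ray remote tasks, and `rollout`, `collect_parallel`, `scaled_batch_size`, the dummy transitions, and the base values 32 and 4 are all hypothetical names and numbers chosen for illustration. Scaling batch_size proportionally to train_freq is one possible reading of "equivalent increase", not a confirmed design.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def rollout(worker_seed, num_steps):
    """Hypothetical stand-in for one worker stepping its own environment copy."""
    rng = random.Random(worker_seed)
    # Each transition is (obs, action, reward, next_obs, done); values are dummies.
    return [(rng.random(), rng.randrange(2), rng.random(), rng.random(), False)
            for _ in range(num_steps)]


def collect_parallel(num_workers, train_freq, iteration=0):
    """Gather train_freq transitions between two training steps."""
    if train_freq % num_workers != 0:
        raise ValueError("train_freq must be divisible by num_workers")
    steps_per_worker = train_freq // num_workers
    seeds = [iteration * num_workers + i for i in range(num_workers)]
    # Thread pool used here only as a placeholder for Ray remote tasks.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        chunks = list(pool.map(rollout, seeds, [steps_per_worker] * num_workers))
    return [t for chunk in chunks for t in chunk]


def scaled_batch_size(base_batch_size, base_train_freq, train_freq):
    """Assumption: keep sampled-to-collected transitions at a constant ratio."""
    return base_batch_size * train_freq // base_train_freq


replay_buffer = []
train_freq = 16  # large enough to give 4 workers 4 steps each
batch_size = scaled_batch_size(32, 4, train_freq)
for it in range(3):
    replay_buffer.extend(collect_parallel(4, train_freq, it))
    batch = random.sample(replay_buffer, min(batch_size, len(replay_buffer)))
    # A real implementation would run one optimization step on `batch` here.
```

The policy stays fixed while the workers collect their share of the train_freq steps, which is why train_freq bounds the amount of parallelism available between training steps.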

There is also literature on parallelizing DQN in other ways, but that might be out of scope for now.

On a GPU instance, the Pong example spends about equal time in training and rollouts, so parallelizing either one could be valuable.

cc @pcmoritz @royf

AmplabJenkins · 2017-07-05T02:46:44Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-07-05T02:46:45Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1177/

AmplabJenkins · 2017-07-07T15:18:32Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-07-07T15:18:33Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1195/

AmplabJenkins · 2017-07-07T16:21:50Z

Merged build finished. Test PASSed.

AmplabJenkins · 2017-07-07T16:21:51Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/1199/

ericl added 25 commits June 23, 2017 23:40

rllib v0

a9d1994

fix imports

8eb9682

lint

48cdefc

comments

198a6a4

update docs

53ec755

a3c wip

ede03c7

Merge remote-tracking branch 'upstream/master' into rllib-a3c

8c124eb

a3c wip

585d583

report stats

7218899

update doc

6113213

add common logdir attr

afc45b7

name is too long

f86c3fc

Merge branch 'rllib-a3c' into rllib-logdir

f191156

fix small bug

173aac1

propagate exception on error

f34a2fb

fetch metrics

d631bdd

Merge branch 'rllib-a3c' into rllib-logdir

95a89e7

Merge remote-tracking branch 'upstream/master' into rllib-logdir

65561e8

initial port

44dc6c5

fix lint

971da00

add right license

d57a23d

port to common alg format

46e5cfd

fix lint

1785798

Merge remote-tracking branch 'upstream/master' into rllib-dqn

a68d590

rename dqn

c820381

add imports from future

81cfea5

fix lint

e1ff398

pcmoritz merged commit f012e59 into ray-project:master Jul 7, 2017
