[RLLib] Make movement of tensors to device only happen once #36091
Conversation
rllib/core/learner/learner.py
Outdated
for minibatch in batch_iter(batch, minibatch_size, num_iters):
    # Convert minibatch into a tensor batch (NestedDict).
    tensor_minibatch = self._convert_batch_type(minibatch)
Can we add "... on the correct device (e.g. GPU)"? This would clarify further.
Also note that we only perform the copy to the correct device once, so we do not have to move data in each minibatch iteration below. Something like this.
done
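A minimal runnable sketch of the pattern settled on above: convert the whole batch to the correct device once, then slice minibatches from the already-converted batch. All names here are hypothetical stand-ins (`CountingConverter` mimics what `Learner._convert_batch_type` plus a device transfer would do; no real GPU is involved):

```python
from typing import Iterator, List


def batch_iter(batch: List[int], minibatch_size: int, num_iters: int) -> Iterator[List[int]]:
    """Yield `num_iters` contiguous minibatches, wrapping around the batch."""
    idx = 0
    for _ in range(num_iters):
        yield batch[idx:idx + minibatch_size]
        idx = (idx + minibatch_size) % len(batch)


class CountingConverter:
    """Hypothetical stand-in for a batch-to-device conversion; counts transfers."""

    def __init__(self):
        self.copies = 0

    def to_device(self, batch):
        self.copies += 1  # pretend this is one host-to-GPU copy
        return list(batch)


def run_update_loop(converter, batch, minibatch_size, num_iters):
    # Convert/copy the whole batch to the correct device ONCE ...
    tensor_batch = converter.to_device(batch)
    processed = []
    # ... so the loop below only slices; no per-minibatch transfer happens.
    for minibatch in batch_iter(tensor_batch, minibatch_size, num_iters):
        processed.append(minibatch)
    return processed


converter = CountingConverter()
out = run_update_loop(converter, [1, 2, 3, 4], minibatch_size=2, num_iters=2)
print(converter.copies)  # 1 transfer, regardless of num_iters
```

The transfer count stays at one however many minibatch iterations run, which is the point of the change under review.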
rllib/utils/minibatch_utils.py
Outdated
@@ -1,5 +1,6 @@
import math

two empty lines?
I can get rid of these; they're left over from previous commits.
rllib/policy/sample_batch.py
Outdated
@@ -1631,7 +1635,22 @@ def _concat_values(*values, time_major=None) -> TensorType:
        time_major: Whether to concatenate along the first axis
            (time_major=False) or the second axis (time_major=True).
    """
    return np.concatenate(list(values), axis=1 if time_major else 0)
    if torch and isinstance(values[0], torch.Tensor):
Can you explain why we need these changes (add comment)?
Also, let's use torch.is_tensor().
done.
rllib/policy/sample_batch.py
Outdated
        return torch.cat(values, dim=1 if time_major else 0)
    elif isinstance(values[0], np.ndarray):
        return np.concatenate(values, axis=1 if time_major else 0)
    elif tf and isinstance(values[0], tf.Tensor):
Can we use tf.is_tensor() instead?
done.
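Putting the review suggestions together, a hedged sketch of the framework-dispatched concatenation helper might look like the following. This is not the final RLlib implementation, only an illustration of the agreed pattern: `torch.is_tensor()` / `tf.is_tensor()` instead of `isinstance` checks, with a NumPy fallback. The guarded imports keep the helper usable when torch or TensorFlow is not installed:

```python
import numpy as np

try:
    import torch
except ImportError:
    torch = None
try:
    import tensorflow as tf
except ImportError:
    tf = None


def concat_values(*values, time_major=False):
    """Concatenate along axis 1 if time_major else axis 0, preserving the input type."""
    axis = 1 if time_major else 0
    # Dispatch on the type of the first value, as in the diff above.
    if torch is not None and torch.is_tensor(values[0]):
        return torch.cat(list(values), dim=axis)
    if tf is not None and tf.is_tensor(values[0]):
        return tf.concat(list(values), axis=axis)
    return np.concatenate(list(values), axis=axis)


a = np.array([[1, 2]])
b = np.array([[3, 4]])
print(concat_values(a, b).shape)                   # (2, 2)
print(concat_values(a, b, time_major=True).shape)  # (1, 4)
```

Using `is_tensor()` avoids importing the tensor classes directly and also matches tensor subclasses the frameworks may hand back.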
Looks great. Just a few nits and questions. Thanks for this important enhancement @avnishn !
LGTM. Let's wait for all tests to pass again, then I'll merge. ...
Our current minibatching logic in the Learner stack forces each individual minibatch to be moved to the GPU after it has been sliced. This is wasteful: it creates unnecessary copies of batches and adds unnecessary host-to-device transfers. This PR addresses that by moving the whole batch to the GPU first, then performing any minibatching operations on it.
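To make the saving concrete, here is a toy comparison of the two strategies. A copy counter stands in for a host-to-GPU transfer (no real GPU, and none of these names come from RLlib):

```python
class Transfer:
    """Hypothetical device bridge: counts how many host-to-GPU copies occur."""

    def __init__(self):
        self.count = 0

    def to_gpu(self, data):
        self.count += 1
        return list(data)


def old_style(transfer, batch, size):
    # Previous behavior: copy each minibatch to the device after slicing it.
    for i in range(0, len(batch), size):
        transfer.to_gpu(batch[i:i + size])


def new_style(transfer, batch, size):
    # New behavior: copy the whole batch once, then slice on-device.
    on_device = transfer.to_gpu(batch)
    for i in range(0, len(on_device), size):
        _ = on_device[i:i + size]  # slicing incurs no extra transfer


old, new = Transfer(), Transfer()
batch = list(range(8))
old_style(old, batch, 2)
new_style(new, batch, 2)
print(old.count, new.count)  # 4 1
```

With 8 samples and minibatches of 2, the old path performs four transfers while the new path performs one; the gap grows with batch size and the number of SGD iterations.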
Why are these changes needed?
Related issue number
Checks
- I've signed every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I have added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.