[RLlib] Introduce Connector Metrics #31618

Merged
merged 9 commits into from
Jan 18, 2023

Conversation

ArturNiederfahrenhorst
Contributor

Signed-off-by: Artur Niederfahrenhorst [email protected]

Why are these changes needed?

To better understand the influence of connector runtime, this PR introduces a first set of metrics and makes the changes needed to easily add more connector-related metrics.

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@@ -19,10 +21,17 @@
class ActionConnectorPipeline(ConnectorPipeline, ActionConnector):
    def __init__(self, ctx: ConnectorContext, connectors: List[Connector]):
        super().__init__(ctx, connectors)
        self.timers = defaultdict(_Timer)
Contributor Author

I thought about putting this in the ConnectorPipeline class, but that class would then hold timers without any functionality related to them, and it would make the inheritance model we have here even weirder.
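
For readers following this thread, here is a minimal sketch of what per-connector timing with a `defaultdict` of timers could look like. `SimpleTimer`, `TimedPipeline`, `AddOne`, and `Double` are hypothetical stand-ins written for this illustration. They are not RLlib's `_Timer` or `ActionConnectorPipeline`, and the merged implementation may differ.

```python
import time
from collections import defaultdict


class SimpleTimer:
    """Hypothetical stand-in for a timer: accumulates wall-clock time."""

    def __init__(self):
        self.total = 0.0
        self.count = 0
        self._start = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.total += time.perf_counter() - self._start
        self.count += 1

    @property
    def mean(self):
        # Mean seconds per call; 0.0 if the timer was never used.
        return self.total / self.count if self.count else 0.0


class TimedPipeline:
    """Illustrative pipeline that times each connector by class name."""

    def __init__(self, connectors):
        self.connectors = connectors
        # One timer per connector, created lazily on first access.
        self.timers = defaultdict(SimpleTimer)

    def __call__(self, data):
        for connector in self.connectors:
            with self.timers[type(connector).__name__]:
                data = connector(data)
        return data


class AddOne:
    def __call__(self, x):
        return x + 1


class Double:
    def __call__(self, x):
        return x * 2


pipeline = TimedPipeline([AddOne(), Double()])
result = pipeline(3)  # (3 + 1) * 2
mean_ms = {name: t.mean * 1000 for name, t in pipeline.timers.items()}
```

Keeping the timers on the concrete pipeline, as in this sketch, means a base class without timing logic does not need to know about them.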

Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@ArturNiederfahrenhorst ArturNiederfahrenhorst added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Jan 14, 2023
Member

@gjoliver gjoliver left a comment

have some high-level questions first.
thanks for the nice PR!

        self.timers = defaultdict(_Timer)

    def reset(self, env_id: str):
        self.timers.clear()
Member

maybe we don't want to reset timers.
like nobody is resetting action connector right now.

Contributor Author

Changed, thanks!
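
A rough sketch of the behavior suggested in this thread, assuming timers should outlive `reset`: only per-environment state is dropped, and the accumulated timings are kept. `PipelineWithReset`, `record`, `episode_state`, and the connector name are hypothetical names used only for illustration.

```python
from collections import defaultdict


class PipelineWithReset:
    """Hypothetical pipeline; names here are illustrative, not RLlib's."""

    def __init__(self):
        self.timers = defaultdict(float)  # accumulated seconds per connector
        self.episode_state = {}           # state keyed by environment id

    def record(self, name, seconds):
        self.timers[name] += seconds

    def reset(self, env_id):
        # Drop only the state tied to this environment.
        # The timers are deliberately left untouched.
        self.episode_state.pop(env_id, None)


p = PipelineWithReset()
p.episode_state["env_0"] = {"step": 5}
p.record("connector_a", 0.002)
p.reset("env_0")
p.record("connector_a", 0.003)
```

With this shape, timings from before and after the reset end up in the same accumulated value.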

rllib/connectors/agent/pipeline.py (outdated, resolved)
# Create connector metrics
connector_metrics = {}
for policy_id, policy in self._worker.policy_map.items():
    connector_metrics[policy_id] = policy.get_connector_throughput_metrics()
Member

this could be pretty costly. any chance we can only update the policies that are used for these episodes?
at the very least, let's only update the cached policies.

Contributor Author

Changed, thanks! 👍
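
To illustrate the suggestion above, here is a hedged sketch of collecting metrics only from policies that are currently cached rather than from every policy in the map. `FakePolicy`, `FakePolicyMap`, `cached_items`, and `collect_connector_metrics` are hypothetical helpers for this example. The method name `get_connector_throughput_metrics` mirrors the snippet under review, and how RLlib actually exposes cached policies is not shown in this thread.

```python
class FakePolicy:
    """Hypothetical policy that counts how often its metrics are read."""

    def __init__(self, metrics):
        self._metrics = metrics
        self.calls = 0

    def get_connector_throughput_metrics(self):
        self.calls += 1
        return self._metrics


class FakePolicyMap(dict):
    """Hypothetical map: all policies are known, only some are in memory."""

    def __init__(self, policies, cached_ids):
        super().__init__(policies)
        self.cached_ids = set(cached_ids)

    def cached_items(self):
        # Return only the (id, policy) pairs that are currently cached.
        return [(pid, p) for pid, p in self.items() if pid in self.cached_ids]


def collect_connector_metrics(policy_map):
    # Touch only cached policies so that uncached ones are never loaded.
    return {
        pid: policy.get_connector_throughput_metrics()
        for pid, policy in policy_map.cached_items()
    }


policies = {
    "a": FakePolicy({"mean_ms": 0.1}),
    "b": FakePolicy({"mean_ms": 0.2}),
    "c": FakePolicy({"mean_ms": 0.3}),
}
policy_map = FakePolicyMap(policies, cached_ids=["a", "c"])
metrics = collect_connector_metrics(policy_map)
```

In this sketch the uncached policy "b" is never queried, which is the cost saving the review comment asks for.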

Signed-off-by: Artur Niederfahrenhorst <[email protected]>
@gjoliver
Member

a lot of test failures. can you take a look / re-trigger?

Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Member

@gjoliver gjoliver left a comment

looks awesome now. thanks. gonna merge.

@gjoliver gjoliver merged commit 2d774e2 into ray-project:master Jan 18, 2023
andreapiso pushed a commit to andreapiso/ray that referenced this pull request Jan 22, 2023
Signed-off-by: Artur Niederfahrenhorst <[email protected]>
Signed-off-by: Andrea Pisoni <[email protected]>
cassidylaidlaw pushed a commit to cassidylaidlaw/ray that referenced this pull request Mar 28, 2023
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
2 participants