
[all_tests][core][dashboard] Fix node logical resources reporting with autoscaler v2 #38440

Merged
merged 3 commits into ray-project:master on Aug 16, 2023

Conversation

rickyyx
Contributor

@rickyyx rickyyx commented Aug 15, 2023

Why are these changes needed?

With autoscaler v2, the monitor process no longer writes the cluster status into the KV store. Instead, we can get the data directly from GCS through get_cluster_status, so we use that to populate the logical resources info.

Added unit tests.
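The data flow described above could be sketched roughly as follows. This is a minimal illustration only, not Ray's actual API: the function names, node ids, and the shape of the status object are all hypothetical.

```python
# Hypothetical sketch: turn a cluster-status-like object (as might be
# fetched from GCS) into per-node logical-resource display strings.
# Names and data shapes are made up for illustration.

def format_usage(used: float, total: float, name: str) -> str:
    """Render one resource line, e.g. '0.0/1.0 CPU'."""
    return f"{used}/{total} {name}"

def per_node_logical_resources(cluster_status: dict) -> dict:
    """Map node id -> newline-joined usage lines for the dashboard."""
    per_node = {}
    for node_id, usage in cluster_status.items():
        lines = [format_usage(u, t, name) for name, (u, t) in usage.items()]
        per_node[node_id] = "\n".join(lines)
    return per_node

# Example: one node with a fully idle CPU.
status = {"6eb8d224": {"CPU": (0.0, 1.0)}}
print(per_node_logical_resources(status))  # {'6eb8d224': '0.0/1.0 CPU'}
```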

Related issue number

Closes #38404
Closes #38332

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: rickyyx <[email protected]>
@rickyyx rickyyx changed the title [core][dashboard] Fix node logical resources reporting with autoscaler v2 [all_test][core][dashboard] Fix node logical resources reporting with autoscaler v2 Aug 15, 2023
@rickyyx rickyyx changed the title [all_test][core][dashboard] Fix node logical resources reporting with autoscaler v2 [all_tests][core][dashboard] Fix node logical resources reporting with autoscaler v2 Aug 15, 2023
parse_usage(usage_dict, verbose=True)
)

return per_node_resources
Contributor
Do you have example data for per_node_resources? I would like to confirm that the frontend renders it as expected.

Contributor Author

{
    'nodeLogicalResources': {
            '6eb8d224b678a6793679cc189e7bc3ef787033005bc3de61ad3d29a2': '0.0/1.0 CPU\n0B/27.53GiB memory\n0B/13.76GiB object_store_memory'
    }
}
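The memory fields in the payload above use binary units ("27.53GiB"). A helper producing such strings might look like the sketch below; this is an assumption for illustration, not Ray's actual formatter.

```python
def fmt_bytes(n: float) -> str:
    """Format a byte count with binary units, e.g. 1536 -> '1.50KiB'.

    Values below 1 KiB render without decimals ('0B'), matching strings
    like '0B/27.53GiB memory' in the payload above. Hypothetical helper.
    """
    if n < 1024:
        return f"{int(n)}B"
    for unit in ("KiB", "MiB", "GiB", "TiB"):
        n /= 1024
        if n < 1024 or unit == "TiB":
            return f"{n:.2f}{unit}"

print(fmt_bytes(0))      # 0B
print(fmt_bytes(1536))   # 1.50KiB
```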

Contributor Author
also see the added e2e test.

Contributor
Thanks!

def test_get_nodes_summary(call_ray_start):

    # The sleep is needed since it seems a previous shutdown could be not yet
    # done when the next test starts. This prevents a previous cluster to be
Contributor
Good catch! I believe I ran into a similar issue before.
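A fixed sleep works but can itself be flaky if the previous shutdown takes longer than expected. A common alternative (a generic sketch, not what this PR does, and `wait_until` is a hypothetical helper) is to poll until the previous cluster is actually gone:

```python
import time

def wait_until(predicate, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll `predicate` until it returns True or `timeout` elapses.

    Returns True on success, False on timeout. Hypothetical test-setup
    helper, e.g. wait_until(lambda: not previous_cluster_alive()).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one last check at the deadline
```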

@chaowanggg
Contributor

LGTM!

@rickyyx
Copy link
Contributor Author

rickyyx commented Aug 16, 2023

Release test with v2 working with logical node resources:
[screenshot]

@rickyyx rickyyx merged commit 329b409 into ray-project:master Aug 16, 2023
113 of 128 checks passed
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023