
[release] update torch_tune_serve_test to use anyscale connect #16754

Merged: 4 commits into ray-project:master on Jul 7, 2021

Conversation

matthewdeng (Contributor)

Why are these changes needed?

As part of the Release Test process, these tests are being migrated to use the Anyscale connect API.

This Golden Notebook test was originally added in #16619.

Note

get_best_model is wrapped in ray.remote so that the checkpoint is loaded from the head node.
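
For reference, a minimal sketch of the ray.remote wrapping described above (not the PR's exact code; the checkpoint path and loader body here are illustrative placeholders):

import ray

ray.init()  # the release test connects through Anyscale connect / Ray Client instead

def get_best_model(checkpoint_path):
    # Illustrative stand-in for the test's loader, which torch.load()s the
    # checkpoint and returns an identifier for the best model.
    return f"model-from-{checkpoint_path}"

# Wrapping the function with ray.remote turns the call into a Ray task, so the
# checkpoint file is read on a cluster node (the head node) rather than on the
# client machine driving the script.
get_best_model_remote = ray.remote(get_best_model)
model_id = ray.get(get_best_model_remote.remote("/tmp/best_checkpoint"))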

Related issue number

n/a

Checks

e2e job

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@@ -186,7 +189,8 @@ def test_predictions(test_mode=False):

    print("Retrieving best model.")
    best_checkpoint = analysis.best_checkpoint
    model_id = get_best_model(best_checkpoint)
    get_best_model_remote = ray.remote(get_best_model)
@amogkam (Contributor) commented on Jun 29, 2021:

Is it possible to test out the ray.client().download_results() workflow here instead of wrapping in a task?
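
For context, a rough sketch of the workflow this comment suggests; download_results is referenced in a later commit on this PR, but its placement on the client builder and its signature below are assumptions, not taken from the Anyscale connect API:

import ray

# Hypothetical chaining: download_results is assumed here to be a builder
# option on the Ray Client / Anyscale connect builder; the real signature
# and address are illustrative and may differ.
ray.client("anyscale://my-cluster") \
    .download_results(local_dir="./results") \
    .connect()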

@amogkam (Contributor) left a comment:

Nice! Is this working with the release tool?

@krfricke (Contributor) left a comment:

Looks good! I was just wondering if this script also works with Ray OSS.

    model_state = torch.load(best_model_checkpoint_path)
def get_remote_model(remote_model_checkpoint_path):
    address = os.environ.get("RAY_ADDRESS")
    if address is not None and address.startswith("anyscale://"):
A Contributor commented:

Would this also work with the Ray OSS client?

@matthewdeng (Contributor, Author) replied:

Yep, for OSS this would go to the else block and use ray.remote to fetch the model. I updated the code with some better path isolation and successfully ran this script on a Ray cluster!
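
A condensed sketch of the branching described in this reply; the Anyscale-side download step is represented by a hypothetical stub, while the else branch mirrors the ray.remote pattern from the diff above:

import os
import ray
import torch

def download_checkpoint_locally(remote_path):
    # Hypothetical stand-in for fetching the checkpoint from the Anyscale
    # cluster to the local machine (the PR reportedly uses download_results).
    raise NotImplementedError

def get_remote_model(remote_model_checkpoint_path):
    address = os.environ.get("RAY_ADDRESS")
    if address is not None and address.startswith("anyscale://"):
        # Anyscale connect: bring the checkpoint to the local machine, then load it.
        local_path = download_checkpoint_locally(remote_model_checkpoint_path)
        return torch.load(local_path)
    else:
        # OSS Ray / Ray Client: load the checkpoint where it lives by running the
        # loader as a Ray task on the cluster and shipping the result back.
        load_task = ray.remote(torch.load)
        return ray.get(load_task.remote(remote_model_checkpoint_path))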

@amogkam merged commit 23088bd into ray-project:master on Jul 7, 2021
@matthewdeng deleted the torch-tune-serve branch on July 7, 2021 at 02:14
jiaodong pushed a commit that referenced this pull request on Jul 11, 2021:
* [release] update torch_tune_serve_test to use anyscale connect

* use download_results to download model checkpoint

* clean up code to support both OSS and Anyscale