
[serve][docs] Add dev workflow page #27746

Merged: 33 commits merged into ray-project:master on Aug 12, 2022

Conversation

architkulkarni (Contributor):

Why are these changes needed?

Adds a page describing a development workflow for Serve applications.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@architkulkarni (Contributor Author):

@shrekris-anyscale @edoakes @sihanwang41 @simon-mo This is still rough, but is ready for an initial content review.

I've left some questions and comments as "TODOs" that I need to resolve, but if you know the answer off the top of your head feel free to comment it. The code sample doesn't work at the moment, I think I'm doing something wrong with the ClassNode.

As a more general point, I don't know whether it's better to use `serve.run` with `.bind()` or to use `dag.execute` here. The two methods (dynamic dispatch with deployments passed into each other, vs. an `InputNode()` call graph) don't seem compatible, in the sense that you wouldn't start with one of them for dev and then switch to the other for prod (you could, but you'd be rewriting a lot of code).
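
For concreteness, here is a minimal sketch of the two styles being contrasted, assuming Ray 2.x-era Serve APIs; the `Greeter` deployment and its method are made-up placeholders, not the PR's actual example:

```python
import ray
from ray import serve
# Import path used in Ray 2.0-era deployment graph docs; it may differ by version.
from ray.serve.deployment_graph import InputNode

@serve.deployment
class Greeter:
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"

# Style 1: bind the deployment and drive it from Python via serve.run + a handle.
handle = serve.run(Greeter.bind())
print(ray.get(handle.greet.remote("Ray")))  # "Hello, Ray!"

# Style 2: build an InputNode call graph; this is the graph you would then
# drive with `dag.execute(...)` (or wrap in a driver deployment for HTTP).
with InputNode() as name:
    dag = Greeter.bind().greet.bind(name)
```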


> Here we used the `execute` method to directly send a Python object to the deployment, rather than an HTTP request.
>
> % TODO: We didn't use serve.run() like we said at the beginning; should we?

Contributor:

let's just use serve.run here instead of the dag execute thing for consistency. we can leave dag execute to the deployment graph docs

actually, maybe we should just not use deployment graph at all here and make it a simpler two-deployment app?

Contributor Author:

> actually, maybe we should just not use deployment graph at all here and make it a simpler two-deployment app?

I think this is the key question, or at least the main thing I'm confused about. Are we saying that an expected part of the dev to prod transition is rewriting the code to go from serve.run to using a deployment graph?

How do you actually make a two-deployment pipeline app without using a deployment graph? You'd need to pass in the second deployment in a constructor, right? But to then later convert it to a graph, you'd need to rewrite all the deployment code to remove the deployments from the constructors and use InputNode, which seems like a lot of work.
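
For reference, a rough sketch of the constructor-passing style described here, assuming Ray 2.x-era handle semantics (`.remote()` returns a reference that is then awaited for the result); the names mirror the `HelloDeployment`/doubling example used elsewhere in this PR but are otherwise illustrative:

```python
from ray import serve

@serve.deployment
class Doubler:
    def double(self, s: str) -> str:
        return s + " " + s

@serve.deployment
class HelloDeployment:
    def __init__(self, doubler):
        # The second deployment is passed in through the constructor;
        # at runtime Serve provides a handle to it here.
        self.doubler = doubler

    async def say_hello_twice(self, name: str) -> str:
        # Calling through the handle returns a reference; awaiting the
        # reference yields the result.
        ref = await self.doubler.double.remote(f"Hello, {name}!")
        return await ref

    async def __call__(self, request) -> str:
        return await self.say_hello_twice(request.query_params["name"])

# A two-deployment app composed without an InputNode graph; serve.run(app)
# would run it locally. Converting this to a call graph later would mean
# restructuring the constructors, which is the concern raised above.
app = HelloDeployment.bind(Doubler.bind())
```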

Contributor:

Another perspective: will using `serve.run` in dev slow down velocity, or does `execute` bring other benefits? If not, I don't think we need to include the `execute` information in the doc.

> You can use the `serve run` CLI command to run and test your application locally using HTTP to send requests (similar to how you might use `uvicorn run` if you're familiar with Uvicorn):
>
> ```bash
> serve run local_dev:HelloDeployment
> ```

Contributor:

would be nice to show the relevant info that gets printed to the console when you run this.

let's also mention that logs will be streamed. maybe make the deployment print something and show the output when you query it?
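
For example, a hypothetical `local_dev.py` sketch (not the PR's final code) where the deployment logs on each request, so that something visible streams to the `serve run` console when it's queried:

```python
# local_dev.py (illustrative sketch)
import logging

from ray import serve

# Messages sent to the "ray.serve" logger appear in the Serve logs, which
# `serve run` streams to the console.
logger = logging.getLogger("ray.serve")

@serve.deployment
class HelloDeployment:
    async def __call__(self, request) -> str:
        name = request.query_params.get("name", "world")
        logger.info("Got a request for name=%s", name)
        return f"Hello, {name}!"

# `serve run local_dev:app` would serve this and stream the log lines above.
app = HelloDeployment.bind()
```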

edoakes added the @author-action-required label (PR author is responsible for the next step; remove to send back to the reviewer) on Aug 10, 2022.

> Let's see a simple example.
>
> ```python
> ...
> ```

Contributor:

move the code to the doc_code dir?


shrekris-anyscale (Contributor) left a review:

Nice work so far! I've added some comments. I think it would be helpful to include info in each section about why or why not a user might want to use that development style. If it's a short reason, we could include that info directly in the header, so users don't have to read info that isn't relevant to them.

edoakes (Contributor) commented on Aug 11, 2022:

@architkulkarni you have DCO and lint failures


> When making the transition from your local machine to a remote cluster, you'll need to make sure your cluster has a similar environment to your local machine: files, environment variables, and Python packages, for example. During development, you can use {ref}`Runtime Environments<runtime-environments>` to manage this in a flexible way.
>
> See [Ray Client](ray-client-under-construction) for more information on the Ray address specified here by the `--address` parameter.

Contributor:

It would be good to just explain how to build the address inline instead of linking to another guide. Also, is there a way to run this without Ray Client?

Contributor Author:

Under the hood, `--address` is passed to `ray.init()` on the laptop. If you pass in a `ray://` address, it will use Ray Client. You could also specify a local Ray address, but for simplicity I'll leave that out of the guide.
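
In other words, a minimal sketch of the equivalence being described (the address is a placeholder):

```python
import ray

# Roughly the first thing `serve run --address=ray://<head-node-ip-address>:10001 ...`
# does on the laptop: connect to the remote cluster over Ray Client.
ray.init(address="ray://<head-node-ip-address>:10001")
```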

> Let's see a simple example.
>
> ```bash
> serve run --address=ray://<cluster-ip-address>:10001 --runtime-env-json='{"env_vars": {"MY_ENV_VAR": "my-value"}, "working_dir": "./project/src", "pip": ["requests", "chess"]}' local_dev:HelloDeployment
> ```

Contributor:

Can we split this example into multiple steps to make it more digestible? Concrete suggestion:

  1. First explain how to run Serve on a remote cluster, no runtime env involved.
  2. Then show the common case that's currently mentioned at the end: "A common pattern is to use the root directory of your project as the working_dir of your runtime environment when testing on a remote cluster."
  3. End with the advanced case where a full runtime env json / yaml is used.

Contributor Author:

For step 1, I don't think there's a dev workflow that doesn't use runtime_env, because we need runtime_env to get the local code onto the cluster.

Why don't I try this ordering to make it more digestible:

  1. Just use --working-dir
  2. Use a full --runtime-env-json
  3. Mention runtime_env YAML
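
To make the relationship between these steps concrete, here is a small illustrative sketch (field values taken from the example command earlier in this thread) of the runtime_env expressed as a Python dict, which is what both the `--runtime-env-json` string and a runtime_env YAML file encode:

```python
import json

# Step 1: --working-dir alone only ships the code.
code_only = {"working_dir": "./project/src"}

# Steps 2 and 3: a fuller runtime_env with extra dependencies, matching the
# --runtime-env-json example earlier in this thread; the same mapping could
# also be written out as a runtime_env YAML file.
full_env = {
    "working_dir": "./project/src",
    "env_vars": {"MY_ENV_VAR": "my-value"},
    "pip": ["requests", "chess"],
}

# This is the string that would be passed via --runtime-env-json.
print(json.dumps(full_env))
```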

Contributor:

Ah I see. Yes, that flow sounds good, thanks!

Contributor Author:

I gave this a shot, let me know how it looks

stephanie-wang added the @author-action-required label (PR author is responsible for the next step; remove to send back to the reviewer) on Aug 11, 2022.
stephanie-wang self-assigned this on Aug 11, 2022.
@architkulkarni (Contributor Author):

@stephanie-wang Thanks for the review! Addressed/responded to the comments, could you please take another look?

architkulkarni removed the @author-action-required label on Aug 11, 2022.
Comment on lines 82 to 99
When making the transition from your local machine to a remote cluster, you'll need to make sure your cluster has a similar environment to your local machine--files, environment variables, and Python packages, for example. During development, you can use {ref}`Runtime Environments<runtime-environments>` to manage these dependencies.

Let's see a simple example. Run the following command on your local machine, with your remote cluster head node IP address substituted for `<head-node-ip-address>` in the command:

```bash
serve run --address=ray://<head-node-ip-address>:10001 --working_dir="./project/src" local_dev:HelloDeployment
```

This will connect to the remote cluster via Ray Client, upload the `working_dir` directory, and run your serve application. Here, the local directory specified by `working_dir` must contain `local_dev.py` so that it can be uploaded to the cluster and imported by Ray Serve.

Once this is up and running, we can send requests to the application:

```bash
curl -X PUT http://<head-node-ip-address>:8000/?name=Ray
# Hello, Ray! Hello, Ray!
```

If your runtime environment contains fields other than `working_dir`, use the `--runtime-env-json` argument:
Contributor:

Suggested change (replacing the quoted lines above with the following):
When making the transition from your local machine to a remote cluster, you'll need to make sure your cluster has a similar environment to your local machine: files, environment variables, and Python packages, for example.

Let's see a simple example that just packages the code. Run the following command on your local machine, with your remote cluster head node IP address substituted for `<head-node-ip-address>` in the command:

```bash
serve run --address=ray://<head-node-ip-address>:10001 --working_dir="<directory containing local_dev.py>" local_dev:HelloDeployment
```

This will connect to the remote cluster via Ray Client, upload the directory specified by `--working_dir`, and run your Serve application in this directory. We can make `local_dev.py` importable by Ray Serve by making sure that it is contained in the directory passed to `--working-dir`.

Once this is up and running, we can send requests to the application:

```bash
curl -X PUT http://<head-node-ip-address>:8000/?name=Ray
# Hello, Ray! Hello, Ray!
```

For more complex dependencies, including files outside the working directory, environment variables, and Python packages, you can use {ref}`Runtime Environments<runtime-environments>` to manage these dependencies. Here is an example using the `--runtime-env-json` argument:

Contributor:

Oh no, I think my suggestion got mis-formatted by GitHub 🗡️

But basically I was trying to:

  1. move the mention of runtime envs closer to where it's actually used in a command (for a first-time user, I don't think it will be obvious that --working-dir is just syntactic sugar for a runtime env)
  2. explain semantics of --working-dir

Contributor Author:

Makes sense!

stephanie-wang (Contributor) left a review:

Thanks for the changes, it looks great! Left one more suggestion.

edoakes merged commit 92e315f into ray-project:master on Aug 12, 2022.
Commits referencing this pull request:

  • gramhagen pushed a commit to gramhagen/ray on Aug 15, 2022
  • Stefan-1313 pushed a commit to Stefan-1313/ray_mod on Aug 18, 2022 (Signed-off-by: Stefan van der Kleij <[email protected]>)
  • JiahaoYao pushed a commit to JiahaoYao/ray on Aug 21, 2022
  • JiahaoYao pushed two commits to JiahaoYao/ray on Aug 22, 2022
  • ArturNiederfahrenhorst pushed a commit to ArturNiederfahrenhorst/ray on Sep 1, 2022 (Signed-off-by: Artur Niederfahrenhorst <[email protected]>)

Each of these commits carries the PR description ("Adds a page describing a development workflow for Serve applications.") and the trailers Co-authored-by: Edward Oakes <[email protected]>, Co-authored-by: shrekris-anyscale <[email protected]>, and Co-authored-by: Stephanie Wang <[email protected]>.