[Serve][Doc] Triton server integration #41923
Signed-off-by: Sihan Wang <[email protected]>
This guide shows how to serve models with [NVIDIA Triton Server](https://github.com/triton-inference-server/server) using Ray Serve.

## Installation
Here is the Dockerfile example for installing Triton Server with Ray Serve.
Waiting for the Dockerfile from NVIDIA.
Ping me when ready for review.
Sounds good, let me know if you think anything needs to be added. The main thing we are still waiting on from NVIDIA is the Dockerfile.
@akshay-anyscale PTAL
## Start Ray Serve with the Triton Server
Triton Server provides a Python API for starting a Triton Server instance. You can use the `nvcr.io/nvidia/tritonserver:23.12-py3` image, which already has the Triton Server Python API library installed.
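
As a rough sketch of what this could look like, the following Python starts an in-process Triton Server from inside a Ray Serve deployment. It assumes that the `tritonserver` package from the image above and `ray[serve]` are both installed. The model repository path `/workspace/models`, the helper names `start_triton_server` and `build_deployment`, and the one-GPU-per-replica setting are illustrative assumptions, not something this PR specifies.

```python
from typing import Any


def start_triton_server(model_repository: str = "/workspace/models") -> Any:
    """Start an in-process Triton Server and return the server handle.

    The default repository path is a placeholder assumption.
    """
    # Imported lazily: the `tritonserver` package ships inside the Triton
    # container image and is usually absent on a development machine.
    import tritonserver

    server = tritonserver.Server(
        model_repository=[model_repository],
        # EXPLICIT: models are loaded only when explicitly requested,
        # not automatically at startup.
        model_control_mode=tritonserver.ModelControlMode.EXPLICIT,
        log_info=False,
    )
    server.start(wait_until_ready=True)
    return server


def build_deployment():
    """Wrap the Triton Server in a Ray Serve deployment (needs `ray[serve]`)."""
    from ray import serve

    @serve.deployment(ray_actor_options={"num_gpus": 1})
    class TritonDeployment:
        def __init__(self) -> None:
            # One Triton Server instance per Serve replica.
            self._triton_server = start_triton_server()

    return TritonDeployment
```

Inside the container, something like `serve.run(build_deployment().bind())` would then start one Triton Server per replica. The imports are deferred into the functions so that the module can still be imported on a machine that has neither package installed.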
Can we add the complete instructions for how to build an image that has both triton and ray?
done.
Hi @edoakes, the PR is updated, PTAL.
just nits
cherrypick #41923