
adding the triton docker build minimal example #242

Merged
merged 4 commits into sgl-project:main on Mar 12, 2024

Conversation

amirarsalan90
Contributor

Adding a minimal example that builds a Docker container to serve SGLang with Triton Inference Server using the Python backend.

@amirarsalan90 mentioned this pull request on Feb 28, 2024
@isaac-vidas
Contributor

isaac-vidas commented Feb 28, 2024

Thanks for this example!

Would it be possible to run the server from inside the model.py file with:

runtime = sgl.Runtime(model_path="mistralai/Mistral-7B-Instruct-v0.2")
sgl.set_default_backend(runtime)
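
For reference, a rough sketch of how that could be wired into the Triton python backend's model.py. This is only an illustration: the repository layout, tensor names, and sgl frontend calls below are assumptions, not this PR's actual files.

# Assumed model-repository layout (Triton convention; names illustrative):
#   models/
#     sglang/
#       config.pbtxt
#       1/
#         model.py

import numpy as np
import triton_python_backend_utils as pb_utils

import sglang as sgl


@sgl.function
def answer(s, question):
    s += question
    s += sgl.gen("response", max_tokens=256)


class TritonPythonModel:
    def initialize(self, args):
        # Start the SGLang runtime once per model instance.
        self.runtime = sgl.Runtime(model_path="mistralai/Mistral-7B-Instruct-v0.2")
        sgl.set_default_backend(self.runtime)

    def execute(self, requests):
        responses = []
        for request in requests:
            # "text_input" / "text_output" must match whatever config.pbtxt declares.
            text = pb_utils.get_input_tensor_by_name(request, "text_input")
            question = text.as_numpy()[0].decode("utf-8")
            state = answer.run(question=question)
            output = pb_utils.Tensor(
                "text_output",
                np.array([state["response"].encode("utf-8")], dtype=np.object_),
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[output]))
        return responses

    def finalize(self):
        self.runtime.shutdown()

Keeping the runtime in initialize() ties its lifetime to the Triton model instance, so finalize() can shut it down cleanly.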

@amirarsalan90
Contributor Author

amirarsalan90 commented Feb 28, 2024

The tritonserver --model-repository=/path/to/model/repository command halts when I try to run the server inside model.py with sgl.Runtime loading mistralai/Mistral-7B-Instruct-v0.2. I haven't tried models other than mistralai/Mistral-7B-Instruct-v0.2, though.

@merrymercy
Contributor

@amirarsalan90 Thanks for contributing to this! Could you document the files better?

  1. Why is the folder named "1"?
  2. What is the purpose of examples/usage/triton/inference.ipynb? Can it be deleted?
  3. Can you share an example command to query the triton server?

@merrymercy merrymercy self-assigned this Mar 11, 2024
@isaac-vidas
Contributor

@merrymercy this implementation follows Triton's model repository convention. See here for more details. The schema of the API inputs/outputs is specified in config.pbtxt, and the model itself is placed under the 1 folder (the model version directory) next to it.
While this is a working solution, I suspect it mostly runs SGLang as-is behind Triton, where the backend process still has to run independently.

A different approach could be to run SGLang as a Triton backend, similar to what vLLM does here. I suspect that would be a bit more involved, given how SGLang creates the backend processes as part of the server, but I haven't looked into it too closely.

@amirarsalan90
Contributor Author

@merrymercy as @isaac-vidas explained, that is the model repository directory convention for Triton Inference Server. I removed the inference.ipynb notebook and added a curl request to the README file for querying the Triton server.
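
As an example, a query along these lines works against Triton's standard HTTP inference endpoint. This is a sketch in Python rather than curl, and it assumes the model is registered as sglang with BYTES tensors text_input / text_output on the default port 8000; the actual names and port come from the README and config.pbtxt.

import requests

# Assumptions: Triton's HTTP endpoint on localhost:8000, a model named
# "sglang", and BYTES tensors "text_input" / "text_output" declared in
# config.pbtxt (names illustrative).
payload = {
    "inputs": [
        {
            "name": "text_input",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["What is the capital of France?"],
        }
    ]
}

resp = requests.post("http://localhost:8000/v2/models/sglang/infer", json=payload)
resp.raise_for_status()
print(resp.json()["outputs"][0]["data"][0])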

As far as I understand, the vLLM backend for Triton Inference Server also disables some Triton features (like batching) and has some limitations:
https://github.com/triton-inference-server/vllm_backend/blob/c1c88fa7dfbebcd3198ada913e127304d5ff0b46/src/model.py#L93

https://github.com/triton-inference-server/tutorials/blob/main/Quick_Deploy/vLLM/README.md

But I agree this is a very minimal and basic way to set up Triton for SGLang. I needed it for a project of mine and thought it might be helpful for others too.

@merrymercy merrymercy merged commit eb4308c into sgl-project:main Mar 12, 2024
@merrymercy
Contributor

@amirarsalan90 It is merged. Thanks!
