Yi-VL Model #112
Conversation
@BabyChouSr I ran `python test_openai_server.py --test-image`,
Looks good to me!
- Could you also convert the 34B version?
- Could you upload the conversion script?
- Could you add an example similar to this one https://github.com/sgl-project/sglang/blob/main/examples/quick_start/srt_example_llava.py, but for Yi-VL?
Try the example in srt_example_llava.py: `state = image_qa.run(image_path="images/cat.jpeg", question="What is this?", max_new_tokens=64)`, then
I don't know the placeholder for the Yi models. It can be different, but the key part is most probably that the respective token is missing, since the error is caused by
I believe that the sglang frontend language (using the example from
If so, why does "64002 is not in list" happen?
Are you using the following for your runtime?
I used the Runtime endpoint. It failed with the same error using the CLI above.
@paulcx I posted some new changes, try running:
Will try later. BTW, I found that the Yi-VL architecture and model class were not found in the model runner from the HF config alone, so I hard-coded them for now.
It works. Also works with:
@BabyChouSr Ready to be merged?
Yup!
It looks like this PR has been merged into SGLang v0.1.11.
What are your model_path and tokenizer_path? One reason I can think of is that the 34B LLaVA uses Yi-Chat as the language model, so its image token index is 64002, while vicuna-based language models have an image token index of 32000, causing the mismatch.
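The mismatch can be illustrated with a small sketch. The token indices 64002 and 32000 come from the discussion above; the `input_ids` list is an invented example, not output from a real tokenizer:

```python
# Sketch: why "64002 is not in list" can occur when a prompt was tokenized
# with a template for the wrong model family.
YI_IMAGE_TOKEN_INDEX = 64002      # Yi-Chat-based models (per the discussion)
VICUNA_IMAGE_TOKEN_INDEX = 32000  # vicuna-based models

# Hypothetical prompt token ids built with a vicuna-style template:
input_ids = [1, 3148, VICUNA_IMAGE_TOKEN_INDEX, 1724, 338, 445, 29973]

# A Yi-based model runner then searches for its own image token and fails:
try:
    offset = input_ids.index(YI_IMAGE_TOKEN_INDEX)
except ValueError as err:
    print(err)  # -> 64002 is not in list
```

The fix is to make sure the chat template and the model agree on which placeholder token marks the image position.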
@loveunk llava-1.6-34B is not the same as the Yi-VL in this PR, although they use the same base model, Yi-34B. If you use the chat template at sglang/python/sglang/lang/chat_template.py, lines 171 to 172 (commit ee1df26)
For now, you can follow https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#Demo to use LLaVA 34B with SGLang. They handle the chat template correctly in their interface.
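To make the chat-template point concrete, here is a minimal sketch of how the wrong template can drop the image placeholder from the prompt entirely. Both template strings and the `build_prompt` helper are invented for illustration; they are not the actual SGLang or LLaVA templates:

```python
# Hypothetical templates: one inserts an image placeholder, one is text-only.
def build_prompt(template: str, question: str) -> str:
    """Fill a chat template with the user's question (illustrative only)."""
    return template.format(question=question)

llava_style = "USER: <image>\n{question} ASSISTANT:"
text_only = "USER: {question} ASSISTANT:"

with_image = build_prompt(llava_style, "What is this?")
without_image = build_prompt(text_only, "What is this?")

print("<image>" in with_image)     # True
print("<image>" in without_image)  # False
```

With the second template, the tokenizer never emits the image token at all, which is exactly the situation where a later `index(...)` lookup for that token fails.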
sglang==0.1.12, vllm==0.3.0
Adding support for the Yi-VL Model: https://huggingface.co/01-ai/Yi-VL-6B
Note: since the original repo's format is not very friendly, I moved the files and created my own config, which makes the model more compatible with the SGLang codebase. This allows us to load the model, tokenizer, and processor without much code change.
To test, simply call:
Link to huggingface repo compatible with this commit:
6B model: https://huggingface.co/BabyChou/Yi-VL-6B
34B model: https://huggingface.co/BabyChou/Yi-VL-34B