
Yi-VL Model #112

Merged: 12 commits, Feb 1, 2024
Conversation

@BabyChouSr (Collaborator) commented Jan 28, 2024

Adding support for the Yi-VL Model: https://huggingface.co/01-ai/Yi-VL-6B

Note: since the original repo's format is not very friendly, I moved the files and created my own config that makes them more compatible with the SGLang codebase. This lets us load the model, tokenizer, and processor without much code change.

To test, simply call:

runtime = sgl.Runtime(model_path="BabyChou/Yi-VL-6B",
                      tokenizer_path="BabyChou/Yi-VL-6B")

Link to huggingface repo compatible with this commit:
6B model: https://huggingface.co/BabyChou/Yi-VL-6B
34B model: https://huggingface.co/BabyChou/Yi-VL-34B

@merrymercy merrymercy mentioned this pull request Jan 30, 2024
@BabyChouSr BabyChouSr marked this pull request as ready for review January 30, 2024 18:44
@BabyChouSr BabyChouSr changed the title [WIP] Yi-VL Model Yi-VL Model Jan 30, 2024
@exceedzhang (Contributor) commented:
@BabyChouSr I ran python test_openai_server.py --test-image, and test_chat_completion_image(args) failed with the following error:
[error screenshots omitted]
Server error:
[screenshot omitted]

@merrymercy (Contributor) left a review:

Looks good to me!

  1. Could you also convert the 34B version?
  2. Could you upload the conversion script?
  3. Could you add an example similar to this one https://github.com/sgl-project/sglang/blob/main/examples/quick_start/srt_example_llava.py, but for Yi-VL?

Review comment on python/sglang/srt/utils.py (outdated, resolved)
@paulcx commented Jan 31, 2024

Trying the example in srt_example_llava.py:

state = image_qa.run(image_path="images/cat.jpeg", question="What is this?", max_new_tokens=64)

then:

    File "xx/sglang/python/sglang/srt/models/llava.py", line 63, in pad_input_ids
        offset = input_ids.index(self.config.image_token_index)
ValueError: 64002 is not in list

@aliozts commented Jan 31, 2024

@paulcx I had the same issue; it's because of the prompting. From my understanding, you need to include the <image> token in the prompt. I'd suggest referring to #41.

@paulcx commented Jan 31, 2024

> @paulcx I had the same issue, it's because of the prompting, you need to include the <image> from my understanding. I'd suggest referring to #41

Should it be <image_holder>?

@aliozts commented Jan 31, 2024

I don't know the placeholder for Yi models; it may be different. But the key point is that the corresponding token is most likely missing: the error comes from input_ids.index(self.config.image_token_index), which means the text form of token 64002 never appears in the prompt.
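The failure mode can be sketched in plain Python. This is an illustration only: the token id 64002 comes from the traceback above, and find_image_offset is a hypothetical helper mirroring the list.index lookup in llava.py's pad_input_ids, not actual sglang code.

```python
# Hypothetical sketch; IMAGE_TOKEN_INDEX mirrors config.image_token_index
# (64002 per the traceback above).
IMAGE_TOKEN_INDEX = 64002

def find_image_offset(input_ids):
    # list.index raises ValueError when the token id is absent, which is
    # exactly the "64002 is not in list" error seen above.
    if IMAGE_TOKEN_INDEX not in input_ids:
        raise ValueError(
            f"{IMAGE_TOKEN_INDEX} is not in list: the prompt text "
            "never contained the image placeholder token"
        )
    return input_ids.index(IMAGE_TOKEN_INDEX)

# A prompt tokenized with the placeholder present yields its offset:
print(find_image_offset([1, 123, 64002, 456]))  # -> 2
```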

@BabyChouSr (Collaborator, Author) commented:
> > @paulcx I had the same issue, it's because of the prompting, you need to include the <image> from my understanding. I'd suggest referring to #41
>
> Should it be <image_holder>?

I believe the sglang frontend language (using the example from srt_example_llava, but swapping llava for the Yi-VL model) will automatically add the image token, which is <image_placeholder>.
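Roughly, that auto-insertion amounts to splicing the placeholder string into the rendered prompt. The sketch below is illustrative; render_prompt is not part of sglang, and the template string is taken from the curl example later in this thread.

```python
# Illustrative only: when the chat template's image token is
# "<image_placeholder>", the frontend splices it into the rendered
# prompt so the tokenizer can later map it to the image token id.
def render_prompt(question, image_token="<image_placeholder>"):
    return f"### Human: {image_token}\n{question}\n### Assistant:"

prompt = render_prompt("What is this?")
# The placeholder must appear in the text; otherwise the later
# input_ids.index(image_token_index) lookup fails.
```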

@paulcx commented Jan 31, 2024

> > > @paulcx I had the same issue, it's because of the prompting, you need to include the <image> from my understanding. I'd suggest referring to #41
> >
> > Should it be <image_holder>?
>
> I believe that the sglang frontend language (using the example from srt_example_llava but changing llava for yi-vl model) will automatically add the image token which is <image_placeholder>

If so, why does "64002 is not in list" happen?

@BabyChouSr (Collaborator, Author) commented:
> > > > @paulcx I had the same issue, it's because of the prompting, you need to include the <image> from my understanding. I'd suggest referring to #41
> > >
> > > Should it be <image_holder>?
> >
> > I believe that the sglang frontend language (using the example from srt_example_llava but changing llava for yi-vl model) will automatically add the image token which is <image_placeholder>
>
> If so, why does "64002 is not in list" happen?

Are you using the following for your runtime?

runtime = sgl.Runtime(model_path="BabyChou/Yi-VL-6B",
                      tokenizer_path="BabyChou/Yi-VL-6B")

@paulcx commented Jan 31, 2024

> > > > > @paulcx I had the same issue, it's because of the prompting, you need to include the <image> from my understanding. I'd suggest referring to #41
> > > >
> > > > Should it be <image_holder>?
> > >
> > > I believe that the sglang frontend language (using the example from srt_example_llava but changing llava for yi-vl model) will automatically add the image token which is <image_placeholder>
> >
> > If so, why does "64002 is not in list" happen?
>
> Are you using the following for your runtime?
>
>     runtime = sgl.Runtime(model_path="BabyChou/Yi-VL-6B",
>                           tokenizer_path="BabyChou/Yi-VL-6B")

I used the runtime endpoint. It failed with the same error when using the code above.

@BabyChouSr (Collaborator, Author) commented:
@paulcx I posted some new changes, try running:

python3 srt_example_yi_vl.py

@paulcx commented Jan 31, 2024

> @paulcx I posted some new changes, try running:
>
>     python3 srt_example_yi_vl.py

Will try later. By the way, I found that the yi-vl architecture and model class were not found by the model runner from the HF config alone, so I hard-coded it for now.

@paulcx commented Feb 1, 2024

> @paulcx I posted some new changes, try running:
>
>     python3 srt_example_yi_vl.py

It works.

Also works with:

curl http://localhost/generate -H "Content-Type: application/json" -d '{"text": "### Human: <image_placeholder>\n图片里有什么?\n### Assistant:", "image_data": "xxxx", "sampling_params": {"max_new_tokens": 64, "temperature": 0, "stop": "### "}}'
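The same request can be issued from Python. build_generate_payload below is a hypothetical helper that just mirrors the JSON body of the curl call; the base64 image payload stays elided as "xxxx", exactly as above.

```python
# Hypothetical helper mirroring the curl body above; POST the result to
# the server's /generate endpoint, e.g. requests.post(url, json=payload).
def build_generate_payload(prompt, image_data, max_new_tokens=64):
    return {
        "text": prompt,
        "image_data": image_data,  # base64-encoded image, elided here
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": 0,
            "stop": "### ",
        },
    }

payload = build_generate_payload(
    "### Human: <image_placeholder>\nWhat is in the picture?\n### Assistant:",
    "xxxx",
)
```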

@merrymercy (Contributor) left a review:

@BabyChouSr Ready to be merged?

@BabyChouSr (Collaborator, Author) commented:
> @BabyChouSr Ready to be merged?

Yup!

@merrymercy merrymercy merged commit 8644253 into main Feb 1, 2024
@merrymercy merrymercy deleted the yi-vl branch February 1, 2024 16:33
@loveunk commented Feb 6, 2024

It looks like this PR has been merged into SGLang v0.1.11. However, I still encountered the ValueError: 64002 is not in list issue while performing inference with llava-1.6-34B and SGLang v0.1.11.

@BabyChouSr (Collaborator, Author) commented:
What are your model_path and tokenizer_path? One reason I can think of: the 34B LLava uses Yi-Chat as the language model, so its image token index is 64002, while vicuna-based language models have an image token index of 32000, causing the mismatch.
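The mismatch can be illustrated with the two ids mentioned above (both taken from this comment). image_offset below mimics the list.index lookup in llava.py; it is a sketch, not actual sglang code.

```python
# Illustrative: the image placeholder maps to different token ids
# depending on the base language model's vocabulary.
IMAGE_TOKEN_INDEX = {"yi-chat": 64002, "vicuna": 32000}

def image_offset(input_ids, base_lm):
    # Same list.index lookup as llava.py's pad_input_ids.
    return input_ids.index(IMAGE_TOKEN_INDEX[base_lm])

vicuna_ids = [1, 10, 32000, 20]          # prompt from a vicuna tokenizer
print(image_offset(vicuna_ids, "vicuna"))    # -> 2
# image_offset(vicuna_ids, "yi-chat")        # ValueError: 64002 is not in list
```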

@merrymercy (Contributor) commented Feb 6, 2024

@loveunk llava-1.6-34B is not the same as the Yi-VL in this PR, although they use the same base model, Yi-34B.

If you use sgl.user/sgl.assistant, our current chat templates only correctly handle Yi-VL and Llava-1.5. For Llava 1.6, we need some additional handling. We can fix it soon, or you can help us fix it. The related code is:

    if "llava" in model_path.lower():
        return get_chat_template("vicuna_v1.1")

For now, you can follow https://github.com/haotian-liu/LLaVA?tab=readme-ov-file#Demo to use llava 34B with SGLang. They handle the chat template correctly in their interface.

@Lzhang-hub commented:

sglang==0.1.12, vllm==0.3.0

Running python3 srt_example_yi_vl.py with Yi-VL-34B and tp_size=2 gives this error:

Traceback (most recent call last):
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/rpyc/core/protocol.py", line 369, in _dispatch_request
    res = self._HANDLERS[handler](self, *args)
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/rpyc/core/protocol.py", line 863, in _handle_call
    return obj(*args, **dict(kwargs))
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/sglang/srt/managers/router/model_rpc.py", line 62, in exposed_init_model
    self.model_runner = ModelRunner(
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/sglang/srt/managers/router/model_runner.py", line 275, in __init__
    self.load_model()
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/sglang/srt/managers/router/model_runner.py", line 308, in load_model
    model.load_weights(
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/sglang/srt/models/yivl.py", line 85, in load_weights
    self.language_model.load_weights(
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/sglang/srt/models/llama2.py", line 320, in load_weights
    weight_loader(param, loaded_weight)
  File "/home/work/l20/envs/sglang-env/lib/python3.10/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 89, in weight_loader
    assert loaded_weight.shape[parallel_dim] == self.org_vocab_size
AssertionError
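A simplified sketch of the assertion that fires (the real check lives in vllm's VocabParallelEmbedding.weight_loader; the numbers below are illustrative, not taken from either checkpoint):

```python
# Simplified version of the failing check: the loader asserts that the
# checkpoint's vocab dimension equals the configured org_vocab_size.
# A checkpoint whose embedding was padded or resized (e.g. for extra
# image tokens) trips this assertion when loaded with tensor parallelism.
def check_vocab_dim(loaded_vocab_rows, org_vocab_size):
    assert loaded_vocab_rows == org_vocab_size, (
        f"checkpoint vocab rows {loaded_vocab_rows} != "
        f"configured org_vocab_size {org_vocab_size}"
    )

check_vocab_dim(64000, 64000)      # passes
# check_vocab_dim(64064, 64000)    # AssertionError, as in the traceback above
```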
