Does the llava-next-video deployment only focus on the first frame? #510
Comments
I'm having a similar problem to you.
Indeed, the first version of our code patch has the issue mentioned above. We will send a new PR, along with our new models, to fix it. Sorry to keep you waiting.
I'm trying to deploy llava-next-video with sglang, and it runs successfully. However, I find that it only focuses on the first frame of the input: if I input 10 frames and ask the model to describe them, the generation only contains information from the first frame. Does anyone know what happened? Thanks~
Also, where can I print the input tokens for the model? I want to check whether all frames are passed to the model.
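For the token check, here is a rough sketch of one possible sanity test. It is not tied to sglang's internals: it assumes you can get the final token id list as a plain Python list, and that each frame expands into a fixed number of copies of a single placeholder id. Both are assumptions that may not hold for a given implementation. The names `placeholder_id`, `tokens_per_frame`, and both helper functions are hypothetical.

```python
from itertools import groupby


def placeholder_run_lengths(input_ids, placeholder_id):
    """Return the length of each contiguous run of placeholder_id in input_ids."""
    return [
        sum(1 for _ in group)
        for token, group in groupby(input_ids)
        if token == placeholder_id
    ]


def frames_look_complete(input_ids, placeholder_id, expected_frames, tokens_per_frame):
    """Check that the total number of placeholder tokens matches the frame count.

    Assumption (hypothetical): every frame contributes exactly
    tokens_per_frame placeholder tokens to input_ids.
    """
    total = sum(placeholder_run_lengths(input_ids, placeholder_id))
    return total == expected_frames * tokens_per_frame


# Toy example: id 5 stands in for the (hypothetical) image placeholder token.
ids = [1, 5, 5, 5, 5, 2, 5, 5, 3]
print(placeholder_run_lengths(ids, 5))
print(frames_look_complete(ids, 5, expected_frames=3, tokens_per_frame=2))
```

If the total is far smaller than expected for 10 frames, that would suggest only some of the frames reach the model, though the real placeholder id and per-frame token count have to come from the model's own configuration.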