Performance on MVBench #4
Hi @NIneeeeeem, I appreciate your interest in our work. Please share the exact steps you followed to reproduce our results: for example, which model weights are you using, and what commands are you running for inference and evaluation? Further, note that the scripts provided in our repository are not tested for batch sizes > 1 and are not guaranteed to work properly in that setting. I would highly recommend keeping the batch size at 1.
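Since the evaluation scripts are only validated for batch size 1, one way to stay safe is to batch explicitly and keep the batch size pinned to 1. This is a minimal, generic sketch (not code from the VideoGPT+ repository) of such a batching helper:

```python
def batched(items, batch_size=1):
    """Yield successive batches from an iterable.

    With batch_size=1 (the recommended setting here), each yielded
    batch contains exactly one sample, i.e. per-sample inference.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any final partial batch
        yield batch
```

With `batch_size=1`, `list(batched(range(3)))` yields `[[0], [1], [2]]`, so each inference call sees a single sample regardless of how the surrounding loop is written.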
Hi, here is the command:
Hi @NIneeeeeem, thank you for providing the inference command you are using. Please note that our experiments use the 4K context model. Please try replacing the 128K context model with the 4K context model; this should solve the issue. Good luck!
Hi @NIneeeeeem, these design choices were selected to optimize training time and performance on both benchmarks.
@mmaaz60 Thank you for your reply. I think I didn't state my question clearly. Datasets like SSV2 are much larger than the subsets used: SSV2 contains 220,847 videos, of which 168,913 samples form the training set, yet only 40,000 were selected for the instruction-tuning dataset in VideoGPT-plus. I am curious about the basis for this selection.
Hi @NIneeeeeem, thanks for the clarification. We follow the splits proposed in MVBench for training VideoChat2. I hope this answers your question.
Thank you, my issue has been resolved.
Thanks for open-sourcing this exciting work!
I reproduced the performance on MVBench with a single GPU, but three experiments did not achieve the results reported in the paper (best results below). Were any other weights used in the tests? Also, I noticed that setting batch_size_per_gpu=2 drastically degrades the performance, even though there is no OOM.
All Acc: [67.5, 58.0, 80.3, 48.5, 56.5, 86.5, 73.87, 37.0, 30.0, 30.5, 85.0, 38.0, 65.0, 82.5, 42.5, 51.0, 48.0, 31.0, 40.0, 56.0]% Total Acc: 55.37%
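As a sanity check, the reported total can be approximated from the per-task accuracies above. The unweighted mean comes out slightly higher than the printed 55.37%, which is consistent with MVBench tasks having slightly different sample counts (the total is presumably a per-sample, i.e. weighted, average — an assumption, not something stated in the thread):

```python
# Per-task MVBench accuracies from the run above (full precision).
task_accs = [67.5, 58.0, 80.3030303030303, 48.5, 56.5, 86.5,
             73.86934673366834, 37.0, 30.0, 30.5, 85.0, 38.0,
             65.0, 82.5, 42.5, 51.0, 48.0, 31.0, 40.0, 56.0]

# Unweighted mean over the 20 tasks; differs from the reported
# 55.37% by ~0.01 points because a few tasks have fewer samples.
mean_acc = sum(task_accs) / len(task_accs)
print(f"Unweighted mean: {mean_acc:.2f}%")  # ~55.38%
```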