Tokenization mismatch in LLaVA-LLaMA-3 #14
Hi @Luo-Z13, thank you for your interest in our work. Could you please confirm which base LLM you are using? This issue arises when either you are using the wrong base LLM or have set a different value for `--version`. Let me know if the issue persists. Thank you.
Thank you for the reminder, I have found the reason: LLaMA-3 has at some point updated the …
Hi @Luo-Z13,

The pip dependency warnings can be ignored. The `TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType` occurs during LLaMA-3 based model training. LLaMA-3 does not use any pad token, but LLaVA-LLaMA-3 training needs one, so the workaround is to add a special token and resize the embeddings. This is done in `LLaVA-pp/LLaMA-3-V/train.py`, line 1015 in b93d9c8.
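For reference, here is a minimal sketch of that pattern with Hugging Face transformers. The hub id is an assumption (substitute your actual checkpoint), and the repo's train.py may additionally re-initialize the new embedding row (e.g. with the mean of the existing embeddings) rather than keeping the default initialization:

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed hub id -- substitute the LLaMA-3 checkpoint you actually use.
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Stock LLaMA-3 ships without a pad token, so pad_token_id is None and
# pad_sequence(..., padding_value=None) raises the TypeError above.
assert tokenizer.pad_token_id is None

# Workaround: register a pad token as a special token, then resize the
# embeddings so the new token id has a row in the embedding matrix.
tokenizer.add_special_tokens({"pad_token": "<pad>"})
model.resize_token_embeddings(len(tokenizer))

# Batching variable-length sequences now works.
batch = [torch.tensor(tokenizer.encode(t)) for t in ["hello", "hello world"]]
padded = pad_sequence(batch, batch_first=True,
                      padding_value=tokenizer.pad_token_id)
```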
Please make sure that the baseline official LLaVA code is working properly, and then copy all of the LLaMA-3 related files into the corresponding directory. Lastly, please note that to run LLaMA-3 based training you need to pass `--version llama3`. I hope this helps and solves the issue. Good luck.
Originally posted by @mmaaz60 in #8 (comment)
Thank you very much, my previously reported `TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType` issue has been resolved after copying the correct train.py file. Thanks for your advice on that matter.

However, I still encounter the tokenization mismatch issue during training. My current environment: …

And the beginning of the training output is as follows: …
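Given that an earlier failure in this thread was traced to an upstream update of the LLaMA-3 repo, one thing worth ruling out is a stale local tokenizer cache. A minimal check, assuming the `meta-llama/Meta-Llama-3-8B-Instruct` hub id (substitute whichever base model your training points at):

```python
from transformers import AutoTokenizer

# Assumed hub id -- substitute the LLaMA-3 checkpoint your training uses.
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# force_download=True re-fetches the tokenizer files from the Hub, replacing
# a locally cached copy that may predate an upstream update to the repo.
tokenizer = AutoTokenizer.from_pretrained(model_id, force_download=True)

# Quick sanity checks on the refreshed tokenizer.
print(len(tokenizer))       # LLaMA-3 uses a 128,256-token vocabulary
print(tokenizer.pad_token)  # None for stock LLaMA-3; train.py adds one
```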