Support for Llama 3.1 model #36
We are supporting Llama 3.1; please be patient, thanks~
LLM-TPU/models/Llama3_1/compile/export_onnx.py does not exist (according to the documentation it should). After running pip install --upgrade transformers to version 4.44.0, copying the script over from Llama3 and running it produces an error. In the interim, can you make the bmodel available? Currently the download says file not found.
Are there instructions specific to creating a bmodel from ONNX for Llama 3.1 (not Llama3)? Running this errors out:
python export_onnx.py --model_path ../../../../Meta-Llama-3.1-8B-Instruct/ --seq_length 1024
Convert block & block_cache
0%| | 0/32 [00:00<?, ?it/s]
The attention layers in this model are transitioning from computing the RoPE embeddings internally through position_ids (2D tensor with the indexes of the tokens), to using externally computed position_embeddings (Tuple of tensors, containing cos and sin). In v4.45 position_ids will be removed and position_embeddings will be mandatory.