
Why does GPT-J perform better on Graviton without SIMD than on x86 with SIMD? #520

Open
xshen053 opened this issue Sep 13, 2023 · 0 comments

@xshen053

I ran `gpt-j -t 1 -m ../build/models/gpt-j-6B/ggml-model.bin -p "This is an example"` on both a c6i.8xlarge and a c7g.8xlarge instance.

Graviton (c7g.8xlarge)

By default, GGML_SIMD is not enabled on Graviton. With GGML_SIMD disabled, I got these results (a compile-time check for what the toolchain enables is sketched after the numbers below):

Threads: 4  | Average ms/token: 298.39
Threads: 16 | Average ms/token: 80.54
Threads: 32 | Average ms/token: 59.32

With GGML_SIMD enabled:

Threads: 4  | Average ms/token: 131.38
Threads: 16 | Average ms/token: 63.96
Threads: 32 | Average ms/token: 54.81
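For reference, whether ggml compiles its GGML_SIMD paths generally follows the vector-extension macros the compiler predefines for the target (that mapping is an assumption here; the exact conditions live in ggml.c). A minimal probe like the one below, built with the same compiler flags as the gpt-j binary, shows what each instance's toolchain enables by default:

```c
/* simd_probe.c -- minimal sketch: print which SIMD-related predefined
 * macros the compiler sets for this target. Assumption: ggml keys its
 * GGML_SIMD paths off macros like these (see ggml.c for the real checks).
 *
 * Build with the same -march/-mcpu flags as the gpt-j build, e.g.:
 *   cc -O3 simd_probe.c -o simd_probe && ./simd_probe
 */
#include <stdio.h>

int main(void) {
#if defined(__ARM_NEON)
    printf("__ARM_NEON defined (Arm NEON available)\n");
#endif
#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
    printf("__ARM_FEATURE_FP16_VECTOR_ARITHMETIC defined (fp16 NEON arithmetic)\n");
#endif
#if defined(__AVX2__)
    printf("__AVX2__ defined (x86 AVX2 available)\n");
#endif
#if defined(__AVX512F__)
    printf("__AVX512F__ defined (x86 AVX-512 available)\n");
#endif
#if defined(__FMA__)
    printf("__FMA__ defined (x86 fused multiply-add available)\n");
#endif
    return 0;
}
```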

x86 Intel (c6i.8xlarge)

On the Intel x86 instance, I got these results (GGML_SIMD enabled):

Threads: 4  | Average ms/token: 270.19
Threads: 16 | Average ms/token: 105.40
Threads: 32 | Average ms/token: 97.06

Why is Intel with SIMD not as fast as Arm without SIMD?
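For context on what the flag changes: with GGML_SIMD enabled on AArch64, ggml's hot dot-product loops use NEON intrinsics instead of plain scalar code. The sketch below only illustrates that scalar-vs-NEON difference; it is not ggml's actual kernel, and the function names are made up for the example:

```c
/* dot_sketch.c -- illustration only, not ggml's real kernels: the kind of
 * inner loop that GGML_SIMD vectorizes on Arm.
 * Build on the Graviton box:  cc -O3 dot_sketch.c -o dot_sketch
 */
#include <stdio.h>
#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Plain scalar dot product: roughly what runs when GGML_SIMD is disabled. */
static float dot_scalar(const float *a, const float *b, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; i++) s += a[i] * b[i];
    return s;
}

#if defined(__ARM_NEON) && defined(__aarch64__)
/* NEON dot product: 4 floats per fused multiply-add, plus a scalar tail. */
static float dot_neon(const float *a, const float *b, int n) {
    float32x4_t acc = vdupq_n_f32(0.0f);
    int i = 0;
    for (; i + 4 <= n; i += 4)
        acc = vfmaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    float s = vaddvq_f32(acc);            /* horizontal sum of the 4 lanes */
    for (; i < n; i++) s += a[i] * b[i];  /* leftover elements */
    return s;
}
#endif

int main(void) {
    enum { N = 1024 };
    float a[N], b[N];
    for (int i = 0; i < N; i++) { a[i] = 0.001f * i; b[i] = 0.002f * i; }
    printf("scalar: %f\n", dot_scalar(a, b, N));
#if defined(__ARM_NEON) && defined(__aarch64__)
    printf("neon:   %f\n", dot_neon(a, b, N));
#endif
    return 0;
}
```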
