Apple Silicon GPU thread count correct? #257
Comments
Wow. As an Apple fan without a strong CS background, I hope master Georgi can work on more projects that let Apple users run an M2 Ultra (maybe with 192 GB) to tap into the potential of real LLMs (over 100B parameters). Is that possible in the near future? Thanks to all the masterminds!
I've read with extreme interest the efforts to use Apple Silicon GPUs with ggml. However, I think there might be a misunderstanding. I hope it's me, but if it's not, there's a HUGE performance gain possible.
Several of the ggml benchmarks floating around seem to indicate 1 thread per GPU core, which is nice and all, but we can do orders of magnitude better:
Wikipedia: Apple M2 GPU
Which, if I understand correctly, means we should be able to launch 128 threads per GPU core. And with Apple Silicon's unified memory, they'll all have direct access to all of system RAM.
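To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It only multiplies the 128-threads-per-core figure from above by a core count. The function name and the core counts in the loop are my own illustrative assumptions, and it says nothing about how Metal or ggml actually schedule work.

```python
# Back-of-the-envelope estimate only; not a measurement of real hardware.
THREADS_PER_CORE = 128  # figure taken from this issue, assumed rather than verified


def total_gpu_threads(gpu_cores: int, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Return gpu_cores * threads_per_core, rejecting non-positive inputs."""
    if gpu_cores <= 0 or threads_per_core <= 0:
        raise ValueError("core and thread counts must be positive")
    return gpu_cores * threads_per_core


if __name__ == "__main__":
    # Hypothetical core counts, chosen only to illustrate the scaling.
    for cores in (8, 10):
        print(f"{cores} cores x {THREADS_PER_CORE} = {total_gpu_threads(cores)} threads")
```

If the one-thread-per-core reading of the benchmarks is right, the gap between `gpu_cores` and `total_gpu_threads(gpu_cores)` is the headroom I'm asking about.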
Am I missing something here? I'm really conflicted... I want to be right, but I highly doubt this was missed.
Cc: @ggerganov