Increase MaxCommandBufferCount in ops_metal to 1024 #1664

Merged 1 commit on Aug 24, 2023


chenyuxyz
Collaborator

This should fix the Metal blocking issue. From the Apple documentation:
"Each command queue has a fixed number of command buffers for its lifetime (see makeCommandQueue(maxCommandBufferCount:)). This method blocks the calling CPU thread when the queue doesn’t have any free command buffers, and returns after the GPU finishes executing one."
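
To illustrate the blocking behavior described in the quote, here is a toy model in plain Python. It is not Metal code: `ToyCommandQueue`, `make_and_commit`, `wait_until_completed`, and `time_enqueue` are hypothetical names, the "GPU" is just a thread that sleeps, and any timings it produces are only illustrative.

```python
import queue
import threading
import time


class ToyCommandQueue:
    """Toy stand-in for a command queue with a fixed pool of command buffers."""

    def __init__(self, max_command_buffers, gpu_time_per_buffer):
        # One semaphore slot per command buffer the queue may have in flight.
        self._free = threading.BoundedSemaphore(max_command_buffers)
        self._submitted = queue.Queue()
        self._gpu_time = gpu_time_per_buffer
        threading.Thread(target=self._gpu_loop, daemon=True).start()

    def _gpu_loop(self):
        # Simulated GPU: retire submitted buffers one at a time.
        while True:
            self._submitted.get()
            time.sleep(self._gpu_time)
            self._free.release()  # the buffer goes back to the pool
            self._submitted.task_done()

    def make_and_commit(self):
        # Blocks the calling thread when no command buffer is free.
        self._free.acquire()
        self._submitted.put(None)

    def wait_until_completed(self):
        # Analogous to a final sync: wait for all submitted work to finish.
        self._submitted.join()


def time_enqueue(max_command_buffers, n_buffers=32, gpu_time=0.002):
    """Return the seconds the CPU thread spends enqueuing n_buffers."""
    q = ToyCommandQueue(max_command_buffers, gpu_time)
    start = time.perf_counter()
    for _ in range(n_buffers):
        q.make_and_commit()
    enqueue_seconds = time.perf_counter() - start
    q.wait_until_completed()
    return enqueue_seconds


if __name__ == "__main__":
    print("pool of 2:   ", time_enqueue(2))
    print("pool of 1024:", time_enqueue(1024))
```

With a pool of 2, the enqueue loop has to wait for the simulated GPU on nearly every buffer. With a pool of 1024, the loop returns right away and the waiting moves to the final sync.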

The default newCommandQueue sets this limit to 64 (per the docs). This PR raises it to 1024. I could not find any guidance on an upper bound for this value; a Stack Overflow comment claims RealityKit uses 1024. On my M1 Max, 1024 works and is enough for Stable Diffusion and LLaMA.
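
For reference, a minimal sketch of what requesting a larger limit looks like through PyObjC. The exact line changed in ops_metal.py is not shown on this page, so treat this as an illustration only: `make_metal_queue` and `default_metal_queue` are hypothetical helpers, and the sketch assumes macOS with the PyObjC Metal bindings installed. The method name follows PyObjC's mapping of the Objective-C selector `newCommandQueueWithMaxCommandBufferCount:`.

```python
def make_metal_queue(device, max_command_buffers=1024):
    """Create a command queue with an explicit command-buffer limit.

    `device` is expected to be an MTLDevice obtained through PyObjC.
    Calling newCommandQueue() instead would use the default limit.
    """
    return device.newCommandQueueWithMaxCommandBufferCount_(max_command_buffers)


def default_metal_queue():
    # Assumption: running on macOS with the PyObjC Metal bindings available.
    import Metal

    return make_metal_queue(Metal.MTLCreateSystemDefaultDevice())
```

The limit is fixed for the lifetime of the queue, so it has to be chosen when the queue is created.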

JIT=1 python -O examples/llama.py --prompt "Hello." --count 10 --temperature=0 --timing
using METAL backend
using LLaMA-7B model
ram used: 13.48 GB, freqs_cis                                         : 100%|█| 292/292 [00:01<00:00, 153.
loaded weights in 1915.31 ms, 13.48 GB loaded at 7.04 GB/s
Hello.
ran model in 5807.81 ms
sync in 187.28 ms
 I
ran model in 217.31 ms
sync in 7.55 ms
'
ran model in 145.67 ms
sync in 4.16 ms
m
ran model in 26.82 ms
sync in 61.14 ms
 a
ran model in 21.98 ms
sync in 64.84 ms

ran model in 20.41 ms
sync in 66.90 ms
2
ran model in 19.68 ms
sync in 68.31 ms
0
ran model in 19.92 ms
sync in 68.15 ms
 year
ran model in 20.21 ms
sync in 68.86 ms
 old
ran model in 19.64 ms
sync in 69.37 ms
 male

@tinyb0t

tinyb0t commented Aug 24, 2023

Changes made in tinygrad/:

------------------------------------------------------------
files                             insertions       deletions
------------------------------------------------------------
tinygrad/runtime/ops_metal.py              1               1
------------------------------------------------------------
lines added in the tinygrad folder: 0

@geohot
Collaborator

geohot commented Aug 24, 2023

Nice find!

@geohot geohot merged commit f00325e into tinygrad:master Aug 24, 2023
13 checks passed
@chenyuxyz chenyuxyz deleted the metal-max-buffer branch September 1, 2023 17:25