Garbage output on Metal on x86-64 mac #660

Open

kanav99 opened this issue Dec 20, 2023 · 0 comments

kanav99 commented Dec 20, 2023

Hi, I get garbage output when I run gpt-2 with the Metal backend.

Here are the steps I took:

cmake -DGGML_METAL=ON -DBUILD_SHARED_LIBS=Off ..
make -j gpt-2-batched
./bin/gpt-2-batched -m models/gpt-2-117M/ggml-model.bin -p "This is an example" -ngl 1 -s 1703042754

Output:

main: seed = 1703042754
gpt2_model_load: loading model from 'models/gpt-2-117M/ggml-model.bin'
gpt2_model_load: n_vocab = 50257
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 1
gpt2_model_load: qntvr   = 0
gpt2_model_load: ggml tensor size    = 384 bytes
gpt2_model_load: backend buffer size = 312.72 MB
gpt2_model_load: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics 655
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics 655
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/<redacted>/ggml/build/bin/ggml-metal.metal'
ggml_metal_init: GPU name:   Intel(R) Iris(TM) Plus Graphics 655
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  =  1610.61 MB
ggml_metal_init: maxTransferRate               = built-in GPU
gpt2_model_load: memory size =   144.00 MB, n_mem = 24576
gpt2_model_load: model size  =   239.08 MB
extract_tests_from_file : No test file found.
test_gpt_tokenizer : 0 tests failed out of 0 tests.
main: compute buffer size: 6.46 MB
main: prompt: 'This is an example'
main: number of tokens in prompt = 4, first 8 tokens: 1212 318 281 1672 

 and related)],, assignment 2013][ ] 2011 assignment]. ] ][nyder ][]] ]RANTterRANTDCter]:RANTode Postedode ].hell"]hell ]hellSBwoodeodeaskingmarthell ]]:batwowoodehell"] ][odeaskaskingCmdhellode].],],ode],woodewoTF ],woaskingRMwowo ‎][]TFodeode ‎Terwo ]ray ];ification"]woodeter ]RANT ]; ][].RMRMwowotask]:RM ‎ ]]. ];wowohellode][].odebat ]bat ], RomneywoRANT ];martray‎>>>> ‎hellode ][RM][].":-odeodeodewohellCmdtaskwoode']wotaskwoRMRM ‎wohell ‎RMRMasksRMtaskhell] ‎ ‎ Posted ‎ ];rayRModeaskingaskingraytaskwo ‎]:wowo":-":-taskaskaskitywo]}raytask ‎ ‎] ][ ]] Posted Posted ‎GTateral ‎gd ][


main:     n_decoded =      199
main:     load time =   343.22 ms
main:   sample time =    83.01 ms
main:  predict time =  3197.49 ms
main:    total time =  3732.56 ms
ggml_metal_free: deallocating

CPU inference works fine.
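For comparison, here is how I get the working CPU run. This assumes `-ngl 0` disables GPU layer offload in this example, as it does elsewhere in ggml; the model path and seed are the same as above:

```shell
# Same build as before, but run with no layers offloaded to Metal,
# so inference stays on the CPU (assuming -ngl 0 means "offload nothing"):
./bin/gpt-2-batched -m models/gpt-2-117M/ggml-model.bin \
    -p "This is an example" -ngl 0 -s 1703042754
```

With that invocation the output is coherent text, which is why I suspect the Metal path on this Intel iGPU specifically.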

I was just playing with ggml and thought opening this issue made sense. Nothing urgent. Thank you!
