metal : wrap each operation in debug group #690

Merged
merged 1 commit into from
Jan 10, 2024

Conversation

jmousseau
Contributor

The screenshot below shows a Metal debug capture of gpt-2-backend with the addition of debug groups.

ggml-metal-debug-group

Owner

@ggerganov ggerganov left a comment

Hm this is very interesting and potentially useful! If you know more ways to improve the metal code for debugging and profiling purposes (e.g. using instruments) - please share.

P.S. This does not have effect on performance in Release builds - correct?

@ggerganov ggerganov merged commit 2f3b12f into ggerganov:master Jan 10, 2024
4 checks passed
@jmousseau
Contributor Author

If you know more ways to improve the metal code for debugging and profiling purposes (e.g. using instruments) - please share.

There are some next steps I'd like to explore:

P.S. This does not have effect on performance in Release builds - correct?

As far as I know, performance should be unaffected in release builds. In my local testing, I didn't see any change in the timings. Would wrapping the capture and debug logic in GGML_METAL_NDEBUG be preferable?

@jmousseau jmousseau deleted the metal-debug-groups branch January 10, 2024 14:49
ggerganov added a commit that referenced this pull request Jan 11, 2024
@ggerganov
Owner

In llama.cpp we've decided to guard the debug calls with GGML_METAL_NDEBUG:

ggerganov/llama.cpp@2a7c94d

Will sync the changes here soon
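
For readers following along, here is a minimal sketch of what guarding per-operation debug groups with GGML_METAL_NDEBUG could look like. It is not the code from either repository: `MockEncoder`, `encode_op`, and the method names are hypothetical stand-ins for a Metal compute command encoder, written in plain C++ so the control flow can be followed without a GPU. Only the macro name comes from this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Metal compute command encoder. It only records
// what was called, so the push/pop pairing can be inspected afterwards.
struct MockEncoder {
    std::vector<std::string> events;
    int depth = 0;

    void push_debug_group(const std::string & label) {
        events.push_back("push:" + label);
        ++depth;
    }

    void pop_debug_group() {
        assert(depth > 0 && "pop without matching push");
        events.push_back("pop");
        --depth;
    }

    void dispatch(const std::string & kernel) {
        events.push_back("dispatch:" + kernel);
    }
};

// Encode one operation. The debug group is compiled out entirely when the
// build defines GGML_METAL_NDEBUG (macro name taken from the discussion).
inline void encode_op(MockEncoder & enc, const std::string & op_name) {
#ifndef GGML_METAL_NDEBUG
    enc.push_debug_group(op_name);
#endif
    enc.dispatch(op_name);
#ifndef GGML_METAL_NDEBUG
    enc.pop_debug_group();
#endif
}
```

With the macro undefined, each call to `encode_op` records a push, a dispatch, and a pop, which is what makes every operation show up as its own group in a capture. With the macro defined, only the dispatch remains, so the debug calls cost nothing in such builds.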

@ggerganov
Owner

@jmousseau How do you create these Metal debug captures that you've shown in the screenshot? I'm not very familiar with Xcode - been trying to figure it out, but no luck so far. Would appreciate it if you could share some step-by-step instructions

@jmousseau
Contributor Author

@ggerganov Here are the steps I use, starting with Xcode project generation.

cmake -DGGML_METAL=ON -DBUILD_SHARED_LIBS=Off -G Xcode ..
open ggml.xcodeproj

Select the gpt-2-backend scheme at the top, to the right of the git info.

xcode-select-scheme

Click on the gpt-2-backend scheme again and choose Edit Scheme.... Configure the desired launch arguments and environment variables.

xcode-scheme-run-configuration

Traditionally, the easiest way to produce a Metal capture is with the "M" button above the debug console as shown below.

metal-capture-button

However, this will only capture GPU work enqueued after the button is pressed. For traditional graphics programs, this isn't a problem as the next rendered frame will be captured. In our case, the work will be queued (command buffers and encoders created) before you're able to initiate the capture.

Therefore, setting up the capture boundary programmatically is necessary. In main-backend.cpp, you'll want to initiate a capture before ggml_backend_graph_compute is called (requires #694).

if (ggml_backend_is_metal(model.backend)) {
    ggml_backend_metal_capture_next_compute(model.backend);
}

Run the program by pressing the play button. Once the GPU work completes, Xcode will automatically open the Metal debugger.
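
To illustrate why the capture has to be armed before the compute call, here is a small model of the behavior in plain C++. It assumes, as the function name suggests, that the capture applies only to the next graph compute and is then disarmed. `MockBackend`, `mock_capture_next_compute`, and `mock_graph_compute` are hypothetical stand-ins, not the ggml implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Metal backend context. The real backend's
// internals are not shown in this thread; this only models a one-shot flag.
struct MockBackend {
    bool capture_next_compute = false;
    std::vector<std::string> log;
};

// Analogue of ggml_backend_metal_capture_next_compute: arm the capture.
inline void mock_capture_next_compute(MockBackend & backend) {
    backend.capture_next_compute = true;
}

// Analogue of ggml_backend_graph_compute: if armed, bracket the GPU work
// with capture start/stop, then disarm so later calls are not captured.
inline void mock_graph_compute(MockBackend & backend) {
    const bool capturing = backend.capture_next_compute;
    backend.capture_next_compute = false;

    if (capturing) {
        backend.log.push_back("capture:start");
    }
    backend.log.push_back("compute");
    if (capturing) {
        backend.log.push_back("capture:stop");
    }
}
```

In this model, a compute call made before arming is not captured, the first call after arming is bracketed by start and stop markers, and later calls run uncaptured again. This mirrors the ordering described above: the capture boundary must exist before the command buffers and encoders are created.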
