
use the correct SYCL context for host USM allocations #7777

Merged: 1 commit into ggerganov:master on Jun 10, 2024

Conversation

@bashbaug (Contributor) commented Jun 5, 2024

Hello, I found an issue while testing SYCL with the OpenCL backend: SYCL host USM allocations are made against a different queue (and hence a different context) than the other allocations, which is undefined behavior according to the SYCL spec:

> Each USM allocation has an associated SYCL context, and any access to that memory must use the same context. Specifically, any SYCL kernel function that dereferences a pointer to a USM allocation must be submitted to a queue that was constructed with the same context that was used to allocate that memory. The explicit memory operation commands that take USM pointers have a similar restriction. (See Section 4.9.4.3 for details.) Violations of these requirements result in undefined behavior.
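To illustrate the hazard, here is a minimal standalone sketch (hypothetical code, not from llama.cpp): host USM is allocated through one queue, but accessed through a queue that may have been built on a different context.

```cpp
// Hypothetical illustration of the bug pattern, not the actual llama.cpp code.
#include <sycl/sycl.hpp>

int main() {
    sycl::queue host_q;                      // default device, context A
    sycl::queue dev_q{sycl::gpu_selector_v}; // GPU, possibly a different context B

    // The host allocation belongs to host_q's context (A).
    float* host_buf = sycl::malloc_host<float>(16, host_q);
    float* dev_buf  = sycl::malloc_device<float>(16, dev_q);

    // If dev_q's context is not A, this access is undefined behavior per the
    // SYCL spec; with the OpenCL backend it can surface as an error such as
    // PI_ERROR_OUT_OF_RESOURCES.
    dev_q.memcpy(dev_buf, host_buf, 16 * sizeof(float)).wait();

    sycl::free(dev_buf, dev_q);
    sycl::free(host_buf, host_q);
    return 0;
}
```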

The specific error I am seeing is:

```
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)
Exception caught at file:/home/bashbaug/git/llama.cpp/ggml-sycl.cpp, line:17172, func:operator()
SYCL error: CHECK_TRY_ERROR(g_syclStreams[sycl_ctx->device][0]->memcpy( data, (const char *)tensor->data + offset, size).wait()): Meet error in this line code!
  in function ggml_backend_sycl_get_tensor_async at /home/bashbaug/git/llama.cpp/ggml-sycl.cpp:17172
GGML_ASSERT: /home/bashbaug/git/llama.cpp/ggml-sycl.cpp:3069: !"SYCL error"
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
```

The changes in this PR make USM host allocations use the same queue as the other allocations, which works for both the OpenCL backend and the Level Zero backend (tested with an Intel integrated GPU and an Intel discrete GPU).
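In sketch form, the corrected pattern looks like this (illustrative only, assuming a single shared queue `q`; the actual change lives in ggml-sycl.cpp):

```cpp
// Illustrative fix pattern (not the actual diff in ggml-sycl.cpp): route the
// host allocation through the same queue, and therefore the same context,
// used for device allocations and kernel submissions.
#include <sycl/sycl.hpp>

int main() {
    sycl::queue q{sycl::gpu_selector_v};

    float* host_buf = sycl::malloc_host<float>(16, q);   // same context as q
    float* dev_buf  = sycl::malloc_device<float>(16, q);

    q.memcpy(dev_buf, host_buf, 16 * sizeof(float)).wait();  // well-defined

    sycl::free(dev_buf, q);
    sycl::free(host_buf, q);
    return 0;
}
```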

Specifically, I tested with:

```
$ ONEAPI_DEVICE_SELECTOR=opencl:* ./bin/main -m ~/Downloads/models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm none -mg 0 -s 999
$ ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:* ./bin/main -m ~/Downloads/models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm none -mg 0 -s 999
```

Both backends generated the same results.

@github-actions bot added the SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) label on Jun 5, 2024
@joeatodd (Collaborator) commented Jun 6, 2024

Hey @bashbaug, thanks for submitting this. It looks good; just running some tests internally & will get back to you 👍

@AidanBeltonS (Collaborator) left a comment:

Thanks for fixing this bug!

@NeoZhangJianyu (Collaborator) left a comment:

It's great! Thank you for fixing it!

@AidanBeltonS merged commit af4ae50 into ggerganov:master on Jun 10, 2024 (2 of 6 checks passed).
AidanBeltonS added a commit that referenced this pull request on Jun 10, 2024.
@airMeng mentioned this pull request on Jun 14, 2024.