
use the correct SYCL context for host USM allocations #7777

Merged: 1 commit into ggerganov:master on Jun 10, 2024

Conversation

@bashbaug (Contributor) commented Jun 5, 2024

Hello, I found an issue while testing SYCL with the OpenCL backend: SYCL host USM allocations are made against a different queue (and hence a different context) than the other allocations, which is undefined behavior according to the SYCL spec:

> Each USM allocation has an associated SYCL context, and any access to that memory must use the same context. Specifically, any SYCL kernel function that dereferences a pointer to a USM allocation must be submitted to a queue that was constructed with the same context that was used to allocate that memory. The explicit memory operation commands that take USM pointers have a similar restriction. (See Section 4.9.4.3 for details.) Violations of these requirements result in undefined behavior.
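To illustrate the hazard, here is a minimal standalone sketch (hypothetical code, not from llama.cpp): host USM is allocated through one queue, but accessed through a queue that may have been built on a different context.

```cpp
// Hypothetical illustration of the bug pattern, not the actual llama.cpp code.
#include <sycl/sycl.hpp>

int main() {
    sycl::queue host_q;                      // default device, context A
    sycl::queue dev_q{sycl::gpu_selector_v}; // GPU, possibly a different context B

    // The host allocation belongs to host_q's context (A).
    float* host_buf = sycl::malloc_host<float>(16, host_q);
    float* dev_buf  = sycl::malloc_device<float>(16, dev_q);

    // If dev_q's context is not A, this access is undefined behavior per the
    // SYCL spec; with the OpenCL backend it can surface as an error such as
    // PI_ERROR_OUT_OF_RESOURCES.
    dev_q.memcpy(dev_buf, host_buf, 16 * sizeof(float)).wait();

    sycl::free(dev_buf, dev_q);
    sycl::free(host_buf, host_q);
    return 0;
}
```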

The specific error I am seeing is:

```
Native API failed. Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES) -5 (PI_ERROR_OUT_OF_RESOURCES)
Exception caught at file:/home/bashbaug/git/llama.cpp/ggml-sycl.cpp, line:17172, func:operator()
SYCL error: CHECK_TRY_ERROR(g_syclStreams[sycl_ctx->device][0]->memcpy( data, (const char *)tensor->data + offset, size).wait()): Meet error in this line code!
  in function ggml_backend_sycl_get_tensor_async at /home/bashbaug/git/llama.cpp/ggml-sycl.cpp:17172
GGML_ASSERT: /home/bashbaug/git/llama.cpp/ggml-sycl.cpp:3069: !"SYCL error"
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
```

The changes in this PR make USM host allocations use the same queue as the other allocations, which works for both the OpenCL backend and the Level Zero backend (tested with an Intel integrated GPU and an Intel discrete GPU).
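In sketch form, the corrected pattern looks like this (illustrative only, assuming a single shared queue `q`; the actual change lives in ggml-sycl.cpp):

```cpp
// Illustrative fix pattern (not the actual diff in ggml-sycl.cpp): route the
// host allocation through the same queue, and therefore the same context,
// used for device allocations and kernel submissions.
#include <sycl/sycl.hpp>

int main() {
    sycl::queue q{sycl::gpu_selector_v};

    float* host_buf = sycl::malloc_host<float>(16, q);   // same context as q
    float* dev_buf  = sycl::malloc_device<float>(16, q);

    q.memcpy(dev_buf, host_buf, 16 * sizeof(float)).wait();  // well-defined

    sycl::free(dev_buf, q);
    sycl::free(host_buf, q);
    return 0;
}
```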

Specifically, I tested with:

```
$ ONEAPI_DEVICE_SELECTOR=opencl:* ./bin/main -m ~/Downloads/models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm none -mg 0 -s 999
$ ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:* ./bin/main -m ~/Downloads/models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm none -mg 0 -s 999
```

Both backends generated the same results.

@github-actions bot added the SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) label on Jun 5, 2024
@joeatodd (Collaborator) commented Jun 6, 2024

Hey @bashbaug, thanks for submitting this. It looks good; just running some tests internally & will get back to you 👍

@AidanBeltonS (Collaborator) left a comment:

Thanks for fixing this bug!

@NeoZhangJianyu (Collaborator) left a comment:

It's great! Thank you for fixing it!

@AidanBeltonS merged commit af4ae50 into ggerganov:master on Jun 10, 2024 (2 of 6 checks passed).
AidanBeltonS added a commit that referenced this pull request on Jun 10, 2024.
@airMeng mentioned this pull request on Jun 14, 2024.