ROCm on WSL #5275

justinkb · 2024-06-25T15:37:46Z

Recently, AMD released preview drivers for Windows that, together with userspace packages for WSL, make it possible to use ROCm through WSL. On Linux, however, Ollama detects AMD GPUs by checking for a loaded amdgpu driver and reading sysfs entries to determine various properties of the GPU. Neither is available in this WSL ROCm setup, nor is rocm-smi used to query VRAM size, usage, etc. I was wondering whether it would be feasible to add detection for this setup so it can be used anyway, even if some runtime information is unavailable. Is runtime knowledge of the available VRAM strictly necessary? Couldn't the user simply take care not to load too large a model and, failing that, accept that the ROCm runtime will hard-error on failing hipMallocs etc.? Perhaps we could warn users in the output that this might happen.

jmorganca · 2024-06-25T15:39:06Z

cc @dhiltgen

dhiltgen · 2024-06-25T15:56:02Z

The installation docs seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got. @justinkb if you have already done so, can you check out the following paths on your system?

ls /sys/class/kfd/kfd/topology/nodes/*
cat /sys/class/kfd/kfd/topology/nodes/*/properties
ls /sys/class/drm/card*/device
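
For anyone who prefers to gather this in one go, here is a minimal Python sketch that collects the same information as the three commands above. It assumes each KFD `properties` file consists of `name value` lines; the function names and the `root` parameter are hypothetical and exist only so the code can be pointed at a copy of a sysfs tree. It is not how Ollama performs discovery.

```python
import glob
import os


def parse_kfd_properties(text):
    """Parse 'name value' lines (assumed format) into a dict of strings."""
    props = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            props[parts[0]] = parts[1]
    return props


def collect_amdgpu_sysfs(root="/"):
    """Gather KFD node properties and DRM device entries under `root`.

    `root` is a hypothetical parameter; on a real system it stays "/".
    Missing or unreadable paths yield empty results instead of errors.
    """
    base = os.path.join(root, "sys/class")

    # Equivalent of: cat /sys/class/kfd/kfd/topology/nodes/*/properties
    nodes = {}
    for node_dir in sorted(glob.glob(os.path.join(base, "kfd/kfd/topology/nodes/*"))):
        name = os.path.basename(node_dir)
        try:
            with open(os.path.join(node_dir, "properties")) as f:
                nodes[name] = parse_kfd_properties(f.read())
        except OSError:
            nodes[name] = {}

    # Equivalent of: ls /sys/class/drm/card*/device
    drm = {}
    for dev_dir in sorted(glob.glob(os.path.join(base, "drm/card*/device"))):
        card = os.path.basename(os.path.dirname(dev_dir))
        try:
            drm[card] = sorted(os.listdir(dev_dir))
        except OSError:
            drm[card] = []

    return {"kfd_nodes": nodes, "drm_devices": drm}
```

Calling `collect_amdgpu_sysfs()` with no arguments inspects the live system; empty `kfd_nodes` would indicate that the KFD topology is not exposed.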

justinkb · 2024-06-25T16:09:47Z

I didn't realize the kernel driver would be installed and loadable(?) on WSL, since I got some Stable Diffusion and llama stuff working with just userspace, without amdgpu loaded (I guess in this setup the amdgpu driver might be a stub that forwards and translates calls to the actual Windows driver). I'm not at my PC right now, so I can't check the driver, but I'll see what I can get working myself, too.

On Tue, Jun 25, 2024, 5:56 PM Daniel Hiltgen wrote:
> The installation docs <https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-radeon.html> seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got.

justinkb · 2024-06-25T17:11:01Z

I just noticed those docs specify installing with "amdgpu-install -y --usecase=wsl,rocm --no-dkms", meaning the kernel driver source for DKMS won't be installed. That isn't to say getting it installed and loaded is impossible on WSL, but I doubt it will work, since I don't think any of the DRM subsystem is actually available in the WSL Linux kernel. In any case, compiling the DKMS driver isn't trivial, since I'd need to install the WSL2 kernel headers in the way DKMS expects them. And even if I get that set up, I can't imagine the actual amdgpu driver will load. I checked the sources: it is the same amdgpu driver used on native Linux, not some forwarding stub, so I don't see how it could work when the VM underlying WSL2 has no way to paravirtualize the actual GPU while Windows is also using it.

justinkb · 2024-06-27T20:40:48Z

> The installation docs seem to imply the amdgpu driver is installed. I'll have to set up a test system so I can poke around and see what discovery options we've got. @justinkb if you have already done so, can you check out the following paths on your system?
>
> ls /sys/class/kfd/kfd/topology/nodes/*
> cat /sys/class/kfd/kfd/topology/nodes/*/properties
> ls /sys/class/drm/card*/device

The devices in /sys/class/drm/card0 and /sys/class/drm/renderD128 just point to vgem, the virtual GEM provider; /sys/class/kfd is completely absent. And as expected, I wasn't able to load the amdgpu driver on WSL.
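
Given these findings, a detection fallback could be sketched roughly as follows. This is a hedged illustration, not Ollama's code: the function name, the return values, and the `root` parameter are hypothetical, and the check for `/dev/dxg` (the GPU paravirtualization device that WSL2 exposes) is an assumption not discussed in this thread. Its presence alone does not prove that the ROCm userspace is installed or usable.

```python
import os


def detect_rocm_environment(root="/"):
    """Classify the GPU environment from what the filesystem exposes.

    Returns:
      "native" - the KFD topology has at least one node (amdgpu loaded)
      "wsl"    - no KFD topology, but /dev/dxg exists (assumed WSL2 marker)
      "none"   - neither was found
    `root` is a hypothetical parameter for testing; normally it is "/".
    """
    kfd_nodes = os.path.join(root, "sys/class/kfd/kfd/topology/nodes")
    if os.path.isdir(kfd_nodes) and os.listdir(kfd_nodes):
        return "native"

    # Assumption: on WSL2 the GPU is reached through /dev/dxg rather than
    # through the amdgpu kernel driver, so sysfs discovery finds nothing.
    if os.path.exists(os.path.join(root, "dev/dxg")):
        return "wsl"

    return "none"
```

In the "wsl" case the caller would have no VRAM figures from sysfs, so it would need either a user-supplied value or a willingness to proceed with a warning, as suggested in the opening comment.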

justinkb · 2024-06-28T14:32:17Z

I managed to hack this into working order, see https://github.com/justinkb/ollama/tree/wsl-rocm-hack. I can confirm it works perfectly like this. Theoretically, I could write a Windows program that periodically updates the referenced text file containing the used memory, which would let Ollama monitor memory usage.
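
To illustrate the text-file idea, here is a minimal Python sketch of the reading side. The file format is an assumption (a single decimal integer giving used VRAM in bytes), as are the function names and the staleness threshold; the actual format used in the linked branch may differ.

```python
import os
import time


def read_used_vram(path, max_age_seconds=10.0):
    """Return used VRAM in bytes from a text file, or None if unusable.

    Assumed format: one non-negative decimal integer. A missing,
    unparsable, or stale file (not updated within `max_age_seconds`)
    yields None so the caller can fall back to "usage unknown".
    """
    try:
        age = time.time() - os.path.getmtime(path)
        with open(path) as f:
            value = int(f.read().strip())
    except (OSError, ValueError):
        return None
    if age > max_age_seconds or value < 0:
        return None
    return value


def free_vram(total_bytes, path, max_age_seconds=10.0):
    """Return estimated free VRAM in bytes, or None if usage is unknown."""
    used = read_used_vram(path, max_age_seconds)
    if used is None:
        return None
    return max(total_bytes - used, 0)
```

Treating a stale file as "unknown" rather than trusting the last value avoids acting on outdated numbers if the Windows-side updater stops running.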

xaxaxa7b9 · 2024-07-10T16:49:54Z

@justinkb how do i install your solution?

justinkb added the "feature request" (New feature or request) label · Jun 25, 2024

jmorganca added the "amd" (Issues relating to AMD GPUs and ROCm) and "wsl" (Issues using WSL) labels · Jun 25, 2024

dhiltgen self-assigned this Jun 25, 2024
