add hipBLAS for windows #135
Conversation
Although it compiled successfully, I saw that the model was not offloaded to the GPU.
That's why I'd like to request benchmark results. At minimum, please provide per-token latencies on your machine for CPU-only and GPU-only modes; GPU latency should be significantly lower if the new backend works. You can use the existing script measure_pexplexity.py for measuring.
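For reference, per-token latency can be measured generically like this (a minimal sketch; `eval_fn` is a stand-in for a model's per-token evaluation call, not the rwkv.cpp API):

```python
import time

def measure_per_token_latency(eval_fn, tokens):
    """Return average wall-clock latency per token, in milliseconds."""
    start = time.perf_counter()
    for token in tokens:
        eval_fn(token)  # stand-in for the model's eval call
    elapsed = time.perf_counter() - start
    return (elapsed / len(tokens)) * 1000.0
```

Run the same model and data once CPU-only and once with GPU offload, and compare the two averages.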
ggml_init_cublas: found 1 ROCm devices
Model: RWKV-novel-4-World-7B-20230810-ctx128k-ggml-f16.bin
Data: test.txt, 273 tokens (2 skipped)
Averages: loss 1.859, perplexity 6.419, latency 447 ms per token
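As a sanity check, the two reported numbers are internally consistent: perplexity is the exponential of the average cross-entropy loss, so exp(1.859) should be close to 6.419.

```python
import math

# Perplexity is exp(average cross-entropy loss).
loss = 1.859
perplexity = math.exp(loss)
print(round(perplexity, 2))  # close to the reported 6.419
```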
It was my mistake: I just found out that I needed to manually offload the context onto the GPU.
Is this result for CPU or GPU? In any case, a second number is needed for comparison.
This is a GPU test, but the model is not offloaded to the GPU correctly. Now by setting
@saharNooby I think this PR has been completed. |
Support hipBLAS #133