CUDA: fix matrix multiplication logic for tests #6667

JohannesGaessler · 2024-04-13T21:30:56Z

This PR fixes tests/test-backend-ops on Pascal. The issue is that the tests pass an instance of src1 with GGML_TYPE_F16. This does not happen during actual inference and was therefore not considered in the logic for selecting a matrix multiplication kernel. For Pascal cards other than the P100 there is no fast evaluation code available for this case either. But since this case does not appear during actual inference, it's fine to use the slow code for the tests only. Caveat: (I think) the tests still fail on Maxwell, but honestly I don't think it would be worth the effort to investigate and fix.
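To illustrate the kind of selection logic described above, here is a minimal host-side sketch. It is not the code changed in this PR: `Src1Type`, `Path`, `fast_fp16_available_sketch` and `select_mul_mat_path` are hypothetical names, and the compute capability check (fast FP16 on 6.0 and newer, except 6.1) is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical types for illustration only; these are not ggml identifiers.
enum class Src1Type { F32, F16 };
enum class Path { Fast, SlowFallback };

// Assumption: compute capability is encoded as major*100 + minor*10,
// so the P100 is 600 and consumer Pascal cards are 610.
// Assumption: fast FP16 arithmetic exists on 6.0 and newer, except 6.1.
constexpr bool fast_fp16_available_sketch(int cc) {
    return cc >= 600 && cc != 610;
}

// Picks an evaluation path for a matrix multiplication.
// src1 == F32 is the case seen during inference and is assumed to be
// covered by a tuned kernel on every supported device.
// src1 == F16 is only produced by the tests, so a slow fallback is
// acceptable when the device has no fast FP16 path.
constexpr Path select_mul_mat_path(Src1Type src1, int cc) {
    if (src1 == Src1Type::F32) {
        return Path::Fast;
    }
    return fast_fp16_available_sketch(cc) ? Path::Fast : Path::SlowFallback;
}

// Compile-time check of the case this PR is about:
// F16 src1 on a Pascal card other than the P100 takes the slow path.
static_assert(select_mul_mat_path(Src1Type::F16, 610) == Path::SlowFallback,
              "F16 src1 on cc 6.1 should use the slow fallback");
```

The point of the sketch is only that the F16 src1 case gets an explicit branch instead of falling through to a kernel that cannot handle it. The slow path is reached only for a combination that, per the description above, does not occur during inference.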

CUDA: fix matrix multiplication logic for tests

c6797da

JohannesGaessler mentioned this pull request Apr 13, 2024

Fix cuda mul mat for pascal cc==610 #6636

Closed

slaren approved these changes Apr 13, 2024


JohannesGaessler merged commit b5e7285 into ggerganov:master Apr 13, 2024
56 of 59 checks passed

tybalex pushed a commit to rubra-ai/tools.cpp that referenced this pull request Apr 17, 2024

CUDA: fix matrix multiplication logic for tests (ggerganov#6667)

4f22851
