
add way to compile ggml with intel mkl #804

Closed

Conversation

kannon92

I have been playing around with ollama and llama.cpp. I have an Intel laptop and I wanted to use BLAS.

llama.cpp supports Intel MKL and it seems to perform very well. I wanted to see if I could get this library to compile with MKL as well.

I think I got it working.

I have a GNU build where test-vec1 runs in about 10 seconds, while the Intel + MKL build runs in about 2 seconds.

@Green-Sky
Contributor

Green-Sky commented Apr 23, 2024

The CMake BLAS find module is generic and supports different libraries, like MKL.
See llama.cpp's CMakeLists.txt: https://github.com/ggerganov/llama.cpp/blob/4e96a812b3ce7322a29a3008db2ed73d9087b176/CMakeLists.txt#L297-L377

edit: you can use the vendor option to select "Intel":

elseif (${LLAMA_BLAS_VENDOR} MATCHES "Intel")
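
For reference, here is a minimal sketch of that generic approach, assuming a `ggml` CMake target and hypothetical `GGML_BLAS` / `GGML_BLAS_VENDOR` options modelled on llama.cpp's `LLAMA_BLAS` options; this is illustration only, not code from either repository:

```cmake
# Sketch: pick MKL through CMake's generic FindBLAS module by setting
# BLA_VENDOR before calling find_package(BLAS). Option names are hypothetical.
option(GGML_BLAS "ggml: use BLAS" OFF)
set(GGML_BLAS_VENDOR "Generic" CACHE STRING "ggml: BLAS library vendor")

if (GGML_BLAS)
    if (GGML_BLAS_VENDOR MATCHES "Intel")
        # LP64 MKL with threading; Intel10_64ilp and the *_seq variants also exist
        set(BLA_VENDOR Intel10_64lp)
    else()
        set(BLA_VENDOR ${GGML_BLAS_VENDOR})
    endif()

    find_package(BLAS)
    if (BLAS_FOUND)
        # GGML_USE_OPENBLAS was the define ggml.c checked at the time to enable
        # the CBLAS matrix-multiply path; header include dirs may still need to
        # be added by hand, since FindBLAS only reports libraries.
        target_compile_definitions(ggml PRIVATE GGML_USE_OPENBLAS)
        target_link_libraries(ggml PRIVATE ${BLAS_LIBRARIES})
    endif()
endif()
```

With a oneAPI install, sourcing setvars.sh (or otherwise setting MKLROOT) before configuring usually lets FindBLAS locate the MKL libraries.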

@kannon92
Author

Yeah, I use that code for llama.cpp, but when I was looking at this repo I noticed that BLAS was not set up in the same way.

Maybe I am confused because I don't exactly follow the relationship between llama.cpp and ggml.

In the CMakeLists file, this repo has explicit support for each BLAS option, whereas llama.cpp has a smarter way of detecting it.

I wasn't sure if refactoring ggml to match that approach is necessary. I tried to keep the logic similar to what is already present in this repo.
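
For comparison, here is a rough sketch of what an MKL option could look like in the explicit, per-library style this repo uses; the `GGML_MKL` option name, the `MKL CONFIG` package lookup, and the compile definitions are assumptions for illustration, not the contents of this PR:

```cmake
# Hypothetical option in the per-library style of ggml's CMakeLists; this is
# not the PR's actual code. oneMKL ships MKLConfig.cmake, which exports MKL::MKL.
option(GGML_MKL "ggml: use Intel MKL" OFF)

if (GGML_MKL)
    find_package(MKL CONFIG REQUIRED)   # needs the oneAPI environment (MKLROOT) set up
    target_link_libraries(ggml PRIVATE MKL::MKL)
    # Assumed defines: GGML_USE_OPENBLAS enables the CBLAS sgemm path in ggml.c,
    # and GGML_BLAS_USE_MKL switches the include from cblas.h to mkl.h.
    target_compile_definitions(ggml PRIVATE GGML_USE_OPENBLAS GGML_BLAS_USE_MKL)
endif()
```

The generic FindBLAS route sketched earlier does the same job in the end; the difference is only in how the library gets located and linked.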

@slaren
Collaborator

slaren commented Apr 23, 2024

Once upon a time, ggml consisted of two files, ggml.c and ggml.h. At the time it didn't make much sense to have a separate build script for ggml, so each project baked its own build scripts with ggml.c and ggml.h as additional sources. Now ggml has grown a lot and there are multiple backends that require intricate build processes, but the build model has not changed. As a result, the evolution of the build scripts happens mostly in the project repositories, mainly in llama.cpp, which is the most active one, and occasionally some of the changes are ported to the ggml repository, but usually with only the minimum effort needed to keep things working. Needless to say, this model does not scale anymore, and we should look to decouple the build script of ggml from the builds of the derived projects.

@kannon92
Author

Is the eventual goal that ggml is used as a tensor library for whisper.cpp and llama.cpp? It wasn't clear to me if this code is used in llama.cpp. I see that syncing is done manually, but I didn't see any linking or compilation of this library.

@slaren
Collaborator

slaren commented Apr 23, 2024

ggml is already used as the tensor library of whisper.cpp, llama.cpp and other projects. The changes to ggml usually happen in llama.cpp, but they are regularly synced back to this repository. However, each project has its own build scripts; there isn't a unified way to build ggml, and that's a problem.

@zhouwg
Contributor

zhouwg commented Apr 27, 2024

> ggml is already used as the tensor library of whisper.cpp, llama.cpp and other projects. The changes to ggml usually happen in llama.cpp, but they are regularly synced back to this repository. However, each project has its own build scripts; there isn't a unified way to build ggml, and that's a problem.

This is already done in the kantv project: a build script for llama.cpp, whisper.cpp and stablediffusion.cpp, but that script only works for Android.

kannon92 closed this Nov 1, 2024