sync : llama.cpp (CUDA ReLU, CPU-only with CUDA, bloom fix, etc) #607

Merged: 1 commit into master from sync, Nov 13, 2023

Conversation

ggerganov (Owner) commented Nov 13, 2023

Also reverts the CUDA memory pool changes, which were synced last time but were later reverted in llama.cpp.

copilot:all

FSSRepo (Collaborator) commented Nov 13, 2023

@slaren I think ggml_init_cublas() should be removed from ggml_init(...) by now. I was writing a model converter for stable-diffusion.cpp. The only thing I need is the gguf module, but it requires creating a ggml_context in order to add tensors to the gguf_context. However, when the project is compiled with GGML_CUBLAS=ON, ggml_init calls ggml_init_cublas unnecessarily, leading to the error mentioned in this comment.
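For reference, a minimal sketch of the converter use case described above: only the gguf writer is needed, but it still requires a ggml_context to host the tensors, so ggml_init() is unavoidable. The file name, metadata key, and tensor name below are placeholders, not from the original report.

```c
#include "ggml.h"

#include <stdio.h>
#include <string.h>

int main(void) {
    // a small context just to host the tensors added to the gguf file;
    // this ggml_init(...) call is the one that also triggers ggml_init_cublas()
    // when the library is built with GGML_CUBLAS=ON
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // placeholder tensor standing in for a converted model weight
    struct ggml_tensor * w = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 16);
    ggml_set_name(w, "my.weight");
    memset(w->data, 0, ggml_nbytes(w));

    // build the gguf file: some metadata plus the tensor
    struct gguf_context * gctx = gguf_init_empty();
    gguf_set_val_str(gctx, "general.architecture", "example");
    gguf_add_tensor(gctx, w);
    gguf_write_to_file(gctx, "output.gguf", /*only_meta=*/ false);
    printf("wrote output.gguf\n");

    gguf_free(gctx);
    ggml_free(ctx);
    return 0;
}
```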

ggerganov (Owner, Author)

With this change, you can now run with CUDA_VISIBLE_DEVICES=-1, and ggml_init_cublas() should do nothing in that case.
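A sketch of how a standalone tool might apply this workaround, assuming a POSIX environment for setenv(); the essential part is that the variable is set before the first ggml_init() call, which is where ggml_init_cublas() runs. It is equivalent to launching the binary with CUDA_VISIBLE_DEVICES=-1 in the shell.

```c
#include "ggml.h"

#include <stdlib.h>

int main(void) {
    // hide all CUDA devices before ggml is initialized
    setenv("CUDA_VISIBLE_DEVICES", "-1", /*overwrite=*/ 1);

    struct ggml_init_params params = {
        /*.mem_size   =*/ 1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    // with no visible devices, ggml_init_cublas() should do nothing here
    struct ggml_context * ctx = ggml_init(params);

    ggml_free(ctx);
    return 0;
}
```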

slaren (Collaborator) commented Nov 13, 2023

Once we move llama.cpp to ggml-backend, we will be able to remove the ggml-cuda calls from ggml.c and make the backend more self-contained, but as it is now, they are required to support the "old" way of using ggml-cuda. As @ggerganov says, though, it should be possible to use a CUDA build for CPU-only execution after ggerganov/llama.cpp#3946.

ggerganov merged commit 844dbb8 into master on Nov 13, 2023
10 checks passed
ggerganov deleted the sync branch on November 13, 2023 at 14:54
CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this pull request on Dec 18, 2023:
* It seems some new warnings were added recently that exposed this. I wrote the code that originally included this unused variable, and it is indeed not needed.