
Support Official llama.cpp docker/images. #7506

Closed · DirtyKnightForVi opened this issue on May 24, 2024 · 9 comments
Labels: enhancement (New feature or request)

@DirtyKnightForVi

Is there an official version of llama.cpp available as a Docker image now? I need to deploy it in a completely offline environment, and a non-containerized deployment makes installing all the build dependencies quite troublesome.

Plz

DirtyKnightForVi added the enhancement (New feature or request) label on May 24, 2024
@ngxson (Collaborator) commented May 24, 2024

We don't deploy it to Docker Hub, but we do have the GitHub Container Registry: https://github.com/ggerganov/llama.cpp/pkgs/container/llama.cpp
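For example, a minimal sketch of pulling and running the CPU image from that registry (the full tag, model path, and prompt are illustrative; the --run flag is handled by the image's entrypoint script, the same one that prints the command list shown further down):

# Pull the CPU "full" image from the GitHub Container Registry
docker pull ghcr.io/ggerganov/llama.cpp:full

# Mount a host directory with models and run inference through the image's entrypoint
docker run -v /path/to/models:/models ghcr.io/ggerganov/llama.cpp:full \
    --run -m /models/your-model.gguf -p "hello" -n 128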

@DirtyKnightForVi (Author)

Thank you very much. However, I would like to enter the container and compile a different branch of the project. Based on the current documentation and my attempts, this does not seem to be supported: the container consistently fails to start. Have I missed something?

(base) jiyin@jiyin:/media/jiyin/ResearchSpace$ sudo docker run -v /media/jiyin/ResearchSpace:/models 92ddd0cc4ed1 bash
Unknown command: bash
Available commands: 
  --run (-r): Run a model previously converted into ggml
              ex: -m /models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512
  --convert (-c): Convert a llama model into ggml
              ex: --outtype f16 "/models/7B/" 
  --quantize (-q): Optimize with quantization process ggml
              ex: "/models/7B/ggml-model-f16.bin" "/models/7B/ggml-model-q4_0.bin" 2
  --finetune (-f): Run finetune command to create a lora finetune of the model
              See documentation for finetune for command-line parameters
  --all-in-one (-a): Execute --convert & --quantize
              ex: "/models/" 7B
  --server (-s): Run a model on the server
              ex: -m /models/7B/ggml-model-q4_0.bin -c 2048 -ngl 43 -mg 1 --port 8080

@Galunid (Collaborator) commented May 26, 2024

You can try adding --entrypoint /bin/bash to the docker run command (instead of appending bash at the end) if you want to get inside the container.

Just note that, in general, recompiling things inside a container is not something that should be done: the changes won't persist once the container is removed. You can build multiple images and use Docker tags to differentiate between them.
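For instance, reusing the image ID and mount from your earlier command, something along these lines should drop you into a shell inside the container (a sketch, not an officially documented workflow):

# -it gives an interactive terminal; --entrypoint overrides the image's default entrypoint script
docker run -it --entrypoint /bin/bash -v /media/jiyin/ResearchSpace:/models 92ddd0cc4ed1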

@DirtyKnightForVi (Author)

What I actually want to do is try the changes on a branch that hasn't been merged into the main branch yet. Deploying the project from source on a completely offline machine is just too troublesome.

@Galunid (Collaborator) commented May 27, 2024

Then you need to build the image yourself before you deploy it. See README.md for more information.
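As a rough sketch (the branch name and local image tag are placeholders, and the Dockerfile path assumes the .devops/ layout used by the repository), building the full image from an unmerged branch and moving it to an offline machine could look like this:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout my-feature-branch        # the unmerged branch you want to try

# Build the CPU "full" image locally from that checkout
docker build -t local/llama.cpp:full -f .devops/full.Dockerfile .

# Export the image to a tarball, copy it over, then load it on the offline machine
docker save local/llama.cpp:full -o llama-cpp-full.tar
docker load -i llama-cpp-full.tar     # run this on the offline machine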

Galunid closed this as completed on May 27, 2024
@DirtyKnightForVi (Author)

> We don't deploy it to Docker Hub, but we do have the GitHub Container Registry: https://github.com/ggerganov/llama.cpp/pkgs/container/llama.cpp

May I follow up with a question: do the latest images in this list all include the content of the main branch? And does each tag suffix correspond to a specific version of the source code?

@ngxson (Collaborator) commented Jun 5, 2024

@DirtyKnightForVi The image tag corresponds to the Dockerfile it was built from and the commit it was built at:

For example, docker pull ghcr.io/ggerganov/llama.cpp:full-cuda--b1-2b33896 is built from full-cuda.Dockerfile and commit 2b33896

@DirtyKnightForVi (Author)

> @DirtyKnightForVi The image tag corresponds to the Dockerfile it was built from and the commit it was built at:
>
> For example, docker pull ghcr.io/ggerganov/llama.cpp:full-cuda--b1-2b33896 is built from full-cuda.Dockerfile and commit 2b33896

Thank you very much for your patient response.

@DirtyKnightForVi (Author)

Although I successfully ran the model using the CUDA image, it seems that the model was loaded onto the GPU but inference is being performed on the CPU. Did I miss something?

docker run --gpus all -v /path/llama_test:/models ghcr.io/ggerganov/llama.cpp:full-cuda--b1-5442939 -m /models/xx.gguf -p "hello" --n-gpu-layers 18
