Run a LLaMA model locally on a PC.
Based on ggerganov/llama.cpp, with a Flask server and web UI added on top.
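A minimal sketch of what the Flask layer looks like. The `/completion` endpoint name, request shape, and the `generate` helper are assumptions for illustration, not this project's actual API; the real server would forward the prompt to llama.cpp (e.g. via bindings or a subprocess) instead of the stub below.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate(prompt: str, n_predict: int = 128) -> str:
    # Stub: a real implementation would call into llama.cpp here.
    return f"[model output for: {prompt!r}]"

@app.route("/completion", methods=["POST"])
def completion():
    # Accept a JSON body like {"prompt": "...", "n_predict": 128}
    body = request.get_json(force=True)
    prompt = body.get("prompt", "")
    text = generate(prompt, n_predict=int(body.get("n_predict", 128)))
    return jsonify({"content": text})

if __name__ == "__main__":
    # Bind to all interfaces so the port mapping in Docker works.
    app.run(host="0.0.0.0", port=8000)
```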
Guideline (Chinese)
Download a GGUF model file from either source:
- Source: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main
- Baidu Netdisk: https://pan.baidu.com/s/1YvAYrDD6DfoxpwD2kT5n3w?pwd=1234
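After downloading, the file can be sanity-checked: valid GGUF files begin with the 4-byte ASCII magic `GGUF`. A small check (the file name in the example is illustrative, not a path this project requires):

```python
def is_gguf(path: str) -> bool:
    # GGUF files start with the ASCII magic bytes b"GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Example: is_gguf("llama-2-7b-chat.Q4_K_M.gguf")
```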
Build and run with Docker:

```shell
docker build -t songgs/llm-cpu -f deploy/Dockerfile .
docker run -it -p 8000:8000 songgs/llm-cpu
```
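With the container running, the server is reachable on localhost:8000. A standard-library client sketch follows; the `/completion` path and JSON payload shape are assumptions about this server's API, so adjust them to match the actual routes:

```python
import json
import urllib.request

def build_request(prompt: str,
                  url: str = "http://localhost:8000/completion") -> urllib.request.Request:
    # Build a JSON POST request for the server.
    data = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})

# Usage (requires the server to be up):
# with urllib.request.urlopen(build_request("Tell me a joke")) as resp:
#     print(json.load(resp))
```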