Support GGUF #16
chatllm.cpp is not a downstream app of llama.cpp; it is an app built on ggml, just as llama.cpp is. It supports some models that llama.cpp does not, and I won't wait for llama.cpp to support a model first and then port it to chatllm.cpp. So I need to maintain my own set of supported models. Furthermore, since the implementations of some models were developed independently of llama.cpp, some tensors (k/v/q specifically) may use different formats/shapes, which makes the two incompatible. Anyway, it seems possible to support GGUF for some models (e.g. LLaMA models). I will look into it later.
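To illustrate the kind of q/k incompatibility mentioned above: llama.cpp's conversion scripts, for example, reorder the rows of the Q/K projection weights so that each head's rotary-embedding halves are interleaved, while a project implementing RoPE differently may keep the halves contiguous. The sketch below is a hypothetical illustration of such a row permutation, not chatllm.cpp's actual code:

```python
def interleave_rope_rows(n_head: int, head_dim: int) -> list:
    """Row permutation that converts a 'split halves' Q/K layout
    (per head: rows 0..half-1, then rows half..head_dim-1) into an
    'interleaved pairs' layout. Illustrative only."""
    assert head_dim % 2 == 0
    half = head_dim // 2
    order = []
    for h in range(n_head):
        base = h * head_dim
        for i in range(half):
            order.append(base + i)         # row from the first half
            order.append(base + half + i)  # row from the second half
    return order


def apply_row_permutation(weight_rows, order):
    """Reorder the rows of a weight matrix (list of rows) by `order`."""
    return [weight_rows[r] for r in order]


if __name__ == "__main__":
    # One head of dimension 4: rows [0, 1, 2, 3] become [0, 2, 1, 3].
    print(interleave_rope_rows(1, 4))
```

Two implementations that disagree on this ordering will load each other's weights without error but produce garbage, which is why tensors can't simply be copied between formats.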
GGML is effectively no longer supported; all models moved to GGUF as the standard about a year ago. Are there any plans to support it here? I'm wondering what the limitations are for handling sliding-window attention in GGUF compared to GGML, if that is the problem.
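For reference, the GGUF container itself is straightforward to read: per the GGUF specification, a file starts with the magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count, all little-endian. A minimal header parser (demonstrated on a synthetic header, not a real model file):

```python
import struct

def parse_gguf_header(buf: bytes) -> dict:
    """Parse the fixed-size GGUF file header: magic, version,
    tensor count, and metadata key/value count (all little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", buf, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}


if __name__ == "__main__":
    # Synthetic header for demonstration: version 3, 291 tensors, 24 KV pairs.
    hdr = struct.pack("<4sIQQ", b"GGUF", 3, 291, 24)
    print(parse_gguf_header(hdr))
```

Reading the container is the easy part; the real work for supporting GGUF in an independent implementation is mapping its tensor names and layouts onto the project's own.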