Bug: Embedding endpoint takes exponential time to process a long unknown token #8029
Labels: bug, good first issue, medium severity
What happened?
I am feeding the server's embeddings endpoint a long sequence of "a" characters (which tokenizes to a single long unknown token), and the server's response time grows exponentially with the input length.
The embedding model was downloaded from: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
Judging by the load pattern (CPU at only ~10%, GPU idle), I suspect the problem is in the tokenizer rather than in inference.
Test script:
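The original script was not preserved here; below is a minimal sketch of the reproduction, assuming a llama.cpp server running locally with its `/embedding` endpoint, its base URL supplied via a hypothetical `LLAMA_SERVER_URL` environment variable (e.g. `http://localhost:8080`):

```python
import json
import os
import time
import urllib.request

# Assumption: server base URL is passed in via this environment variable.
SERVER_URL = os.environ.get("LLAMA_SERVER_URL", "")

def make_payload(n: int) -> bytes:
    """Build an /embedding request body containing n repeated 'a'
    characters, which this model's vocab treats as one long unknown token."""
    return json.dumps({"content": "a" * n}).encode("utf-8")

def time_embedding(n: int) -> float:
    """POST the payload and return the wall-clock response time in seconds."""
    req = urllib.request.Request(
        SERVER_URL + "/embedding",
        data=make_payload(n),
        headers={"Content-Type": "application/json"},
    )
    t0 = time.perf_counter()
    urllib.request.urlopen(req).read()
    return time.perf_counter() - t0

if __name__ == "__main__" and SERVER_URL:
    # Double the input length each round; the response time grows far
    # faster than the input size.
    for n in (1024, 2048, 4096, 8192):
        print(f"{n:>6} chars -> {time_embedding(n):.2f}s")
```

Doubling the input a few times is enough to make the slowdown obvious: each step should roughly double the response time if tokenization were linear, but it blows up instead.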
Name and Version
b3187 (2075a66)
What operating system are you seeing the problem on?
Windows
Relevant log output