[FIX] The chat fails when querying with /notes while running LLAMA3 API #831
Comments
When I add the OpenAI API (not the OpenAI-compatible LLAMA3 API), it works properly. So the issue appears when the /notes mode is used together with an OpenAI-compatible API.
Thanks for reporting @stevennt , I'll see if I can get a repro of the error on my end. Meanwhile, could you share any stack traces from your local server?
Seems that it is something to do with the tokenizer. The server log shows:

```
Configure tokenizer for unsupported model: llama3-70b-8192 in Khoj settings
```
Awesome, this is really helpful, thanks for including it. Will see if I can get it resolved today.
Thanks for the helpful report @stevennt ! I've pushed a fix. It'll be included in the next release (~1 week). If you want the fix sooner and you're using the docker image, you can build from the master branch.
I just pulled the master branch, which has the below commit fix. The issue still persists. @sabaimran
How are you running the project? Straight from source? Or from the docker image? Are you seeing the exact same stack trace?
What I did was pull the source, then run `docker-compose build`, `docker-compose down -v`, and `docker-compose up`. Maybe something needs to be set in this tokenizer field? Below is the (truncated) stack trace:

```
myaiabnkhoj-server-1 | [08:00:03.609363] INFO khoj.database.adapters: No default init.py:558
myaiabnkhoj-server-1 | [08:04:47.153646] DEBUG uvicorn.error: < TEXT 'What are the protocol.py:1172
```
This doesn't work: `What are the number of articles in Wikipedia?`
This works: `/general What are the number of articles in Wikipedia?`
Hmm interesting. You should run
API: I'm not using Ollama. I'm using a vendor-hosted LLAMA3 API. With that API: yes, /general works faultlessly, but the moment I use /notes or /default it breaks (regardless of whether or not I have uploaded any document). At the same time, the OpenAI API works faultlessly with both /general and /notes. Yes, I do run `docker-compose build` with the revised code. Other tests with text changes do show up, meaning the build process was OK.
Wrapping the API in LiteLLM: OK, I tried wrapping that LLAMA3 API (via Groq) behind LiteLLM as well, to ensure it follows the OpenAI format. The error is the same. @sabaimran

```
myaiabnkhoj-server-1 | [01:05:12.918899] DEBUG uvicorn.error: < TEXT 'How can I protocol.py:1172
```
And here is the log from the LiteLLM backend:
Could it be something with the streaming settings under the /notes mode? |
@sabaimran @debanjum Not sure if this is just me or whether I'm missing a crucial step, but so far: only the OpenAI endpoint works when talking to notes. All other OpenAI-compatible endpoints fail when talking to notes and only work in /general mode. I have tried:
So far, this part looks the most suspicious in the log: Btw, the backend has a myriad of configuration options (good job to Khoj for making it so configurable), but the documentation hasn't caught up with them yet :). It would be helpful to add them!
@stevennt , sorry for the long gap in responding to this! I see why this is happening. It's because the OpenAI API-compatible server you're using (in this case, via LiteLLM or Groq) fails when it sees a parameter it doesn't recognize in the request. I don't seem to be running into this error with any of the OpenAI API-compatible servers we're using. For LiteLLM, I think you should be able to configure the `drop_params` setting to change the behavior of this failure.
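To illustrate the failure mode, here is a minimal sketch of the idea behind dropping unrecognized parameters before forwarding a request to a strict OpenAI-compatible server. This is not Khoj's or LiteLLM's actual code: `SUPPORTED_PARAMS` and `drop_unsupported_params` are hypothetical names, and `response_format` is only one example of an extra field a provider might reject.

```python
# Illustrative sketch only; the whitelist and helper name are hypothetical.
SUPPORTED_PARAMS = {"model", "messages", "temperature", "max_tokens", "stream", "stop"}

def drop_unsupported_params(request: dict) -> dict:
    """Return a copy of the request containing only whitelisted keys."""
    return {k: v for k, v in request.items() if k in SUPPORTED_PARAMS}

request = {
    "model": "llama3-70b-8192",
    "messages": [{"role": "user", "content": "How many articles are in Wikipedia?"}],
    "temperature": 0.2,
    # An extra, non-standard field that a strict server might reject:
    "response_format": {"type": "json_object"},
}

clean = drop_unsupported_params(request)
assert "response_format" not in clean       # extra key dropped
assert clean["model"] == "llama3-70b-8192"  # supported keys kept
```

A permissive server (like OpenAI's) silently tolerates or handles extra fields, which would explain why only the OpenAI endpoint worked with /notes.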
Hey @sabaimran excellent. It was indeed that `drop_params` setting that solved the problem. Khoj seems to work now with the LiteLLM & Groq combination.
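For anyone hitting the same issue: to the best of my understanding of LiteLLM's settings (check the LiteLLM docs for your version), the fix above corresponds to this proxy configuration:

```yaml
# LiteLLM proxy config: silently drop request parameters
# that the underlying provider does not support.
litellm_settings:
  drop_params: true
```

In the LiteLLM Python SDK, the equivalent is setting `litellm.drop_params = True` before making completion calls.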
This is an initial pass to add documentation for all the knobs available on the Khoj Admin panel. It should shed some light onto what each admin setting is for and how they can be customized when self hosting. Resolves #831
Describe the bug
Problem: when querying notes, the chat interface does not respond and the backend emits a lot of error messages.
The setup:
This works:
`/general How many articles are in Wikipedia?`
This fails:
`How many articles are in Wikipedia?`
When it fails: whenever it needs to query the notes. If I specifically tell it not to (`/general`), it works.
Platform
If self-hosted
Docker from source, branch master at this commit: 3cfe5aa