How to set a max input length? #1132
I'm using an LLM served by Hugging Face's TGI. TGI enforces a max-input-tokens setting, which is typically set to the model's context length. Any request to the endpoint that exceeds this limit is rejected, and the rejection causes DSPy to throw an exception and exit.

Is there a way to limit the maximum input length of all queries sent to the LLM, so that they never exceed a set context length?
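For reference, TGI applies this check server-side: it tokenizes the incoming prompt and refuses anything over max-input-tokens. A minimal sketch of reproducing that check on the client, assuming the served model's tokenizer is available through `transformers` (the model name and the limit below are placeholders, not values from this issue):

```python
# Sketch: mirror TGI's input-length check client-side. Assumes `transformers`
# is installed and the tokenizer matches the model TGI is actually serving.
from transformers import AutoTokenizer

MAX_INPUT_TOKENS = 4096  # placeholder: whatever limit TGI was launched with

tokenizer = AutoTokenizer.from_pretrained("your-served-model")  # placeholder

def exceeds_limit(prompt: str) -> bool:
    """True if TGI would reject this prompt for being too long."""
    return len(tokenizer.encode(prompt)) > MAX_INPUT_TOKENS
```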
Comments

I don't know of one, but the documentation has some tips on handling this error: https://dspy-docs.vercel.app/docs/faqs#errors
Hi @hellwigt-eq, there isn't a clean way to set a max length on the final prompt sent to the LLM, but for this TGI use case you can add a validation check before the request is sent. This could be added via a PR, with exploration of flags for rejection, truncation + retry, etc.
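A rough sketch of such a validation check: DSPy does not expose this hook, so `MaxInputGuard` below is a hypothetical wrapper around any prompt-in/completions-out callable (such as a configured TGI client), with an illustrative `mode` flag for the reject-vs-truncate-and-retry behavior mentioned above:

```python
# Hypothetical pre-send guard, not a DSPy API. Wraps any callable that takes
# a prompt string and returns completions, enforcing the server's input
# limit before the request leaves the client.
class MaxInputGuard:
    def __init__(self, lm, tokenizer, max_input_tokens: int, mode: str = "truncate"):
        self.lm = lm                      # e.g. a configured TGI client
        self.tokenizer = tokenizer        # must match the served model
        self.max_input_tokens = max_input_tokens
        self.mode = mode                  # "reject" or "truncate"

    def __call__(self, prompt: str, **kwargs):
        ids = self.tokenizer.encode(prompt)
        if len(ids) > self.max_input_tokens:
            if self.mode == "reject":
                raise ValueError(
                    f"Prompt is {len(ids)} tokens; limit is {self.max_input_tokens}."
                )
            # Keep the tail: DSPy-style prompts put the current question last,
            # so truncating from the front drops the oldest material first.
            # (Re-tokenizing the decoded text can shift the count slightly.)
            ids = ids[-self.max_input_tokens :]
            prompt = self.tokenizer.decode(ids, skip_special_tokens=True)
        return self.lm(prompt, **kwargs)
```

Usage might look like `guard = MaxInputGuard(lm, tokenizer, 4096)` followed by `guard(prompt)`. Whether truncation should drop whole few-shot demos rather than raw tokens is exactly the kind of design question the flags above would need to settle.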
For anyone else who encounters this, I found that