feat(dspy): TensorRT LLM Integration #1096
Conversation
Hey @Anindyadeep, this is SO cool. In DSPy we tend to like client-server interfaces for LLMs, but there's a lot of demand for inference inside the script.
I see, makes sense. Let me know how we should proceed with this, thanks!
Hey @okhat, let me know your feedback, would love to iterate on it.
dsp/modules/tensorrt_llm.py
Outdated
self.runner, self._runner_kwargs = load_tensorrt_model(engine_dir=self.engine_dir, **engine_kwargs)
self.history = []

def _generate(self, prompt: Union[list[dict[str, str]], str], **kwargs) -> list[str]:
Update the return type for the outputted response and the kwargs dictionary.
Ideally, self._generate should only return the response, and all_kwargs can be updated separately. This can be done within the basic_request method, as is done for other LM providers.
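For illustration, a minimal sketch of that split following the pattern other LM providers use; the class name, attribute names, and the engine call are assumptions here, not the actual code in this PR:

```python
# Hypothetical sketch of the suggested structure, not the PR's tensorrt_llm.py.
class TensorRTLMSketch:
    def __init__(self, engine_dir: str, **default_kwargs) -> None:
        self.engine_dir = engine_dir
        self.kwargs = default_kwargs      # default generation settings
        self.history: list[dict] = []     # request/response log kept by the LM

    def _generate(self, prompt: str, **kwargs) -> list[str]:
        # Only produce completions here; no history or kwargs bookkeeping.
        raise NotImplementedError("engine call goes here")

    def basic_request(self, prompt: str, **kwargs) -> list[str]:
        # Merge per-call kwargs with defaults, call _generate, and record
        # history here, mirroring the other LM providers.
        all_kwargs = {**self.kwargs, **kwargs}
        completions = self._generate(prompt, **all_kwargs)
        self.history.append(
            {"prompt": prompt, "response": completions, "kwargs": all_kwargs},
        )
        return completions
```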
I made some changes; not sure if I understood it right, but let me know if this works. Thanks :)
Hi @Anindyadeep, thanks for this addition! Left a few comments; this should be ready to merge once those are addressed.
Merge from main
A quick question, any way to fix/bypass pre-commits for
Hi @Anindyadeep, thanks for the changes. Running `ruff check . --fix-only` should resolve these (not sure why it fails on untouched code, but sometimes modifying the file can raise linter errors, which are good to fix anyway en route with this PR).
Unfortunately not working, here are the logs:
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ ruff check . --fix-only
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
- 'extend-unsafe-fixes' -> 'lint.extend-unsafe-fixes'
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:113:9 contains implicit string concatenation; ignoring...
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:124:9 contains implicit string concatenation; ignoring...
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git add dsp/modules/lm.py
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git commit -m "fix(dspy): run ruff"
[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /home/vscode/.cache/pre-commit/patch1718675924-4390.
ruff.....................................................................Failed
- hook id: ruff
- exit code: 1
dsp/modules/lm.py:22:9: ANN201 Missing return type annotation for public function `basic_request`
dsp/modules/lm.py:25:9: ANN201 Missing return type annotation for public function `request`
dsp/modules/lm.py:28:9: ANN201 Missing return type annotation for public function `print_green`
dsp/modules/lm.py:31:9: ANN201 Missing return type annotation for public function `print_red`
dsp/modules/lm.py:34:9: ANN201 Missing return type annotation for public function `inspect_history`
dsp/modules/lm.py:116:9: ANN201 Missing return type annotation for public function `copy`
Found 6 errors.
ruff-format..............................................................Passed
isort....................................................................Passed
check yaml...........................................(no files to check)Skipped
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
check docstring is first.................................................Passed
check toml...........................................(no files to check)Skipped
check for added large files..............................................Passed
fix requirements.txt.................................(no files to check)Skipped
check for merge conflicts................................................Passed
debug statements (python)................................................Passed
pretty format json...................................(no files to check)Skipped
prettier.................................................................Passed
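For context, the ANN201 failures above only ask for explicit return type annotations on the listed public methods; the fix looks roughly like the following sketch (signatures are illustrative, not the exact dsp/modules/lm.py code):

```python
# Illustrative only; the real dsp/modules/lm.py signatures may differ.
class LM:
    # Before (ruff ANN201: missing return type annotation for public function):
    # def inspect_history(self, n=1):
    #     ...

    # After: adding an explicit return type satisfies ANN201.
    def inspect_history(self, n: int = 1) -> str:
        return ""
```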
change for ruff fix
No worries, just resolved it on my end and pushed the changes.
Thanks @arnavsinghvi11, up next: Triton Inference Server.
Solves issue #1094.
This PR adds a clean integration of TensorRT LLM with DSPy. Docs are also being added.
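For a sense of how this could be used, a rough sketch of wiring an in-script TensorRT LLM client into a DSPy program; the class name and constructor arguments are assumptions for illustration, not the merged API:

```python
import dspy
from dsp.modules.tensorrt_llm import TensorRTModel  # class name assumed for illustration

# Assumed constructor: point the LM at a prebuilt TensorRT-LLM engine directory.
trt_lm = TensorRTModel(engine_dir="/path/to/trt_engine")

# Standard DSPy wiring: make the LM the default backend and run a module in-script.
dspy.settings.configure(lm=trt_lm)
qa = dspy.Predict("question -> answer")
print(qa(question="What does TensorRT-LLM accelerate?").answer)
```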