
feat(dspy): TensorRT LLM Integration #1096

Merged: 13 commits merged into stanfordnlp:main from anindya/trtllm on Jun 18, 2024

Conversation

@Anindyadeep (Contributor)

Solves issue #1094

This PR adds a clean integration of TensorRT LLM with DSPy. Docs are also being added.
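
To make the intent concrete, here is a minimal usage sketch of what in-script inference could look like once this lands. The class name `TensorRTModel` and its constructor arguments are assumptions for illustration, not confirmed by this PR:

```python
# Hypothetical usage sketch -- the class name `TensorRTModel` and its
# constructor arguments are assumptions, not confirmed by this PR.
import dspy

# Point the wrapper at a pre-compiled TensorRT-LLM engine directory and
# the matching tokenizer; both values below are placeholders.
trt_lm = dspy.TensorRTModel(
    model="meta-llama/Llama-2-7b-chat-hf",
    engine_dir="/path/to/trt_engines",
)
dspy.settings.configure(lm=trt_lm)

# Any DSPy module now runs its LM calls in-process through the TensorRT runner.
qa = dspy.Predict("question -> answer")
print(qa(question="What does TensorRT-LLM optimize?").answer)
```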

@Anindyadeep changed the title from "TensorRT LLM Integration" to "feat(dspy): TensorRT LLM Integration" on Jun 3, 2024
@okhat (Collaborator) commented Jun 4, 2024

Hey @Anindyadeep , this is SO cool. In DSPy we tend to like client-server interfaces for LLMs but there's a lot of demand for inference inside the script.

@Anindyadeep (Contributor, Author) commented Jun 4, 2024

> Hey @Anindyadeep , this is SO cool. In DSPy we tend to like client-server interfaces for LLMs but there's a lot of demand for inference inside the script.

I see, that makes sense. Let me know how we should proceed with this, thanks!

@Anindyadeep (Contributor, Author)

Hey @okhat, let me know your feedback; I'd love to iterate on it.

Review thread on dsp/modules/tensorrt_llm.py (resolved):
self.runner, self._runner_kwargs = load_tensorrt_model(engine_dir=self.engine_dir, **engine_kwargs)
self.history = []

def _generate(self, prompt: Union[list[dict[str, str]], str], **kwargs) -> list[str]:
Collaborator:

Update the return type for the outputted response and kwargs dictionary.
Ideally, self._generate should only return the response, and all_kwargs can be updated separately. This can be done within the basic_request method, as is done for other LM providers.
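
A sketch of the split being suggested here, mirroring the pattern other LM providers use; the attribute and dictionary names are illustrative assumptions:

```python
# Illustrative sketch of the suggested split -- attribute names are assumptions.
from typing import Union

def basic_request(self, prompt: str, **kwargs) -> list[str]:
    # Merge per-call kwargs with the instance defaults here, not in _generate,
    # so _generate stays a pure inference call.
    all_kwargs = {**self.kwargs, **kwargs}
    response = self._generate(prompt, **all_kwargs)
    # Record the interaction so history-based utilities can replay it later.
    self.history.append({"prompt": prompt, "response": response, "kwargs": all_kwargs})
    return response

def _generate(self, prompt: Union[list[dict[str, str]], str], **kwargs) -> list[str]:
    # Run the TensorRT runner and return only the decoded completions.
    ...
```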

@Anindyadeep (Contributor, Author):

I made some changes; I'm not sure I understood it exactly right, but let me know if this works. Thanks :)

@arnavsinghvi11 (Collaborator)

Hi @Anindyadeep, thanks for this addition! I left a few comments; this should be ready to merge once those are addressed.

@Anindyadeep (Contributor, Author)

A quick question: is there any way to fix or bypass the pre-commit hooks for lm.py? They are flagging errors on code I have not added or changed.

@arnavsinghvi11 (Collaborator)

Hi @Anindyadeep, thanks for the changes. Running `ruff check . --fix-only` will correct the failing test.

(Not sure why it fails on untouched code, but sometimes modifying a file can surface linter errors, which are good to fix anyway en route with this PR.)

@Anindyadeep (Contributor, Author)

> ruff check . --fix-only

Unfortunately that's not working; here are the logs:

dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ ruff check . --fix-only               
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'extend-unsafe-fixes' -> 'lint.extend-unsafe-fixes'
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:113:9 contains implicit string concatenation; ignoring...
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:124:9 contains implicit string concatenation; ignoring...
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git add dsp/modules/lm.py          
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git commit -m "fix(dspy): run ruff"
[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /home/vscode/.cache/pre-commit/patch1718675924-4390.
ruff.....................................................................Failed
- hook id: ruff
- exit code: 1

dsp/modules/lm.py:22:9: ANN201 Missing return type annotation for public function `basic_request`
dsp/modules/lm.py:25:9: ANN201 Missing return type annotation for public function `request`
dsp/modules/lm.py:28:9: ANN201 Missing return type annotation for public function `print_green`
dsp/modules/lm.py:31:9: ANN201 Missing return type annotation for public function `print_red`
dsp/modules/lm.py:34:9: ANN201 Missing return type annotation for public function `inspect_history`
dsp/modules/lm.py:116:9: ANN201 Missing return type annotation for public function `copy`
Found 6 errors.

ruff-format..............................................................Passed
isort....................................................................Passed
check yaml...........................................(no files to check)Skipped
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
check docstring is first.................................................Passed
check toml...........................................(no files to check)Skipped
check for added large files..............................................Passed
fix requirements.txt.................................(no files to check)Skipped
check for merge conflicts................................................Passed
debug statements (python)................................................Passed
pretty format json...................................(no files to check)Skipped
prettier.................................................................Passed
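
For reference, ANN201 only asks each flagged public method in dsp/modules/lm.py to declare an explicit return type; hypothetical fixes would look like the following, where the concrete return types are assumptions about what those methods return. (The deprecation warning at the top of the log is unrelated: ruff is asking for `extend-unsafe-fixes` to move under the `lint` section of pyproject.toml.)

```python
# Hypothetical ANN201 fixes -- the annotated return types are assumptions
# about what these methods in dsp/modules/lm.py actually return.
from typing import Any

def basic_request(self, prompt: str, **kwargs) -> Any:
    ...

def inspect_history(self, n: int = 1) -> str:
    ...

def copy(self, **kwargs) -> "LM":
    ...
```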

Commit: "change for ruff fix"
@arnavsinghvi11 (Collaborator)

No worries, just resolved it on my end and pushed the changes. Thanks again @Anindyadeep!

@arnavsinghvi11 merged commit 09993d2 into stanfordnlp:main on Jun 18, 2024
4 checks passed
@Anindyadeep (Contributor, Author)

> No worries, just resolved it on my end and pushed the changes. Thanks again @Anindyadeep!

Thanks @arnavsinghvi11! Up next: Triton Inference Server.
