feat(dspy): TensorRT LLM Integration #1096
Conversation
Hey @Anindyadeep, this is SO cool. In DSPy we tend to like client-server interfaces for LLMs, but there's a lot of demand for inference inside the script.
I see, makes sense. Let me know how we should proceed with this, thanks!
Hey @okhat, let me know your feedback, would love to iterate on it.
dsp/modules/tensorrt_llm.py
Outdated
self.runner, self._runner_kwargs = load_tensorrt_model(engine_dir=self.engine_dir, **engine_kwargs)
self.history = []

def _generate(self, prompt: Union[list[dict[str, str]], str], **kwargs) -> list[str]:
Update the return type for the outputted response and the kwargs dictionary.
Ideally, self._generate should only return the response, and all_kwargs can be updated separately. This can be done within the basic_request method, as is done for other LM providers.
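For illustration, a minimal sketch of that split following the pattern other LM providers use; the class name, attribute names, and the engine call are assumptions here, not the actual code in this PR:

```python
# Hypothetical sketch of the suggested structure, not the PR's tensorrt_llm.py.
class TensorRTLMSketch:
    def __init__(self, engine_dir: str, **default_kwargs) -> None:
        self.engine_dir = engine_dir
        self.kwargs = default_kwargs      # default generation settings
        self.history: list[dict] = []     # request/response log kept by the LM

    def _generate(self, prompt: str, **kwargs) -> list[str]:
        # Only produce completions here; no history or kwargs bookkeeping.
        raise NotImplementedError("engine call goes here")

    def basic_request(self, prompt: str, **kwargs) -> list[str]:
        # Merge per-call kwargs with defaults, call _generate, and record
        # history here, mirroring the other LM providers.
        all_kwargs = {**self.kwargs, **kwargs}
        completions = self._generate(prompt, **all_kwargs)
        self.history.append(
            {"prompt": prompt, "response": completions, "kwargs": all_kwargs},
        )
        return completions
```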
I made some changes; not sure if I understood it right, but let me know if this works. Thanks :)
Hi @Anindyadeep, thanks for this addition! Left a few comments; this should be ready to merge once those are addressed.
Merge from main
A quick question, any way to fix/bypass pre-commits for
Hi @Anindyadeep, thanks for the changes. Running `ruff check . --fix-only` should resolve these (not sure why it fails on untouched code, but sometimes modifying the file can raise linter errors, which are good to fix anyway en route with this PR).
Unfortunately not working, here are the logs:
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ ruff check . --fix-only
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
- 'extend-unsafe-fixes' -> 'lint.extend-unsafe-fixes'
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:113:9 contains implicit string concatenation; ignoring...
warning: Docstring at dspy/teleprompt/mipro_optimizer.py:124:9 contains implicit string concatenation; ignoring...
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git add dsp/modules/lm.py
dspy-py3.9vscode ➜ /workspaces/dspy (anindya/trtllm) $ git commit -m "fix(dspy): run ruff"
[WARNING] Unstaged files detected.
[INFO] Stashing unstaged files to /home/vscode/.cache/pre-commit/patch1718675924-4390.
ruff.....................................................................Failed
- hook id: ruff
- exit code: 1
dsp/modules/lm.py:22:9: ANN201 Missing return type annotation for public function `basic_request`
dsp/modules/lm.py:25:9: ANN201 Missing return type annotation for public function `request`
dsp/modules/lm.py:28:9: ANN201 Missing return type annotation for public function `print_green`
dsp/modules/lm.py:31:9: ANN201 Missing return type annotation for public function `print_red`
dsp/modules/lm.py:34:9: ANN201 Missing return type annotation for public function `inspect_history`
dsp/modules/lm.py:116:9: ANN201 Missing return type annotation for public function `copy`
Found 6 errors.
ruff-format..............................................................Passed
isort....................................................................Passed
check yaml...........................................(no files to check)Skipped
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
check docstring is first.................................................Passed
check toml...........................................(no files to check)Skipped
check for added large files..............................................Passed
fix requirements.txt.................................(no files to check)Skipped
check for merge conflicts................................................Passed
debug statements (python)................................................Passed
pretty format json...................................(no files to check)Skipped
prettier.................................................................Passed
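For context, the ANN201 failures above only ask for explicit return type annotations on the listed public methods; the fix looks roughly like the following sketch (signatures are illustrative, not the exact dsp/modules/lm.py code):

```python
# Illustrative only; the real dsp/modules/lm.py signatures may differ.
class LM:
    # Before (ruff ANN201: missing return type annotation for public function):
    # def inspect_history(self, n=1):
    #     ...

    # After: adding an explicit return type satisfies ANN201.
    def inspect_history(self, n: int = 1) -> str:
        return ""
```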
change for ruff fix
No worries, just resolved it on my end and pushed the changes.
Thanks @arnavsinghvi11, up next: Triton Inference Server.
Solves issue #1094.
This PR adds a clean integration of TensorRT LLM with DSPy. Docs are also being added.
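For a sense of how this could be used, a rough sketch of wiring an in-script TensorRT LLM client into a DSPy program; the class name and constructor arguments are assumptions for illustration, not the merged API:

```python
import dspy
from dsp.modules.tensorrt_llm import TensorRTModel  # class name assumed for illustration

# Assumed constructor: point the LM at a prebuilt TensorRT-LLM engine directory.
trt_lm = TensorRTModel(engine_dir="/path/to/trt_engine")

# Standard DSPy wiring: make the LM the default backend and run a module in-script.
dspy.settings.configure(lm=trt_lm)
qa = dspy.Predict("question -> answer")
print(qa(question="What does TensorRT-LLM accelerate?").answer)
```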