-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add llamaindex integration #1170
add llamaindex integration #1170
Conversation
Thank you Jerry. We should merge this, but we're failing on Ruff checks, so that's the only thing we gotta resolve. |
ruff fix
dspy/predict/llamaindex.py
Outdated
|
||
def _output_keys_from_template(template: dsp.Template) -> InputKeys: | ||
"""Get input keys from template.""" | ||
# get only fields that are marked OldInputField and NOT OldOutputField |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo with L53 and 54 - likely should be "output keys" and "OldOutputField"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops yes will get to this soon! (probably later in afternoon/tonight)
self.predict_modules = [] | ||
for module in query_pipeline.module_dict.values(): | ||
if isinstance(module, DSPyComponent): | ||
self.predict_modules.append(module.predict_module) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these predict_modules used later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not explicitly, but isn't this needed to have the optimizer work? to explicitly register predict_modules as a field in the module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since LlamaIndexModule inherits from dspy.Module
, these are already registered under predictors, named_predictors which are then used in the optimizers. I tested the notebook locally and it works without the predict_modules (since the codebase doesn't recognize it 😄 ). Let me know if that makes sense and feel free to test it out on your end to confirm the expected behavior without it!
ex: predict_modules (not being used but aligns with what's done in named_predictors) ->
[ChainOfThought(GenerateAnswer(context_str, query_str -> answer
instructions='Answer questions with short factoid answers.'
context_str = Field(annotation=str required=True json_schema_extra={'desc': 'contains relevant facts', '__dspy_field_type': 'input', 'prefix': 'Context Str:'})
query_str = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Query Str:', 'desc': '${query_str}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'often between 1 and 5 words', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
))]
named_predictors (used in optimizers - already registered with dspy.Module
)
<bound method Module.named_predictors of predict_modules[0] = ChainOfThought(GenerateAnswer(context_str, query_str -> answer
instructions='Answer questions with short factoid answers.'
context_str = Field(annotation=str required=True json_schema_extra={'desc': 'contains relevant facts', '__dspy_field_type': 'input', 'prefix': 'Context Str:'})
query_str = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Query Str:', 'desc': '${query_str}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'often between 1 and 5 words', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
))>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh i'm not sure how this works, since these are buried in the query_pipeline.module_dict - i took a brief skim at the named_parameters code, but if you're saying that these are autoregistered because it recursively traverses all subfields then sure i trust you :)
Thanks @jerryjliu for this PR! I patched the ruff fix directly. left a minor comment for a typo and also just wanted to confirm if the PR was ready to merge from your end since there were some in-line TODOs. will merge once you're ready! |
"outputs": [], | ||
"source": [ | ||
"from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent\n", | ||
"from llamaindex import DSPyComponent, LlamaIndexModule\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ran into import errors while running the notebook - should be dspy.predict.llamaindex
(or you can add llamaindex imports here and then call dspy.LlamaIndexModule directly!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops that's embarassing, yes need to migrate imports
"outputs": [], | ||
"source": [ | ||
"from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent\n", | ||
"from llamaindex import DSPyComponent, LlamaIndexModule\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just flagging from above
}, | ||
"outputs": [], | ||
"source": [ | ||
"from llamaindex import DSPyPromptTemplate\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just flagging from above
ah so with the updated imports, this is failing the tests since llama_index is not a supported dependency. The llama_index import statements need to be wrapped in a try/except block to avoid triggering the errors like done here or moving it within the various initialization methods where the imports are needed as done here. I can take a look at this soon to resolve this locally as well. [update - feel free to disregard above. llamaindex needs to be a required dependency to import all its functions but updating poetry.lock within this PR breaks local DSPy tests. integration for libraries to avoid this import patch will be added via comprehensive integration tests in upcoming PRs!] |
f576bbf
to
e6373de
Compare
Thanks again @jerryjliu ! Exciting integration! |
woot oops sorry just seeing this. @arnavsinghvi11 thanks for landing!!! (btw feel free to DM me your twitter/linkedin handles, would be happy to promote on socials when this goes out) |
add integrations + a full cookbook showing llamaindex + dspy: