
Results of intermediate PromptNodes in Multi-PromptNode pipelines are buried #3878

Closed
1 task done
tstadel opened this issue Jan 17, 2023 · 3 comments · Fixed by #3892
@tstadel
Member

tstadel commented Jan 17, 2023

Describe the bug
Currently it's not possible to easily access the output of multiple PromptNodes that are chained together.
E.g. the code from the docs using a "question-generation" and a "question-answering" PromptNode in sequence only returns the answers as results, while the questions (which are an intermediate result) are buried inside PromptNode's invocation_context:

from getpass import getpass

from haystack import Document, Pipeline
from haystack.nodes.prompt import PromptTemplate, PromptNode, PromptModel

# This is to set up the OpenAI model:
api_key = getpass("Enter OpenAI API key:")

# Specify the model you want to use:
prompt_open_ai = PromptModel(model_name_or_path="text-davinci-003", api_key=api_key)

# This sets up the default model:
prompt_model = PromptModel()

# Now let's make one PromptNode use the default model and the other one the OpenAI model:
node_default_model = PromptNode(prompt_model, default_prompt_template="question-generation", output_variable="questions")
node_openai = PromptNode(prompt_open_ai, default_prompt_template="question-answering")

pipeline = Pipeline()
pipeline.add_node(component=node_default_model, name="prompt_node1", inputs=["Query"])
pipeline.add_node(component=node_openai, name="prompt_node_2", inputs=["prompt_node1"])
output = pipeline.run(query="not relevant", documents=[Document("Berlin is the capital of Germany")])
output["results"]

would produce something like this:

["Berlin"]

while the question that led to the answer is buried under output["meta"]["invocation_context"]["questions"]
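The buried access path can be illustrated with plain dicts, no Haystack required. The dict shape below simply mirrors the output described in this report and is illustrative only:

```python
# Minimal sketch (plain dicts, not Haystack API) of how the intermediate
# "questions" must be dug out of the nested pipeline output described above.
output = {
    "results": ["Berlin"],
    "meta": {
        "invocation_context": {"questions": ["What is the capital of Germany?"]}
    },
}

# Today: the answers sit at the root, but the questions are two levels deep.
answers = output["results"]
questions = output["meta"]["invocation_context"]["questions"]

print(answers)    # ['Berlin']
print(questions)  # ['What is the capital of Germany?']
```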

Expected behavior
Questions and answers can easily be accessed together, e.g. by exposing PromptNode's output_variable at the root level of node_output.
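The requested behavior could be sketched with a hypothetical helper that promotes the output_variable to the root of the node output. The function name and dict shape are illustrative assumptions, not Haystack API:

```python
# Hypothetical helper illustrating the feature request: copy a PromptNode's
# output_variable out of the nested invocation_context to the root of the
# node output, so questions and answers sit side by side.
def promote_output_variable(node_output: dict, output_variable: str) -> dict:
    ctx = node_output.get("meta", {}).get("invocation_context", {})
    if output_variable in ctx:
        node_output[output_variable] = ctx[output_variable]
    return node_output

output = {
    "results": ["Berlin"],
    "meta": {"invocation_context": {"questions": ["What is the capital of Germany?"]}},
}
output = promote_output_variable(output, "questions")

print(output["questions"])  # ['What is the capital of Germany?']
print(output["results"])    # ['Berlin']
```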

FAQ Check

@vblagoje
Member

We'll do this, @tstadel. Thanks for bringing this request to our attention.

@masci
Contributor

masci commented Jan 25, 2023

Since this issue was opened we made some changes to the PromptNode API, and I would argue that the intermediate output is not buried anymore. This is the current output from the pipeline run:

{
  'results': ['Berlin.'],
  'invocation_context': {'questions': ['What is the capital of Germany?']},
  'questions': ['What is the capital of Germany?'],
  'root_node': 'Query',
  'params': {},
  'query': 'not relevant',
  'documents': [<Document: {'content': 'Berlin is the capital of Germany', 'content_type': 'text', 'score': None, 'meta': {}, 'embedding': None, 'id': '51b1f05adecc6e656d68af93cc40bd9c'}>],
  'node_id': 'prompt_node_2'
}

The PR linked would produce this instead:

{
  'results': ['Berlin'],
  'invocation_context': {'questions': ['What is the capital of Germany?']},
  'questions': ['What is the capital of Germany?'],
  'root_node': 'Query',
  'params': {},
  'query': 'not relevant',
  'documents': [<Document: {'content': 'Berlin is the capital of Germany', 'content_type': 'text', 'score': None, 'meta': {}, 'embedding': None, 'id': '51b1f05adecc6e656d68af93cc40bd9c'}>],
  'node_id': 'prompt_node_2'
}

In my opinion, having output["questions"] instead of output["invocation_context"]["questions"] is not worth the price of the key duplication. If anything, the current version is clearer to me: it tells me that the questions were an intermediate product of the pipeline run, not the final result.

@tstadel if you still think we should move the key up to the root, @vblagoje's PR looks good to me; otherwise I would leave the code as is.

@tstadel
Member Author

tstadel commented Jan 25, 2023

I would disagree that the current version is clearer. Especially after reading the docstring of PromptNode's output_variable, I would expect it to be at the root level. Additionally, all other nodes accumulate their results at the root level as well (e.g. retrievers in a QA pipeline also store documents there, which is an intermediate result too). Of course, those override the results of previously executed nodes of the same type, but PromptNode, being a multi-purpose node, is different here.
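The precedent tstadel refers to can be sketched with plain dicts (not Haystack API, purely illustrative): in a typical retriever-reader QA pipeline, the retriever's intermediate result already lands at the root of the output next to the final answers.

```python
# Illustrative sketch: in a retriever-reader QA pipeline, the intermediate
# "documents" from the retriever sit at the root of the output alongside the
# reader's final "answers" - the accumulation behavior described above.
qa_output = {
    "query": "What is the capital of Germany?",
    "documents": ["Berlin is the capital of Germany"],  # intermediate (retriever)
    "answers": ["Berlin"],                              # final (reader)
}

# Both intermediate and final results are directly accessible at the root:
print(qa_output["documents"])  # ['Berlin is the capital of Germany']
print(qa_output["answers"])    # ['Berlin']
```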
