
Improve PromptNode integration with other nodes in Pipelines #3877

Closed
tstadel opened this issue Jan 17, 2023 · 1 comment · Fixed by #4378


tstadel commented Jan 17, 2023

Is your feature request related to a problem? Please describe.
PromptNode works differently than all other nodes in two respects:

  • it has its own invocation context that is completely isolated from other node_inputs and node_outputs
  • by default, it runs in batch_mode (executing one prompt per document)

This makes it difficult for users to replace an existing node with PromptNode, and it makes experimenting with different pipeline setups harder. Haystack users currently need to learn how to use PromptNode instead of just connecting it like any other node.

Describe the solution you'd like
PromptNode should be as simple to use as any other node. Especially for existing use cases like question answering, it is currently too hard to use PromptNode. See this example:

from typing import List
from haystack import Pipeline
from haystack.nodes import BaseComponent, PromptNode
from haystack.schema import Answer, Document

# Adapter to make PromptNode usable like any other Generator or Reader node:
# - sets the query in PromptNode's invocation context
# - concatenates all retrieved documents into one context document so that PromptNode finds the answer across all documents
# This could easily be replaced by a Shaper once it's available
class QAPromptInputAdapter(BaseComponent):
    outgoing_edges = 1

    def run(self, query: str, documents: List[Document]):
        context = "\n\n".join(f"Document {doc.id}:\n{doc.content}" for doc in documents)
        return {"meta": {"invocation_context": {"questions": [query]}}, "documents": [Document(content=context)]}, "output_1"

    def run_batch(self):
        pass

# This could easily be replaced by a Shaper once it's available
class QAPromptOutputAdapter(BaseComponent):
    outgoing_edges = 1

    def run(self, results: List[str]):
        return {"answers": [Answer(answer=result, type="generative") for result in results]}, "output_1"

    def run_batch(self):
        pass

node = PromptNode(default_prompt_template="question-answering")
input_adapter = QAPromptInputAdapter()
output_adapter = QAPromptOutputAdapter()
pipe = Pipeline()
pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
pipe.add_node(component=input_adapter, name="input_adapter", inputs=["retriever"])
pipe.add_node(component=node, name="prompt_node", inputs=["input_adapter"])
pipe.add_node(component=output_adapter, name="output_adapter", inputs=["prompt_node"])

output = pipe.run(query="Who are the parents of Arya Stark?")
output["answers"]

Instead, it should look something like this:

node = QAPromptNode(default_prompt_template="question-answering")
pipe = Pipeline()
pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
pipe.add_node(component=node, name="prompt_node", inputs=["retriever"])

output = pipe.run(query="Who are the parents of Arya Stark?")
output["answers"]

Here, QAPromptNode could easily be replaced by any Reader or Generator.

So I suggest the following:

  • introduce a composite node QAPromptNode, an aggregate consisting of a PromptNode with one input Shaper and one output Shaper under the hood
  • use the default query and queries params like any other node
  • concatenate documents so that PromptNode doesn't run in batch_mode, e.g. as already implemented for OpenAIAnswerGenerator in:
    def _build_prompt(self, query: str, documents: List[Document]) -> Tuple[str, List[Document]]:
  • convert results to answers like any other Reader or Generator does
  • optionally: use the same input and output context as any other node. output_variable can still set the key to enable multiple PromptNode results (see Results of intermediate PromptNodes in a Multi-PromptNode-Pipelines are burried #3878).
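The document-concatenation step from the third bullet can be sketched framework-free. Note this is only an illustration: the Doc class and concat_documents helper below are hypothetical stand-ins, not Haystack APIs, and max_chars is a crude proxy for the prompt token limit that, as noted above, only PromptNode can really know:

```python
from dataclasses import dataclass, field
from typing import List
import uuid

# Minimal stand-in for a retrieved document (not the Haystack Document class).
@dataclass
class Doc:
    content: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])

def concat_documents(docs: List[Doc], max_chars: int = 4000) -> Doc:
    """Join all retrieved documents into one context document so that a
    single prompt sees every document, instead of one prompt per document
    (which is what batch_mode effectively does today).
    max_chars is a crude stand-in for the model's token limit."""
    parts = [f"Document {d.id}:\n{d.content}" for d in docs]
    joined = "\n\n".join(parts)[:max_chars]
    return Doc(content=joined, id="concatenated")
```

With a concatenation like this in front of PromptNode, one prompt covers all retrieved documents, matching how OpenAIAnswerGenerator builds its prompt.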

Describe alternatives you've considered

  • Stay with PromptNode and Shaper only. However, especially for experimenting, supporting the main use cases like generative QA should be as easy as with any other node, so that users have the opportunity to replace existing nodes with PromptNode.
  • Another idea would be to add input_shaper and output_shaper params to PromptNode. These could be set by default for certain default_prompt_template values, e.g. setting the corresponding shapers for question_answering. The problem here is that PromptNode would behave completely differently depending on which template you use.
  • Another idea would be to add the input- and output-shaping functionality to PromptNode itself. This would be even easier for users to handle, as there are fewer components to know (only PromptNode; no QAPromptNode, no Shaper). It would also let the shaping use PromptNode's internals, e.g. the model tokenizer, which improves/eases shaping (e.g. for concat_docs, where there is a token limit for the prompt that only PromptNode can know). However, it would also mean that most of Shaper's functionality needs to be implemented within PromptNode, which makes PromptNode pretty complex, in addition to the drawbacks of the previous alternative.
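To make the second alternative concrete, here is a toy, framework-free sketch of how per-template default shapers could be wired into a node. All names (MiniPromptNode, qa_input_shaper, qa_output_shaper, DEFAULT_SHAPERS) are hypothetical illustrations, not an existing Haystack API, and the LLM call is stubbed out:

```python
from typing import Any, Callable, Dict

ShaperFn = Callable[[Dict[str, Any]], Dict[str, Any]]

# Hypothetical default shapers for the question-answering template.
def qa_input_shaper(inputs: Dict[str, Any]) -> Dict[str, Any]:
    # Map the pipeline's `query` into the template's `questions` variable.
    shaped = dict(inputs)
    shaped["questions"] = [inputs["query"]]
    return shaped

def qa_output_shaper(outputs: Dict[str, Any]) -> Dict[str, Any]:
    # Map raw string results to an `answers` key, like a Reader would return.
    shaped = dict(outputs)
    shaped["answers"] = [{"answer": r, "type": "generative"} for r in outputs["results"]]
    return shaped

class MiniPromptNode:
    """Toy node showing how per-template default shapers could be selected."""
    DEFAULT_SHAPERS: Dict[str, Any] = {"question-answering": (qa_input_shaper, qa_output_shaper)}

    def __init__(self, default_prompt_template: str):
        self.input_shaper, self.output_shaper = self.DEFAULT_SHAPERS.get(
            default_prompt_template, (lambda x: x, lambda x: x)
        )

    def run(self, **inputs: Any) -> Dict[str, Any]:
        shaped = self.input_shaper(inputs)
        # A real PromptNode would invoke the LLM here; we just echo the question.
        results = [f"stub answer to: {q}" for q in shaped.get("questions", [])]
        return self.output_shaper({"results": results})
```

The sketch also makes the drawback visible: the node's run-time behavior silently depends on which default_prompt_template was chosen at construction time.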

Additional context
We should not lose the great flexibility that PromptNode offers, of course. But some clever decisions here could improve the user experience a lot.


tstadel commented Jan 18, 2023

@vblagoje I made some changes:

  • added QAPromptOutputAdapter to the code example, as it was missing for mimicking an ordinary QA pipeline
  • better integrated Shaper into the picture
  • changed the suggestion to introduce a composite node class QAPromptNode instead of changing PromptNode itself
