## ExtractiveQAPipeline

```python
result["answers"]
>>> [{ 'answer': 'Fang',
'score': 13.26807975769043,
'probability': 0.9657130837440491,
'context': """Криволапик (Kryvolapyk, kryvi lapy "crooked paws")
===Fang (Hagrid's dog)===
*Chinese (PRC): 牙牙 (ya2 ya) (from 牙 "tooth", 牙,"""
...}]
```
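
ExtractiveQAPipeline combines the [Retriever](/usage/retriever) and the [Reader](/usage/reader). For reference, a minimal end-to-end setup that produces output like the above might look as follows (a sketch: the document store, model names, and import paths are assumptions and may differ between Haystack versions):

```python
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.sparse import ElasticsearchRetriever
from haystack.reader.farm import FARMReader
from haystack.pipeline import ExtractiveQAPipeline

# Assumes a running Elasticsearch instance that already contains your documents
document_store = ElasticsearchDocumentStore(host="localhost", index="document")
retriever = ElasticsearchRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)
result = pipeline.run(query="What is the name of Hagrid's dog?",
                      top_k_retriever=10, top_k_reader=1)
```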

For more examples that showcase ExtractiveQAPipeline, check out one of our tutorials [here](/tutorials/first-qa-system) or [here](/tutorials/without-elasticsearch).

## DocumentSearchPipeline

We typically pass the output of the Retriever to another component such as the Reader or the Generator. However, we can use the Retriever by itself for semantic document search to find the documents most relevant to our query.

DocumentSearchPipeline wraps the [Retriever](/usage/retriever) into a pipeline. Note that this wrapper does not endow the Retriever with additional functionality; it simply allows it to be used consistently with other Haystack Pipeline objects and with the same familiar syntax. Creating this pipeline is as simple as passing the Retriever into the pipeline’s constructor:

```python
pipeline = DocumentSearchPipeline(retriever=retriever)

query = "Tell me something about that time when they play chess."
result = pipeline.run(query, top_k_retriever=2)
```

The pipeline returns a Python dictionary with the retrieved documents accessible via the `documents` key:

```python
result["documents"]
>>> [{'text': "used Seamus Finnigan's pieces, which offered him conflicting advice because they knew that he was not a good or experienced player. Murphy McNully and Barnaby LeeDuring the Christmas holidays in the 1984-1985 school year, Rowan Khanna got a new Wizard's Chess set from their parents...",
'score': 24.10395,
'probability': 0.9531577010470198,
...}]
```
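
Because the result is a plain dictionary, post-processing is straightforward. For instance, a quick inspection of the candidates might look like this (a sketch based on the keys shown in the output above):

```python
# Print each candidate's probability and the start of its text
for doc in result["documents"]:
    print(f"{doc['probability']:.3f}  {doc['text'][:80]}...")
```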

## GenerativeQAPipeline

Unlike extractive QA, which produces an answer by extracting a text span from a collection of passages, generative QA works by producing free text answers that need not correspond to a span of any document. Because the answers are not constrained by text spans, the Generator is able to create answers that are more appropriately worded compared to those extracted by the Reader. Therefore, it makes sense to employ a generative QA system if you expect answers to extend over multiple text spans, or if you expect answers to not be contained verbatim in the documents.

GenerativeQAPipeline combines the [Retriever](/usage/retriever) with the [Generator](/usage/generator). To create an answer, the Generator uses the internal factual knowledge stored in the language model’s parameters in addition to the external knowledge provided by the Retriever’s output.
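
A typical combination pairs a dense retriever with a RAG generator, for example (a sketch; the model names and import paths are assumptions and may differ between Haystack versions):

```python
from haystack.retriever.dense import DensePassageRetriever
from haystack.generator.transformers import RAGenerator

# Assumes document_store already exists and is populated.
# Remember to call document_store.update_embeddings(retriever)
# if the documents' embeddings have not been computed yet.
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
)
generator = RAGenerator(model_name_or_path="facebook/rag-token-nq")
```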

You can build a GenerativeQAPipeline by simply placing the individual components inside the pipeline’s constructor:

```python
pipeline = GenerativeQAPipeline(generator=generator, retriever=retriever)

result = pipeline.run(query="Who opened the Chamber of Secrets?", top_k_retriever=10, top_k_generator=1)
```

You can access the answer via the `answers` key:

```python
result["answers"]
>>> [{'query': 'Who opened the Chamber of Secrets?',
'answer': ' tom riddle',
...}]
```

For more examples on using GenerativeQAPipeline, check out our tutorials where we implement generative QA systems with [RAG](/tutorials/retrieval-augmented-generation) and [LFQA](/tutorials/lfqa).

## SearchSummarizationPipeline

Summarizer helps make sense of the Retriever’s output by creating a summary of the retrieved documents. This is useful for performing a quick sanity check and confirming the quality of candidate documents suggested by the Retriever, without having to inspect each document individually.
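
A Summarizer node might be instantiated like this (a sketch; the model name and import path are assumptions and may differ between Haystack versions):

```python
from haystack.summarizer.transformers import TransformersSummarizer

# Any summarization model from the Hugging Face model hub should work here
summarizer = TransformersSummarizer(model_name_or_path="google/pegasus-xsum")
```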

SearchSummarizationPipeline combines the [Retriever](/usage/retriever) with the [Summarizer](/usage/summarizer). Below is an example of an implementation.

```python
pipeline = SearchSummarizationPipeline(summarizer=summarizer, retriever=retriever)

result = pipeline.run(query="Describe Luna Lovegood.", top_k_retriever=10, generate_single_summary=True)
```


You can access the output via the `documents` key. Depending on whether you set `generate_single_summary` to `True` or `False`, the output will be either a single summary of all documents or one summary per document.

```python
result['documents']
>>> [{'text': "Luna Lovegood is the only known member of the Lovegood family whose first name is not of Greek origin, rather it is of Latin origin. Her nickname, 'Loony,' refers to the moon and its ties with insanity, as it is short for 'lunatic' she is the goddess of the moon, hunting, the wilderness and the gift of taming wild animals.",
...}]
```
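
To get one summary per retrieved document instead, make the same call with the flag flipped:

```python
# One summary per document rather than a single joint summary
result = pipeline.run(query="Describe Luna Lovegood.",
                      top_k_retriever=10,
                      generate_single_summary=False)
```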

## TranslationWrapperPipeline

Translator components bring the power of machine translation into your QA systems. Say your knowledge base is in English but the majority of your user base speaks German. With a TranslationWrapperPipeline, you can chain together:

- The [Translator](/usage/translator), which translates the incoming query from the source language into the target language (e.g. German into English)
- A search pipeline, such as ExtractiveQAPipeline or DocumentSearchPipeline, which executes the translated query against a knowledge base
- Another Translator, which translates the search pipeline's results from the target language back into the source language (e.g. English into German)
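
The two translator nodes used below might be created from OPUS-MT models like this (a sketch; the model names and import path are assumptions and may differ between Haystack versions):

```python
from haystack.translator.transformers import TransformersTranslator

de_en_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-de-en")
en_de_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-de")
```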

After wrapping your search pipeline between two translation nodes, you can query it like you normally would, that is, by calling the `run()` method with a query in the desired language. Here’s an example of an implementation:

```python
pipeline = TranslationWrapperPipeline(input_translator=de_en_translator,
output_translator=en_de_translator,
pipeline=extractive_qa_pipeline)

query = "Was lässt den dreiköpfigen Hund weiterschlafen?" # What keeps the three-headed dog asleep?

result = pipeline.run(query=query, top_k_retriever=10, top_k_reader=1)
```

You can access the answer, along with other information such as the model’s confidence and the original context, via the `answers` key:

```python
result["answers"]
>>> [{'answer': 'der Klang der Musik',
'score': 9.269367218017578,
'probability': 0.4444255232810974,
'context': "test weakness was the inability to resist falling asleep to the sound of music,"
...}]
```

## FAQPipeline

FAQPipeline wraps the [Retriever](/usage/retriever) into a pipeline and allows it to be used for question answering with FAQ data. Compared to other types of question answering, FAQ-style QA is significantly faster. However, it’s only able to answer FAQ-type questions because this type of QA matches queries against questions that already exist in your FAQ documents.

For this task, we recommend using the Embedding Retriever with a sentence similarity model such as `deepset/sentence_bert`. Here’s an example of an FAQPipeline in action:

```python
pipeline = FAQPipeline(retriever=retriever)
query = "How to reduce stigma around Covid-19?"
result = pipeline.run(query=query, top_k_retriever=1)
```

```python
result["answers"]
>>> [{'answer': 'People can fight stigma and help, not hurt, others by providing social support. Counter stigma by learning and sharing facts. Communicating the facts that viruses do not target specific racial or ethnic groups and how COVID-19 actually spreads can help stop stigma.',
...}]
```
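
The `retriever` passed to FAQPipeline above might be set up like this (a sketch; the import path is an assumption, and the document store must already contain your FAQ documents):

```python
from haystack.retriever.dense import EmbeddingRetriever

retriever = EmbeddingRetriever(document_store=document_store,
                               embedding_model="deepset/sentence_bert")

# Compute and store embeddings for the FAQ questions
document_store.update_embeddings(retriever)
```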

Check out our [tutorial](/tutorials/existing-faqs) for more information on FAQPipeline.
