This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

Docs V0.10 #164

Merged
merged 7 commits into from
Sep 23, 2021
Document most similar documents pipeline
brandenchan committed Sep 20, 2021
commit a21bec7f580a7448d798b70dafc8179a0b6f7d1e
33 changes: 31 additions & 2 deletions docs/latest/components/ready_made_pipelines.mdx
@@ -43,7 +43,7 @@ We typically pass the output of the Retriever to another component such as the R

`DocumentSearchPipeline` wraps the [Retriever](/components/retriever) into a pipeline. Note that this wrapper does not endow the Retrievers with additional functionality but instead allows them to be used consistently with other Haystack Pipeline objects and with the same familiar syntax. Creating this pipeline is as simple as passing the Retriever into the pipeline’s constructor:

```python
``` python
pipeline = DocumentSearchPipeline(retriever=retriever)

query = "Tell me something about that time when they play chess."
@@ -128,7 +128,7 @@ result = pipeline.run(query=query, params={"retriever": {"top_k": 10}, "reader":

You may access the answer and other information like the model’s confidence and original context via the `answers` key, in this manner:

```python
``` python
result["answers"]
>>> [{'answer': 'der Klang der Musik',
'score': 9.269367218017578,
@@ -209,4 +209,33 @@ Output:
],
...
}
```

## MostSimilarDocumentsPipeline

This pipeline is used to find the most similar documents to a given document in your document store.

You will need to first make sure that your indexed documents have attached embeddings.
You can generate and store their embeddings using the `DocumentStore.update_embeddings()` method.
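
To illustrate why embeddings are a prerequisite: similarity between documents is computed between their embedding vectors, so a document without one has nothing to compare. The following is a minimal, library-free sketch and not Haystack's implementation. It assumes cosine similarity (the text above does not say which measure the pipeline uses), and the names `cosine_similarity`, `most_similar`, and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=2):
    """Rank all other documents by similarity to the query document.

    `embeddings` maps a document id to its embedding vector; it is a
    hypothetical stand-in for embeddings stored in a document store.
    """
    query_vec = embeddings[query_id]
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in embeddings.items()
        if doc_id != query_id  # skip the query document itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, made-up vectors purely for illustration.
toy_embeddings = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.0, 1.0],
}
ranking = most_similar("doc_a", toy_embeddings)
print(ranking)
```

Here `doc_b` ranks first because its vector points in nearly the same direction as `doc_a`, while `doc_c` is orthogonal to it.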

``` python
from haystack.pipeline import MostSimilarDocumentsPipeline

msd_pipeline = MostSimilarDocumentsPipeline(document_store)
result = msd_pipeline.run(document_ids=[doc_id1, doc_id2, ...])
print(result)
```

Output:

``` python
[[
{'text': "Southern California's economy is diver...",
'score': 0.8605178832348279,
'question': None,
'meta': {'name': 'Southern_California'},
'embedding': ...,
'id': '6e26b1b78c48efc6dd6c888e72d0970b'},
...
]]
```
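
Judging from the output above, the result appears to be a list with one inner list per entry in `document_ids`, each holding the matching documents as dictionaries. Below is a small sketch for reading it under that assumption. `summarize_results` is a hypothetical helper, and the sample reuses only the fields shown above (the truncated text is kept as-is and the embedding is omitted).

```python
def summarize_results(result):
    """Return (query_index, name, score) tuples for every returned document.

    Assumes `result` is a list of lists of dicts with 'score' and 'meta'
    keys, as in the output shown above.
    """
    rows = []
    for query_index, similar_docs in enumerate(result):
        for doc in similar_docs:
            # 'meta' or 'name' may be absent; fall back to None.
            name = doc.get("meta", {}).get("name")
            rows.append((query_index, name, doc["score"]))
    return rows


# Sample built from the fields in the documented output.
sample_result = [[
    {
        "text": "Southern California's economy is diver...",
        "score": 0.8605178832348279,
        "question": None,
        "meta": {"name": "Southern_California"},
        "id": "6e26b1b78c48efc6dd6c888e72d0970b",
    },
]]

rows = summarize_results(sample_result)
for query_index, name, score in rows:
    print(f"query {query_index}: {name} (score={score:.3f})")
```

If the pipeline returns document objects rather than dictionaries, the lookups would need attribute access instead.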