Usage files and links to components and guides
brandenchan committed Sep 15, 2021
1 parent 28137bc commit 96f090a
Showing 18 changed files with 39 additions and 71 deletions.
2 changes: 1 addition & 1 deletion docs/latest/components/document_store.mdx
@@ -116,7 +116,7 @@ See API documentation for more info.

DocumentStores expect Documents in dictionary form, like the example below.
They are loaded using the `DocumentStore.write_documents()` method.
-See [Preprocessing](/usage/preprocessing) for more information on the cleaning and splitting steps that will help you maximize Haystack's performance.
+See [Preprocessing](/components/preprocessing) for more information on the cleaning and splitting steps that will help you maximize Haystack's performance.

[//]: # "Add link to preprocessing section"

File renamed without changes.
File renamed without changes.
@@ -222,7 +222,7 @@ NewNode = CustomNode()

You can add decision nodes where only one "branch" is executed afterwards.
This allows you, for example, to classify an incoming query and route it to different modules depending on the result.
-To find a ready-made example of a decision node, have a look at [the page](/usage/query-classifier) about the `QueryClassifier`.
+To find a ready-made example of a decision node, have a look at [the page](/components/query-classifier) about the `QueryClassifier`.

![image](https://user-images.githubusercontent.com/1563902/102452199-41229b80-403a-11eb-9365-7038697e7c3e.png)
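
A minimal sketch of such a node, assuming the pre-1.0 convention that a node declares its `outgoing_edges` and that its `run()` method returns a tuple of output plus the name of the edge to follow:

```python
class QueryTypeClassifier:
    outgoing_edges = 2

    def run(self, query):
        # Route natural-language questions one way, keyword queries the other
        if "?" in query:
            return {"query": query}, "output_1"
        return {"query": query}, "output_2"

# Hypothetical wiring: each downstream node subscribes to one branch
# pipeline.add_node(component=QueryTypeClassifier(), name="QueryClassifier", inputs=["Query"])
# pipeline.add_node(component=dpr_retriever, name="DPRRetriever", inputs=["QueryClassifier.output_1"])
# pipeline.add_node(component=es_retriever, name="ESRetriever", inputs=["QueryClassifier.output_2"])
```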

@@ -260,7 +260,7 @@ To get hands on with this kind of node, have a look at the [evaluation tutorial]

## Ready-Made Pipelines

-Last but not least, we added some ready-made pipelines that allow you to run standard patterns with very few lines of code. See the [ready-made pipelines page](/usage/ready-made-pipelines) and [pipelines API documentation](/reference/pipelines) to learn more about these.
+Last but not least, we added some ready-made pipelines that allow you to run standard patterns with very few lines of code. See the [ready-made pipelines page](/components/ready-made-pipelines) and [pipelines API documentation](/reference/pipelines) to learn more about these.

**Examples:**

@@ -114,7 +114,7 @@ it is recommended that they are further processed in order to ensure optimal Ret
The `PreProcessor` takes a document created by the converter as input,
performs various cleaning steps, and splits it into multiple smaller documents.

-For suggestions on how best to split your documents, see [Optimization](/usage/optimization).
+For suggestions on how best to split your documents, see [Optimization](/guides/optimization).

```python
from haystack.preprocessor import PreProcessor
```
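
The rest of this code block is collapsed in the diff view. A typical configuration might look like the following sketch (parameter names as documented for this era of Haystack; defaults may differ between versions):

```python
preprocessor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=200,
    split_respect_sentence_boundary=True,
)
docs = preprocessor.process(converted_doc)  # one converter document in, several smaller documents out
```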
File renamed without changes.
@@ -44,7 +44,7 @@ In Haystack, there are 2 pipeline configurations that are already encapsulated i
- `QuestionGenerationPipeline`
- `QuestionAnswerGenerationPipeline`

-Have a look at our [ready-made pipelines page](/usage/ready-made-pipelines) to learn more about them.
+Have a look at our [ready-made pipelines page](/components/ready-made-pipelines) to learn more about them.
Check out the question generation [tutorial](/tutorials/question-generation) to start using them.
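
As a rough sketch of how the first of these is wired up (import paths changed across Haystack releases, so treat them as assumptions):

```python
from haystack.question_generator import QuestionGenerator
from haystack.pipeline import QuestionGenerationPipeline

question_generator = QuestionGenerator()
pipeline = QuestionGenerationPipeline(question_generator=question_generator)
result = pipeline.run(documents=[document])  # generates questions for each input document
```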

## Use Case: Auto-Suggested Questions
@@ -1,7 +1,7 @@
# Ranker

There are pure "semantic document search" use cases that need only document ranking, not question answering functionality.
-While the [Retriever](/usage/retriever) is a perfect fit for document retrieval, we can further improve its results with the Ranker.
+While the [Retriever](/components/retriever) is a perfect fit for document retrieval, we can further improve its results with the Ranker.
For example, BM25 (a sparse retriever) considers only the keywords of the documents and the query, not their semantics.
The Ranker can re-rank the results of the retriever step by taking semantics into account.
Similar to the Reader, it is based on the latest language models.
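
For instance, a Ranker can re-order the documents a Retriever has already fetched; a sketch (the cross-encoder model name is illustrative, not prescribed by this page):

```python
from haystack.ranker import SentenceTransformersRanker

ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")
reranked_docs = ranker.predict(query=query, documents=retrieved_docs, top_k=5)
```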
File renamed without changes.
@@ -11,8 +11,8 @@ Haystack Pipelines chain together various Haystack components to build a search
## ExtractiveQAPipeline

Extractive QA is the task of searching through a large collection of documents for a span of text that answers a question. The `ExtractiveQAPipeline` combines the Retriever and the Reader such that:
-- The [Retriever](/usage/retriever) combs through a database and returns only the documents that it deems to be the most relevant to the query.
-- The [Reader](/usage/reader) accepts the documents returned by the Retriever and selects a text span as the answer to the query.
+- The [Retriever](/components/retriever) combs through a database and returns only the documents that it deems to be the most relevant to the query.
+- The [Reader](/components/reader) accepts the documents returned by the Retriever and selects a text span as the answer to the query.

Note that outside of a pipeline, Readers and Retrievers accept a `top_k` parameter, which specifies the number of documents that they should return per query. However, inside the pipeline, these parameters are renamed to `top_k_retriever` and `top_k_reader`.
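
Put together, querying the pipeline with the renamed parameters might look like this sketch (the query string is illustrative):

```python
from haystack.pipeline import ExtractiveQAPipeline

pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)
result = pipeline.run(
    query="Who is the father of Arya Stark?",
    top_k_retriever=10,  # documents the Retriever passes on
    top_k_reader=5,      # answers the Reader returns
)
result["answers"]
```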

@@ -42,7 +42,7 @@ For more examples that showcase `ExtractiveQAPipeline`, check out one of our tut

We typically pass the output of the Retriever to another component such as the Reader or the Generator. However, we can use the Retriever by itself for semantic document search to find the documents most relevant to our query.

-`DocumentSearchPipeline` wraps the [Retriever](/usage/retriever) into a pipeline. Note that this wrapper does not endow the Retrievers with additional functionality but instead allows them to be used consistently with other Haystack Pipeline objects and with the same familiar syntax. Creating this pipeline is as simple as passing the Retriever into the pipeline’s constructor:
+`DocumentSearchPipeline` wraps the [Retriever](/components/retriever) into a pipeline. Note that this wrapper does not endow the Retrievers with additional functionality but instead allows them to be used consistently with other Haystack Pipeline objects and with the same familiar syntax. Creating this pipeline is as simple as passing the Retriever into the pipeline’s constructor:

```python
pipeline = DocumentSearchPipeline(retriever=retriever)
```

@@ -65,7 +65,7 @@ result["documents"]

Unlike extractive QA, which produces an answer by extracting a text span from a collection of passages, generative QA works by producing free text answers that need not correspond to a span of any document. Because the answers are not constrained by text spans, the Generator is able to create answers that are more appropriately worded compared to those extracted by the Reader. Therefore, it makes sense to employ a generative QA system if you expect answers to extend over multiple text spans, or if you expect answers not to be contained verbatim in the documents.

-`GenerativeQAPipeline` combines the [Retriever](/usage/retriever) with the [Generator](/usage/generator). To create an answer, the Generator uses the internal factual knowledge stored in the language model’s parameters in addition to the external knowledge provided by the Retriever’s output.
+`GenerativeQAPipeline` combines the [Retriever](/components/retriever) with the [Generator](/components/generator). To create an answer, the Generator uses the internal factual knowledge stored in the language model’s parameters in addition to the external knowledge provided by the Retriever’s output.

You can build a `GenerativeQAPipeline` by simply placing the individual components inside the pipeline’s constructor:
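
The code block that follows this sentence is collapsed in the diff view; judging from the surrounding text, it presumably resembles this sketch:

```python
pipeline = GenerativeQAPipeline(generator=generator, retriever=retriever)
result = pipeline.run(query="Why did the sea turtle cross the beach?")  # illustrative query
```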

@@ -90,7 +90,7 @@ For more examples on using `GenerativeQAPipeline`, check out our tutorials where

Summarizer helps make sense of the Retriever’s output by creating a summary of the retrieved documents. This is useful for performing a quick sanity check and confirming the quality of candidate documents suggested by the Retriever, without having to inspect each document individually.

-`SearchSummarizationPipeline` combines the [Retriever](/usage/retriever) with the [Summarizer](/usage/summarizer). Below is an example of an implementation.
+`SearchSummarizationPipeline` combines the [Retriever](/components/retriever) with the [Summarizer](/components/summarizer). Below is an example of an implementation.

```python
pipeline = SearchSummarizationPipeline(summarizer=summarizer, retriever=retriever)
```

@@ -111,7 +111,7 @@ result['documents']

Translator components bring the power of machine translation into your QA systems. Say your knowledge base is in English but the majority of your user base speaks German. With a `TranslationWrapperPipeline`, you can chain together the following (see the sketch after the list):

-- The [Translator](/usage/translator), which translates a query from the source language into a target language (e.g. German into English)
+- The [Translator](/components/translator), which translates a query from the source language into a target language (e.g. German into English)
- A search pipeline such as ExtractiveQAPipeline or DocumentSearchPipeline, which executes the translated query against a knowledge base.
- Another Translator that translates the search pipeline's results from the target back into the source language (e.g. English into German)
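
A sketch of that chain (the Helsinki-NLP model names are examples, not recommendations from this page):

```python
from haystack.translator import TransformersTranslator
from haystack.pipeline import TranslationWrapperPipeline

in_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-de-en")   # German query -> English
out_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-de")  # English results -> German
pipeline = TranslationWrapperPipeline(
    input_translator=in_translator,
    output_translator=out_translator,
    pipeline=qa_pipeline,  # e.g. an ExtractiveQAPipeline over English documents
)
```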

@@ -140,7 +140,7 @@ result["answers"]

## FAQPipeline

-FAQPipeline wraps the [Retriever](/usage/retriever) into a pipeline and allows it to be used for question answering with FAQ data. Compared to other types of question answering, FAQ-style QA is significantly faster. However, it’s only able to answer FAQ-type questions because this type of QA matches queries against questions that already exist in your FAQ documents.
+FAQPipeline wraps the [Retriever](/components/retriever) into a pipeline and allows it to be used for question answering with FAQ data. Compared to other types of question answering, FAQ-style QA is significantly faster. However, it’s only able to answer FAQ-type questions because this type of QA matches queries against questions that already exist in your FAQ documents.

For this task, we recommend using the Embedding Retriever with a sentence similarity model such as `sentence-transformers/all-MiniLM-L6-v2`. Here’s an example of an FAQPipeline in action:
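
The example itself is collapsed in the diff; a minimal sketch (the query string is illustrative):

```python
from haystack.pipeline import FAQPipeline

pipeline = FAQPipeline(retriever=retriever)  # the Retriever matches queries against stored FAQ questions
result = pipeline.run(query="How do I reset my password?")
result["answers"]
```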

@@ -26,7 +26,7 @@ Here are the combinations which are supported:
| Embedding | Y | Y | N | Y | Y |
| DPR | Y | Y | N | Y | Y |

-See [Optimization](/usage/optimization) for suggestions on how to choose top-k values.
+See [Optimization](/guides/optimization) for suggestions on how to choose top-k values.

<div style={{ marginBottom: "3rem" }} />

File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/latest/guides/chatbots.mdx
@@ -21,7 +21,7 @@ When it receives the result, it will base its next message to the user on the co

## Setting up Haystack with REST API

-To set up a Haystack instance with REST API, have a look at [this documentation page](/usage/rest-api).
+To set up a Haystack instance with REST API, have a look at [this documentation page](/guides/rest-api).
By default, the API server runs on `http://127.0.0.1:8000`.

<div className="max-w-xl bg-yellow-light-theme border-l-8 border-yellow-dark-theme px-6 pt-6 pb-4 my-4 rounded-md dark:bg-yellow-900">
4 changes: 2 additions & 2 deletions docs/latest/guides/optimization.mdx
@@ -86,7 +86,7 @@ retrieved_docs = retriever.retrieve(top_k=10)

## Metadata Filtering

-Metadata can be attached to the documents which you index into your DocumentStore (see the input data format [here](/usage/retriever)).
+Metadata can be attached to the documents which you index into your DocumentStore (see the input data format [here](/components/retriever)).
At query time, you can apply filters based on this metadata to limit the scope of your search and ensure your answers
come from a specific slice of your data.

@@ -96,7 +96,7 @@ This can reduce the work load of the retriever and also ensure that you get more

Filters are applied via the `filters` argument of the `Retriever` class. In practice, this argument will probably
be passed into the `Pipeline.run()` call, which will then route it on to the `Retriever` class
-(see the Arguments section on the [Pipelines page](/usage/pipelines) for an explanation).
+(see the Arguments section on the [Pipelines page](/components/pipelines) for an explanation).

```python
pipeline.run(
    # ... (arguments collapsed in the diff view)
)
```
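
A fuller sketch of the same call, using an illustrative metadata field (`"years"` is hypothetical and depends on what you attached to your documents):

```python
result = pipeline.run(
    query="What did the annual report conclude?",
    filters={"years": ["2020", "2021"]},  # restrict retrieval to documents whose metadata matches
)
```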
4 changes: 2 additions & 2 deletions docs/latest/guides/rest_api.mdx
@@ -10,13 +10,13 @@ The diagram below illustrates how the Haystack REST API is structured:

## Background: Haystack Pipelines

-The Haystack [Pipeline](/usage/pipelines) is at the core of Haystack’s QA functionality, whether Haystack is used directly through the Python bindings or through a REST API.
+The Haystack [Pipeline](/components/pipelines) is at the core of Haystack’s QA functionality, whether Haystack is used directly through the Python bindings or through a REST API.

A pipeline is defined as a sequence of components where each component performs a dedicated function, e.g., retrieving documents from a document store or extracting an answer to a query from a text document. A pipeline’s components are interconnected through inputs and outputs.

The Haystack REST API exposes an HTTP interface for interacting with a pipeline. For instance, you can use the REST API to send a query, submitted in the body of an HTTP request, to the Haystack Pipeline. Haystack will then process the request and return the answer to the query in an HTTP response.
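
A query round-trip might look like this sketch (the endpoint name and payload shape depend on your Haystack version and `pipelines.yaml`, so treat them as assumptions):

```python
import requests

response = requests.post(
    "http://127.0.0.1:8000/query",  # hypothetical endpoint exposed by the REST API
    json={"query": "Who is the father of Arya Stark?"},
)
print(response.json())
```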

-When running Haystack as a REST API, you'll need to define the pipeline you'll be using in the API as a YAML file. Check out the [Pipelines YAML doc](/usage/pipelines) for more details.
+When running Haystack as a REST API, you'll need to define the pipeline you'll be using in the API as a YAML file. Check out the [Pipelines YAML doc](/components/pipelines) for more details.

The example Haystack pipeline that we’ll be using below is defined in the [rest_api/pipeline/pipelines.yaml](https://github.com/deepset-ai/haystack/blob/master/rest_api/pipeline/pipelines.yaml) file.

70 changes: 19 additions & 51 deletions docs/latest/menu.json
@@ -18,62 +18,30 @@
       "subMenuTitle": "Components",
       "pathPrefix": "/components/",
       "items": [
-        {
-          "slug": "document-store",
-          "title": "DocumentStore"
-        }
-      ]
+        {"slug": "preprocessing", "title": "Preprocessing"},
+        {"slug": "pipelines", "title": "Pipelines"},
+        {"slug": "ready-made-pipelines", "title": "Ready-Made Pipelines"},
+        {"slug": "document-store", "title": "DocumentStore"},
+        {"slug": "retriever", "title": "Retriever"},
+        {"slug": "reader", "title": "Reader"},
+        {"slug": "generator", "title": "Generator" },
+        {"slug": "summarizer", "title": "Summarizer"},
+        {"slug": "translator", "title": "Translator"},
+        {"slug": "knowledge-graph", "title": "Knowledge Graph"},
+        {"slug": "ranker", "title": "Ranker"},
+        {"slug": "query-classifier", "title": "Query Classifier"},
+        {"slug": "question-generator", "title": "Question Generator"} ]
     },
     {
       "subMenuTitle": "Guides",
       "pathPrefix": "/guides/",
       "items": [
-        {
-          "slug": "preprocessing",
-          "title": "Preprocessing"
-        },
-        { "slug": "pipelines", "title": "Pipelines" },
-        { "slug": "ready-made-pipelines", "title": "Ready-Made Pipelines" },
-        {
-          "slug": "document-store",
-          "title": "DocumentStore"
-        },
-        { "slug": "retriever", "title": "Retriever" },
-        { "slug": "reader", "title": "Reader" },
-        { "slug": "generator", "title": "Generator" },
-        {
-          "slug": "summarizer",
-          "title": "Summarizer"
-        },
-        {
-          "slug": "translator",
-          "title": "Translator"
-        },
-        {
-          "slug": "knowledge-graph",
-          "title": "Knowledge Graph"
-        },
-        {
-          "slug": "languages",
-          "title": "Languages Other Than English"
-        },
-        {
-          "slug": "domain-adaptation",
-          "title": "Domain Adaptation"
-        },
-        {
-          "slug": "optimization",
-          "title": "Optimization"
-        },
-        {
-          "slug": "annotation",
-          "title": "Annotation Tool"
-        },
-        { "slug": "ranker", "title": "Ranker" },
-        { "slug": "query-classifier", "title": "Query Classifier" },
-        { "slug": "rest-api", "title": "REST API" },
-        { "slug": "chatbots", "title": "Chatbot Integration" },
-        { "slug": "question-generator", "title": "Question Generator" }
+        {"slug": "languages", "title": "Languages Other Than English"},
+        {"slug": "domain-adaptation","title": "Domain Adaptation"},
+        {"slug": "optimization", "title": "Optimization"},
+        {"slug": "annotation", "title": "Annotation Tool"},
+        {"slug": "rest-api", "title": "REST API"},
+        {"slug": "chatbots", "title": "Chatbot Integration"}
       ]
     },
     {
2 changes: 1 addition & 1 deletion docs/latest/overview/get_started.mdx
@@ -196,4 +196,4 @@ They include:
- Translators

These can all be combined in whatever configuration you want.
-Have a look at our [Pipelines page](/usage/pipelines) to see what's possible!
+Have a look at our [Pipelines page](/components/pipelines) to see what's possible!
