Commit 46ceb1a

fixing links and toc scroll (deepset-ai#130)
* fixing links and toc scroll

* limiting the menu on the right to hot fix

* limiting the menu on the right to hot fix
PiffPaffM committed Aug 30, 2021
1 parent bb75a76 commit 46ceb1a
Showing 13 changed files with 32 additions and 28 deletions.
4 changes: 4 additions & 0 deletions components/Toc.tsx
@@ -52,8 +52,10 @@ export default function Toc({ stars, editOnGitHubLink, toc }: TocProps) {
          </a>
        </Link>
      )}
      {toc && (
        <ul className="border-l-4 pl-4">
          {toc?.map((c) => (
            c.level < 3 && (
              <li
                key={c.text}
                className={`mb-3 text-gray-400 hover:text-gray-700 ${
@@ -62,8 +64,10 @@ export default function Toc({ stars, editOnGitHubLink, toc }: TocProps) {
              >
                <a href={`#${c.link}`}>{c.text}</a>
              </li>
            )
          ))}
        </ul>
      )}
    </div>
  );
}
2 changes: 1 addition & 1 deletion docs/v0.5.0/usage/document_store.mdx
@@ -52,7 +52,7 @@ See API documentation for more info.

DocumentStores expect Documents in dictionary form, like that below.
They are loaded using the `DocumentStore.write_documents()` method.
See [Preprocessing](/docs/v0.5.0/preprocessingmd) for more information on how to best prepare your data.
See [Preprocessing](/usage/v0.5.0/preprocessing) for more information on how to best prepare your data.

[//]: # "Add link to preprocessing section"

2 changes: 1 addition & 1 deletion docs/v0.5.0/usage/languages.mdx
@@ -30,7 +30,7 @@ The default model that is loaded in the DensePassageRetriever is for English.
We are currently working on training a German DensePassageRetriever model and know other teams who work on further languages.
If you have a language model and a question answering dataset in your own language, you can also train a DPR model using Haystack!
Below is a simplified example.
See the [API reference](/docs/v0.5.0/apiretrievermd#train) for `DensePassageRetriever.train()` for more details.
See the [API reference](/reference/v0.5.0/retriever#train) for `DensePassageRetriever.train()` for more details.

```python
dense_passage_retriever.train(self,
2 changes: 1 addition & 1 deletion docs/v0.5.0/usage/preprocessing.mdx
@@ -28,7 +28,7 @@ docs = [
There is a range of different file converters in Haystack that
can extract text from files and cast them into the unified dictionary format shown above.
Haystack features support for txt, pdf and docx files and there is even a converter that leverages Apache Tika.
Please refer to [the API docs](/docs/v0.5.0/file_convertersmd) to see which converter best suits you.
Please refer to [the API docs](/reference/v0.5.0/file-converters) to see which converter best suits you.

<Tabs
options={[
14 changes: 7 additions & 7 deletions docs/v0.9.0/overview/faq.mdx
@@ -32,9 +32,9 @@ You can reduce the work load on the Reader by instructing the Retriever to pass
This is done by setting the `top_k_retriever` parameter to a lower value.

Making sure that your documents are shorter can also increase the speed of your system. You can split
your documents into smaller chunks by using the `PreProcessor` (see [tutorial](https://haystack.deepset.ai/docs/v0.9.0/tutorial11md)).
your documents into smaller chunks by using the `PreProcessor` (see [tutorial](https://haystack.deepset.ai/tutorials/pipelines)).
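
For illustration, a minimal sketch of both suggestions above. It assumes an existing `ExtractiveQAPipeline` named `pipe` and a list of document dictionaries named `dicts`; both are placeholders for your own objects.

```python
from haystack.preprocessor import PreProcessor

# 1) Pass fewer documents from the Retriever to the Reader.
prediction = pipe.run(
    query="How do I make my pipeline faster?",
    top_k_retriever=5,   # fewer candidate documents for the Reader
    top_k_reader=3,
)

# 2) Split long documents into smaller chunks before indexing.
processor = PreProcessor(
    split_by="word",
    split_length=200,
    split_respect_sentence_boundary=True,
)
smaller_docs = processor.process(dicts)  # `dicts` is assumed to exist
```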

For more optimization suggestions, have a look at our [optimization page](https://haystack.deepset.ai/docs/v0.9.0/optimizationmd)
For more optimization suggestions, have a look at our [optimization page](https://haystack.deepset.ai/usage/optimization)
and also our [blogs](https://medium.com/deepset-ai)

<div style={{ marginBottom: "3rem" }} />
@@ -43,16 +43,16 @@ and also our [blogs](https://medium.com/deepset-ai)

The components in Haystack, such as the `Retriever` or the `Reader`, are designed in a language-agnostic way. However, you may
have to set certain parameters or load models pretrained for your language in order to get good performance out of Haystack.
See our [languages page](https://haystack.deepset.ai/docs/v0.9.0/languagesmd) for more details.
See our [languages page](https://haystack.deepset.ai/usage/languages) for more details.

<div style={{ marginBottom: "3rem" }} />

## How can I add metadata to my documents so that I can apply filters?

When providing your documents in the input format (see [here](https://haystack.deepset.ai/docs/v0.9.0/documentstoremd#Input-Format))
When providing your documents in the input format (see [here](https://haystack.deepset.ai/usage/document-store#input-format))
you can provide metadata information as a dictionary under the `meta` key. At query time, you can provide a `filters` argument
(most likely through `Pipelines.run()`) that specifies the accepted values for a certain metadata field
(for an example of what a `filters` dictionary might look like, please refer to [this example](https://haystack.deepset.ai/docs/v0.9.0/apiretrievermd#__init__))
(for an example of what a `filters` dictionary might look like, please refer to [this example](https://haystack.deepset.ai/reference/retriever#__init__))
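
To make this concrete, a small sketch of the round trip; the `year`/`company` fields and the `document_store`/`pipe` objects are illustrative assumptions:

```python
# Metadata goes in under the "meta" key when the documents are written ...
document_store.write_documents([
    {"text": "Revenue grew by 20 percent in 2019.",
     "meta": {"year": "2019", "company": "ACME"}},
])

# ... and a `filters` dictionary of field -> accepted values restricts the search.
prediction = pipe.run(
    query="Why did revenue increase?",
    filters={"year": ["2019"], "company": ["ACME"]},
    top_k_retriever=10,
)
```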

<div style={{ marginBottom: "3rem" }} />

@@ -77,7 +77,7 @@ The confidence scores are in the range of 0 and 1 and reflect how confident the
Having a confidence score is particularly useful in cases where you need Haystack to work with a certain accuracy threshold.
Many of our users have built systems where predictions below a certain confidence value are routed on to a fallback system.

For more information on model confidence and how to tune it, please refer to [this section](https://haystack.deepset.ai/docs/v0.9.0/readermd#Confidence-Scores).
For more information on model confidence and how to tune it, please refer to [this section](https://haystack.deepset.ai/usage/reader#confidence-scores).
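
A sketch of that fallback pattern, assuming an `ExtractiveQAPipeline` called `pipe`; the `probability` field name, the 0.75 threshold, and `ask_fallback_system` are illustrative only:

```python
query = "What is the warranty period?"
prediction = pipe.run(query=query, top_k_retriever=10, top_k_reader=1)
best = prediction["answers"][0]

# Route low-confidence predictions to a fallback system.
if best.get("probability", 0.0) < 0.75:
    answer = ask_fallback_system(query)   # hypothetical fallback hook
else:
    answer = best["answer"]
```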

<div style={{ marginBottom: "3rem" }} />

@@ -95,5 +95,5 @@ Note that this also applies at evaluation where labels are written into their own
In short, the FARMReader uses a QA pipeline implementation that comes from our own
[FARM framework](https://github.com/deepset-ai/FARM) that we can more easily update and also optimize for performance.
By contrast, the TransformersReader uses a QA pipeline implementation that comes from HuggingFace's [Transformers](https://github.com/huggingface/transformers).
See [this section](https://haystack.deepset.ai/docs/v0.9.0/readermd#Deeper-Dive-FARM-vs-Transformers)
See [this section](https://haystack.deepset.ai/usage/reader#deeper-dive-farm-vs-transformers)
for more details about their differences!
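
For reference, the two readers play the same role in a pipeline; a rough sketch follows (model names are examples, and import paths may differ slightly between 0.x releases):

```python
from haystack.reader import FARMReader, TransformersReader

# FARM-based reader: the implementation we maintain and tune ourselves.
farm_reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=False)

# Transformers-based reader: wraps HuggingFace's QA pipeline.
transformers_reader = TransformersReader(
    model_name_or_path="deepset/roberta-base-squad2",
    use_gpu=-1,  # -1 means CPU for this class
)
```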
2 changes: 1 addition & 1 deletion docs/v0.9.0/usage/document_store.mdx
@@ -116,7 +116,7 @@ See API documentation for more info.

DocumentStores expect Documents in dictionary form, like that below.
They are loaded using the `DocumentStore.write_documents()` method.
See [Preprocessing](/docs/v0.9.0/preprocessingmd) for more information on the cleaning and splitting steps that will help you maximize Haystack's performance.
See [Preprocessing](/usage/preprocessing) for more information on the cleaning and splitting steps that will help you maximize Haystack's performance.
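
A minimal sketch of that dictionary format and the write call, assuming an in-memory store (import paths differ between 0.x versions):

```python
from haystack.document_store.memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()

dicts = [
    {
        "text": "DocumentStores are the databases that Retrievers query at search time.",
        "meta": {"name": "document_store.mdx", "topic": "usage"},  # optional metadata
    },
]

document_store.write_documents(dicts)
```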

[//]: # "Add link to preprocessing section"

4 changes: 2 additions & 2 deletions docs/v0.9.0/usage/generator.mdx
@@ -12,8 +12,8 @@ retriever and generator can be trained concurrently from the one loss signal.

**Tutorial**

Check out our tutorial notebooks for a guide on how to build your own generative QA system with RAG ([here](/docs/v0.9.0/tutorial7md))
or with LFQA ([here](/docs/v0.9.0/tutorial12md)).
Check out our tutorial notebooks for a guide on how to build your own generative QA system with RAG ([here](/tutorials/retrieval-augmented-generation))
or with LFQA ([here](/tutorials/pipelines)).

</div>
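
A rough sketch of how such a generative pipeline is wired together in this version of Haystack; the `retriever` is assumed to be an existing DPR retriever, and the model name is just an example:

```python
from haystack.generator.transformers import RAGenerator
from haystack.pipeline import GenerativeQAPipeline

# Generator that conditions its answers on the retrieved documents.
generator = RAGenerator(model_name_or_path="facebook/rag-token-nq")
pipe = GenerativeQAPipeline(generator=generator, retriever=retriever)

result = pipe.run(query="Why does the sun appear yellow?", top_k_retriever=5)
```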

2 changes: 1 addition & 1 deletion docs/v0.9.0/usage/languages.mdx
@@ -49,7 +49,7 @@ The default model that is loaded in the DensePassageRetriever is for English.
We have created a [German DensePassageRetriever model](https://deepset.ai/germanquad) and know other teams who work on further languages.
If you have a language model and a question answering dataset in your own language, you can also train a DPR model using Haystack!
Below is a simplified example.
See [our tutorial](/docs/v0.9.0/tutorial9md) and also the [API reference](/docs/v0.9.0/apiretrievermd#train) for `DensePassageRetriever.train()` for more details.
See [our tutorial](/tutorials/train-dpr) and also the [API reference](/reference/retriever#train) for `DensePassageRetriever.train()` for more details.

```python
from haystack.retriever import DensePassageRetriever
4 changes: 2 additions & 2 deletions docs/v0.9.0/usage/optimization.mdx
@@ -86,7 +86,7 @@ retrieved_docs = retriever.retrieve(top_k=10)

## Metadata Filtering

Metadata can be attached to the documents which you index into your DocumentStore (see the input data format [here](/docs/v0.9.0/retrievermd)).
Metadata can be attached to the documents which you index into your DocumentStore (see the input data format [here](/usage/retriever)).
At query time, you can apply filters based on this metadata to limit the scope of your search and ensure your answers
come from a specific slice of your data.

@@ -96,7 +96,7 @@ This can reduce the work load of the retriever and also ensure that you get more

Filters are applied via the `filters` argument of the `Retriever` class. In practice, this argument will probably
be passed into the `Pipeline.run()` call, which will then route it on to the `Retriever` class
(see the Arguments on the [Pipelines page](/docs/v0.9.0/pipelinesmd) for an explanation).
(see the Arguments on the [Pipelines page](/usage/pipelines) for an explanation).

```python
pipeline.run(
6 changes: 3 additions & 3 deletions docs/v0.9.0/usage/pipelines.mdx
@@ -88,7 +88,7 @@ To load, simply call:
pipeline.load_from_yaml(Path("sample.yaml"))
```

For another example YAML config, check out [this file](https://github.com/deepset-ai/haystack/blob/master/rest_api/pipelines.yaml).
For another example YAML config, check out [this file](https://github.com/deepset-ai/haystack/blob/master/rest_api/pipeline/pipelines.yaml).

<div style={{ marginBottom: "3rem" }} />

@@ -153,7 +153,7 @@ Or you can add decision nodes where only one "branch" is executed afterwards. Th
### Evaluation nodes

There are nodes in Haystack that are used to evaluate the performance of readers, retrievers and combined systems.
To get hands on with this kind of node, have a look at the [evaluation tutorial](/docs/v0.9.0/tutorial5md).
To get hands on with this kind of node, have a look at the [evaluation tutorial](/tutorials/evaluation).

<div style={{ marginBottom: "3rem" }} />

@@ -183,6 +183,6 @@ res = doc_pipe.run(query="How can I change my address?", top_k_retriever=3)

```

See also the [Pipelines API documentation](/docs/v0.9.0/apipipelinesmd) for more details.
See also the [Pipelines API documentation](/reference/pipelines) for more details.

We plan many more features around the new pipelines, including parallelized execution, distributed execution, and dry runs - so stay tuned ...
10 changes: 5 additions & 5 deletions docs/v0.9.0/usage/preprocessing.mdx
@@ -6,7 +6,7 @@ Haystack includes a suite of tools to:
- normalize white space
- split text into smaller pieces to optimize retrieval

Check out our [preprocessing tutorial](/docs/v0.9.0/tutorial8md) if you'd like to start working with code examples already!
Check out our [preprocessing tutorial](/tutorials/train-dpr) if you'd like to start working with code examples already!

These data preprocessing steps can have a big impact on the system's performance
and effective handling of data is key to getting the most out of Haystack.
@@ -30,7 +30,7 @@ docs = [
There is a range of different file converters in Haystack that
can extract text from files and cast them into the unified dictionary format shown above.
Haystack features support for txt, pdf and docx files and there is even a converter that leverages Apache Tika.
Please refer to [the API docs](/docs/v0.9.0/file_convertersmd) to see which converter best suits you.
Please refer to [the API docs](/reference/file-converters) to see which converter best suits you.
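
For instance, converting a single PDF might look roughly like this (the file path is a placeholder, and the exact return format can vary between versions):

```python
from haystack.file_converter.pdf import PDFToTextConverter

converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=["en"])
doc = converter.convert(file_path="data/sample.pdf", meta={"name": "sample.pdf"})
# `doc` follows the same {"text": ..., "meta": ...} dictionary format shown above.
```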

<Tabs
options={[
@@ -86,13 +86,13 @@ Please refer to [the API docs](/docs/v0.9.0/file_convertersmd) to see which conv
## Web Crawler

In Haystack, you will find a web crawler that will help you scrape text from websites and save it to file.
See the [API documentation](https://haystack.deepset.ai/docs/v0.9.0/apicrawlermd) for more details.
See the [API documentation](https://haystack.deepset.ai/reference/crawler) for more details.

```python
from haystack.connector import Crawler

crawler = Crawler()
docs = crawler.crawl(urls=["https://haystack.deepset.ai/docs/v0.9.0/get_startedmd"],
docs = crawler.crawl(urls=["https://haystack.deepset.ai/overview/get-started"],
output_dir="crawled_files",
filter_urls= ["haystack\.deepset\.ai\/docs\/"])
```
@@ -106,7 +106,7 @@ it is recommended that they are further processed in order to ensure optimal Ret
The `PreProcessor` takes one of the documents created by the converter as input,
performs various cleaning steps, and splits it into multiple smaller documents.

For suggestions on how best to split your documents, see [Optimization](/docs/v0.9.0/optimizationmd)
For suggestions on how best to split your documents, see [Optimization](/usage/optimization)

```python
from haystack.preprocessor import PreProcessor
2 changes: 1 addition & 1 deletion docs/v0.9.0/usage/ranker.mdx
@@ -1,7 +1,7 @@
# Ranker

There are pure "semantic document search" use cases that do not need question answering functionality but only document ranking.
While the [Retriever](/docs/v0.9.0/retrievermd) is a perfect fit for document retrieval, we can further improve its results with the Ranker.
While the [Retriever](/usage/retriever) is a perfect fit for document retrieval, we can further improve its results with the Ranker.
For example, BM25 (sparse retriever) does not take into account semantics of the documents and the query but only their keywords.
The Ranker can re-rank the results of the retriever step by taking semantics into account.
Similar to the Reader, it is based on the latest language models.
6 changes: 3 additions & 3 deletions docs/v0.9.0/usage/retriever.mdx
@@ -26,7 +26,7 @@ Here are the combinations which are supported:
| Embedding | Y | Y | N | Y | Y |
| DPR | Y | Y | N | Y | Y |

See [Optimization](/docs/v0.9.0/optimizationmd) for suggestions on how to choose top-k values.
See [Optimization](/usage/optimization) for suggestions on how to choose top-k values.

<div style={{ marginBottom: "3rem" }} />

@@ -131,7 +131,7 @@ There are two design decisions that have made DPR particularly performant.
- Training with ‘In-batch negatives’ (gold labels are treated as negative examples for other samples in the same batch) is highly efficient

In Haystack, you can simply download the pretrained encoders needed to start using DPR.
If you’d like to learn how to set up a DPR based system, have a look at the [tutorial](/docs/v0.9.0/tutorial6md)!
If you’d like to learn how to set up a DPR based system, have a look at the [tutorial](/tutorials/dense-passage-retrieval)!
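
Loading the pretrained encoders is roughly as follows; the `document_store` is assumed to exist, and the model names are the public facebook DPR checkpoints:

```python
from haystack.retriever.dense import DensePassageRetriever

retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
    use_gpu=True,
)

# Compute and store embeddings for the documents already in the store.
document_store.update_embeddings(retriever)
```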

<div style={{ marginBottom: "3rem" }} />

@@ -165,7 +165,7 @@ finder = ExtractiveQAPipeline(reader, retriever)

<div className="max-w-xl bg-yellow-light-theme border-l-8 border-yellow-dark-theme px-6 pt-6 pb-4 my-4 rounded-md dark:bg-yellow-900">

**Training DPR:** Haystack supports training of your own DPR model! Check out the [tutorial](/docs/v0.9.0/tutorial9md) to see how this is done!
**Training DPR:** Haystack supports training of your own DPR model! Check out the [tutorial](/tutorials/train-dpr) to see how this is done!

</div>

