Tutorial update (#1166)
* Add header / footer

* Add Milvus example

* Generate md files

* Fix mypy CI
brandenchan committed Jun 11, 2021
1 parent 13edff1 commit 783893c
Showing 33 changed files with 687 additions and 84 deletions.
26 changes: 23 additions & 3 deletions docs/_src/tutorials/tutorials/1.md
@@ -42,8 +42,8 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```

@@ -71,8 +71,10 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock


```python
-# Recommended: Start Elasticsearch using Docker
-#! docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.9.2
+# Recommended: Start Elasticsearch using Docker via the Haystack utility function
+from haystack.utils import launch_es
+
+launch_es()
```
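
For readers curious about what starting Elasticsearch from Python roughly involves, here is a minimal sketch that wraps the `docker run` command from this tutorial in a small helper. It is not the implementation of `launch_es`; `build_es_command` and `start_es` are hypothetical names introduced for illustration, and the configurable host port is an assumption.

```python
import shlex
import subprocess
from typing import List

# Image tag taken from the docker command used in this tutorial.
ES_IMAGE = "elasticsearch:7.9.2"


def build_es_command(image: str = ES_IMAGE, host_port: int = 9200) -> List[str]:
    """Build the `docker run` argument list for a single-node Elasticsearch."""
    return [
        "docker", "run", "-d",
        "-p", f"{host_port}:9200",          # map a host port to the container's port 9200
        "-e", "discovery.type=single-node",  # single-node mode, as in the tutorial
        image,
    ]


def start_es(image: str = ES_IMAGE, host_port: int = 9200) -> bool:
    """Hypothetical helper: start the container, return True if docker exited cleanly."""
    try:
        completed = subprocess.run(
            build_es_command(image, host_port), capture_output=True, text=True
        )
    except FileNotFoundError:
        # The docker executable is not installed or not on PATH.
        return False
    return completed.returncode == 0


# Show the command that would be executed, without running it.
print(shlex.join(build_es_command()))
```

Calling `start_es()` would attempt to launch the container; the container still needs some time before Elasticsearch accepts connections.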


@@ -236,3 +238,21 @@ prediction = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=
```python
print_answers(prediction, details="minimal")
```

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)

18 changes: 18 additions & 0 deletions docs/_src/tutorials/tutorials/10.md
@@ -21,6 +21,7 @@ The training of models that translate text queries into SPARQL queries is curren
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
```

@@ -129,3 +130,20 @@ print(result)
# Paraphrased question: What is the patronus of Hermione?
# Correct answer: Otter
```

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
19 changes: 18 additions & 1 deletion docs/_src/tutorials/tutorials/11.md
@@ -37,8 +37,8 @@ These lines are to install Haystack through pip
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install --upgrade git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

# Install pygraphviz
!apt install libgraphviz-dev
@@ -392,3 +392,20 @@ pipeline.load_from_yaml(Path("sample.yaml"))

The possibilities are endless with the `Pipeline` class and we hope that this tutorial will inspire you
to build custom pipelines that really work for your use case!

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
20 changes: 19 additions & 1 deletion docs/_src/tutorials/tutorials/2.md
@@ -37,8 +37,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```


@@ -88,3 +89,20 @@ reader.save(directory="my_model")
# If you want to load it at a later point, just do:
new_reader = FARMReader(model_name_or_path="my_model")
```

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
20 changes: 19 additions & 1 deletion docs/_src/tutorials/tutorials/3.md
@@ -37,8 +37,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```


@@ -183,3 +184,20 @@ prediction = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=
```python
print_answers(prediction, details="minimal")
```

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
26 changes: 22 additions & 4 deletions docs/_src/tutorials/tutorials/4.md
@@ -45,8 +45,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```


@@ -66,8 +67,10 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock


```python
-# Recommended: Start Elasticsearch using Docker
-# ! docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.9.2
+# Recommended: Start Elasticsearch using Docker via the Haystack utility function
+from haystack.utils import launch_es
+
+launch_es()
```


@@ -154,6 +157,21 @@ pipe = FAQPipeline(retriever=retriever)
```python
prediction = pipe.run(query="How is the virus spreading?", top_k_retriever=10)
print_answers(prediction, details="all")
```

## About us

```
This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
42 changes: 27 additions & 15 deletions docs/_src/tutorials/tutorials/5.md
@@ -36,8 +36,9 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```


@@ -117,12 +118,6 @@ document_store.add_eval_data(

# Let's prepare the labels that we need for the retriever and the reader
labels = document_store.get_all_labels_aggregated(index=label_index)
-q_to_l_dict = {
-    l.question: {
-        "retriever": l,
-        "reader": l
-    } for l in labels
-}
```

## Initialize components of QA-System
@@ -157,12 +152,13 @@ reader = FARMReader("deepset/roberta-base-squad2", top_k_per_candidate=4, return

```


```python
from haystack.eval import EvalAnswers, EvalDocuments

# Here we initialize the nodes that perform evaluation
-eval_retriever = EvalRetriever()
-eval_reader = EvalReader()
+eval_retriever = EvalDocuments()
+eval_reader = EvalAnswers()
```

## Evaluation of Retriever
@@ -211,18 +207,18 @@ from haystack import Pipeline
# Here is the pipeline definition
p = Pipeline()
p.add_node(component=retriever, name="ESRetriever", inputs=["Query"])
-p.add_node(component=eval_retriever, name="EvalDocuments", inputs=["ESRetriever"])
-p.add_node(component=reader, name="QAReader", inputs=["EvalDocuments"])
-p.add_node(component=eval_reader, name="EvalAnswers", inputs=["QAReader"])
+p.add_node(component=eval_retriever, name="EvalRetriever", inputs=["ESRetriever"])
+p.add_node(component=reader, name="QAReader", inputs=["EvalRetriever"])
+p.add_node(component=eval_reader, name="EvalReader", inputs=["QAReader"])
results = []
```


```python
# This is how to run the pipeline
-for q, l in q_to_l_dict.items():
+for l in labels:
     res = p.run(
-        query=q,
+        query=l.question,
         top_k_retriever=10,
         labels=l,
         top_k_reader=10,
@@ -244,5 +240,21 @@ print()
reader.print_time()
print()
eval_reader.print(mode="pipeline")

```
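
To make concrete what kind of number a retriever evaluation node reports, here is a minimal, library-free sketch of retriever recall: the fraction of queries for which the gold document appears among the retrieved documents. It is an illustration only, not the implementation of `EvalDocuments`; the function name `retriever_recall` and the toy document ids are assumptions.

```python
from typing import Dict, List


def retriever_recall(retrieved: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of queries whose gold document id appears among the retrieved ids."""
    if not gold:
        return 0.0
    hits = sum(
        1 for query, doc_id in gold.items() if doc_id in retrieved.get(query, [])
    )
    return hits / len(gold)


# Toy data (hypothetical ids): only "q1" has its gold document retrieved.
retrieved = {"q1": ["d1", "d7"], "q2": ["d3"], "q3": []}
gold = {"q1": "d7", "q2": "d9", "q3": "d2"}

recall = retriever_recall(retrieved, gold)
print(f"Retriever recall: {recall:.2f}")
```

In a pipeline setting, the same counter would be updated once per query as each result passes through the evaluation node.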

## About us

This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)
40 changes: 37 additions & 3 deletions docs/_src/tutorials/tutorials/6.md
@@ -78,8 +78,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1
!pip install git+https://github.com/deepset-ai/haystack.git
!pip install urllib3==1.25.4

```


@@ -94,6 +95,8 @@ from haystack.utils import print_answers

### Document Store

#### Option 1: FAISS

FAISS is a library for efficient similarity search on a cluster of dense vectors.
The `FAISSDocumentStore` uses a SQL (SQLite in-memory by default) database under the hood
to store the document text and other meta data. The vector embeddings of the text are
@@ -104,11 +107,27 @@ For more info on which suits your use case: https://github.com/facebookresearch/


```python
-from haystack.document_store.faiss import FAISSDocumentStore
+from haystack.document_store import FAISSDocumentStore

document_store = FAISSDocumentStore(faiss_index_factory_str="Flat")
```
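
The `"Flat"` index type selects exhaustive search: the query vector is compared against every stored vector, which gives exact results at the cost of linear time per query. The sketch below illustrates that idea in plain Python. It is not FAISS code; `flat_search` is a hypothetical name, and scoring by inner product is an assumption made for the example.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def flat_search(
    query: Sequence[float], vectors: Sequence[Sequence[float]], top_k: int = 2
) -> List[Tuple[int, float]]:
    """Score the query against every stored vector and return the top_k (index, score) pairs."""
    scored = [(i, dot(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:top_k]


# Three toy 2-dimensional "embeddings" (made-up values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
hits = flat_search([1.0, 0.2], embeddings, top_k=2)
print(hits)
```

Approximate index types such as HNSW trade some of this exactness for much faster lookups on large collections.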

#### Option 2: Milvus

Milvus is an open-source vector database that, like FAISS, is optimized for vector similarity search.
Like FAISS, it offers both a "Flat" and an "HNSW" mode, but it outperforms FAISS when it comes to dynamic data management.
It does require a little more setup, however, because it runs in Docker and needs some config files to be set up.
See [their docs](https://milvus.io/docs/v1.0.0/milvus_docker-cpu.md) for more details.


```python
from haystack.utils import launch_milvus
from haystack.document_store import MilvusDocumentStore

launch_milvus()
document_store = MilvusDocumentStore()
```

### Cleaning & indexing documents

Similarly to the previous tutorials, we download, convert and index some Game of Thrones articles to our DocumentStore
@@ -202,6 +221,21 @@ prediction = pipe.run(query="Who created the Dothraki vocabulary?", top_k_retrie

```python
print_answers(prediction, details="minimal")
```

## About us

```
This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany

We bring NLP to the industry via open source!
Our focus: Industry specific language models & large scale QA systems.

Some of our other work:
- [German BERT](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](https://apply.workable.com/deepset/)