Tutorial update (#1166)

* Add header / footer
* Add Milvus example
* Generate md files
* Fix mypy CI
deepset-ai · Jun 11, 2021 · 783893c
1 parent 13edff1
commit 783893c

Showing 33 changed files with 687 additions and 84 deletions.
diff --git a/docs/_src/tutorials/tutorials/1.md b/docs/_src/tutorials/tutorials/1.md
@@ -42,8 +42,8 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
 
 ```
 
@@ -71,8 +71,10 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock
 
 
 ```python
-# Recommended: Start Elasticsearch using Docker
-#! docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.9.2
+# Recommended: Start Elasticsearch using Docker via the Haystack utility function
+from haystack.utils import launch_es
+
+launch_es()
 ```
 
 
@@ -236,3 +238,21 @@ prediction = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=
 ```python
 print_answers(prediction, details="minimal")
 ```
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
+
diff --git a/docs/_src/tutorials/tutorials/10.md b/docs/_src/tutorials/tutorials/10.md
@@ -21,6 +21,7 @@ The training of models that translate text queries into SPARQL queries is curren
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
 ```
 
@@ -129,3 +130,20 @@ print(result)
 # Paraphrased question: What is the patronus of Hermione?
 # Correct answer: Otter
 ```
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/11.md b/docs/_src/tutorials/tutorials/11.md
@@ -37,8 +37,8 @@ These lines are to install Haystack through pip
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install --upgrade git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
 
 # Install pygraphviz
 !apt install libgraphviz-dev
@@ -392,3 +392,20 @@ pipeline.load_from_yaml(Path("sample.yaml"))
 
 The possibilities are endless with the `Pipeline` class and we hope that this tutorial will inspire you
 to build custom pipeplines that really work for your use case!
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/2.md b/docs/_src/tutorials/tutorials/2.md
@@ -37,8 +37,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
+
 ```
 
 
@@ -88,3 +89,20 @@ reader.save(directory="my_model")
 # If you want to load it at a later point, just do:
 new_reader = FARMReader(model_name_or_path="my_model")
 ```
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/3.md b/docs/_src/tutorials/tutorials/3.md
@@ -37,8 +37,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
+
 ```
 
 
@@ -183,3 +184,20 @@ prediction = pipe.run(query="Who is the father of Arya Stark?", top_k_retriever=
 ```python
 print_answers(prediction, details="minimal")
 ```
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/4.md b/docs/_src/tutorials/tutorials/4.md
@@ -45,8 +45,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
+
 ```
 
 
@@ -66,8 +67,10 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock
 
 
 ```python
-# Recommended: Start Elasticsearch using Docker
-# ! docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.9.2
+# Recommended: Start Elasticsearch using Docker via the Haystack utility function
+from haystack.utils import launch_es
+
+launch_es()
 ```
 
 
@@ -154,6 +157,21 @@ pipe = FAQPipeline(retriever=retriever)
 ```python
 prediction = pipe.run(query="How is the virus spreading?", top_k_retriever=10)
 print_answers(prediction, details="all")
+```
 
+## About us
 
-```
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/5.md b/docs/_src/tutorials/tutorials/5.md
@@ -36,8 +36,9 @@ You can start Elasticsearch on your local machine instance using Docker. If Dock
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
+
 ```
 
 
@@ -117,12 +118,6 @@ document_store.add_eval_data(
 
 # Let's prepare the labels that we need for the retriever and the reader
 labels = document_store.get_all_labels_aggregated(index=label_index)
-q_to_l_dict = {
-    l.question: {
-        "retriever": l,
-        "reader": l
-    } for l in labels
-}
 ```
 
 ## Initialize components of QA-System
@@ -157,12 +152,13 @@ reader = FARMReader("deepset/roberta-base-squad2", top_k_per_candidate=4, return
 
 ```
 
+
 ```python
 from haystack.eval import EvalAnswers, EvalDocuments
 
 # Here we initialize the nodes that perform evaluation
-eval_retriever = EvalRetriever()
-eval_reader = EvalReader()
+eval_retriever = EvalDocuments()
+eval_reader = EvalAnswers()
 ```
 
 ## Evaluation of Retriever
@@ -211,18 +207,18 @@ from haystack import Pipeline
 # Here is the pipeline definition
 p = Pipeline()
 p.add_node(component=retriever, name="ESRetriever", inputs=["Query"])
-p.add_node(component=eval_retriever, name="EvalDocuments", inputs=["ESRetriever"])
-p.add_node(component=reader, name="QAReader", inputs=["EvalDocuments"])
-p.add_node(component=eval_reader, name="EvalAnswers", inputs=["QAReader"])
+p.add_node(component=eval_retriever, name="EvalRetriever", inputs=["ESRetriever"])
+p.add_node(component=reader, name="QAReader", inputs=["EvalRetriever"])
+p.add_node(component=eval_reader, name="EvalReader", inputs=["QAReader"])
 results = []
 ```
 
 
 ```python
 # This is how to run the pipeline
-for q, l in q_to_l_dict.items():
+for l in labels:
     res = p.run(
-        query=q,
+        query=l.question,
         top_k_retriever=10,
         labels=l,
         top_k_reader=10,
@@ -244,5 +240,21 @@ print()
 reader.print_time()
 print()
 eval_reader.print(mode="pipeline")
-
 ```
+
+## About us
+
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/) 
diff --git a/docs/_src/tutorials/tutorials/6.md b/docs/_src/tutorials/tutorials/6.md
@@ -78,8 +78,9 @@ Make sure you enable the GPU runtime to experience decent speed in this tutorial
 #! pip install farm-haystack
 
 # Install the latest master of Haystack
+!pip install grpcio-tools==1.34.1
 !pip install git+https://github.com/deepset-ai/haystack.git
-!pip install urllib3==1.25.4
+
 ```
 
 
@@ -94,6 +95,8 @@ from haystack.utils import print_answers
 
 ### Document Store
 
+#### Option 1: FAISS
+
 FAISS is a library for efficient similarity search on a cluster of dense vectors.
 The `FAISSDocumentStore` uses a SQL(SQLite in-memory be default) database under-the-hood
 to store the document text and other meta data. The vector embeddings of the text are
@@ -104,11 +107,27 @@ For more info on which suits your use case: https://github.com/facebookresearch/
 
 
 ```python
-from haystack.document_store.faiss import FAISSDocumentStore
+from haystack.document_store import FAISSDocumentStore
 
 document_store = FAISSDocumentStore(faiss_index_factory_str="Flat")
 ```
 
+#### Option 2: Milvus
+
+Milvus is an open source database library that is also optimized for vector similarity searches like FAISS.
+Like FAISS it has both a "Flat" and "HNSW" mode but it outperforms FAISS when it comes to dynamic data management.
+It does require a little more setup, however, as it is run through Docker and requires the setup of some config files.
+See [their docs](https://milvus.io/docs/v1.0.0/milvus_docker-cpu.md) for more details.
+
+
+```python
+from haystack.utils import launch_milvus
+from haystack.document_store import MilvusDocumentStore
+
+launch_milvus()
+document_store = MilvusDocumentStore()
+```
+
 ### Cleaning & indexing documents
 
 Similarly to the previous tutorials, we download, convert and index some Game of Thrones articles to our DocumentStore
@@ -202,6 +221,21 @@ prediction = pipe.run(query="Who created the Dothraki vocabulary?", top_k_retrie
 
 ```python
 print_answers(prediction, details="minimal")
+```
 
+## About us
 
-```
+This [Haystack](https://github.com/deepset-ai/haystack/) notebook was made with love by [deepset](https://deepset.ai/) in Berlin, Germany
+
+We bring NLP to the industry via open source! 
+Our focus: Industry specific language models & large scale QA systems. 
+
+Some of our other work: 
+- [German BERT](https://deepset.ai/german-bert)
+- [GermanQuAD and GermanDPR](https://deepset.ai/germanquad)
+- [FARM](https://github.com/deepset-ai/FARM)
+
+Get in touch:
+[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+By the way: [we're hiring!](https://apply.workable.com/deepset/)
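
Review note on the `docs/_src/tutorials/tutorials/5.md` change: the removed `q_to_l_dict` and the new `for l in labels` loop differ in two ways. The old loop variable `l` was the nested dict built per question, while the new one is the label object itself. The minimal sketch below uses plain Python only. `Label` is a hypothetical stand-in, not the Haystack class, and the duplicated question is an assumption made for illustration; it may never occur in the aggregated labels the tutorial actually loads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical stand-in for a label; not the Haystack class."""
    question: str
    answer: str


# Illustrative data; the repeated question is an assumption for this sketch.
labels = [
    Label("Who is the father of Arya Stark?", "Eddard Stark"),
    Label("Who created the Dothraki vocabulary?", "David J. Peterson"),
    Label("Who is the father of Arya Stark?", "Ned"),
]

# Old pattern: a dict keyed by question text keeps one entry per distinct
# question, and each value is a nested dict rather than the label itself.
q_to_l_dict = {l.question: {"retriever": l, "reader": l} for l in labels}

# New pattern: iterate the labels directly and read the question off each one.
queries = [l.question for l in labels]

print(len(q_to_l_dict))  # 2 distinct questions
print(len(queries))      # 3 labels visited
```

Iterating the list directly visits every label and hands the label object itself to `labels=`, which is the shape the rewritten loop in the diff relies on.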