merged master into release branch

deepset-ai · PiffPaffM · Sep 23, 2021 · Sep 15, 2021
commit ba2d7d103bb268d351c4e1d526aa3617f2b14d8d
diff --git a/docs/latest/components/document_store.mdx b/docs/latest/components/document_store.mdx
@@ -33,9 +33,42 @@ Initialising a new DocumentStore within Haystack is straight forward.
  <pre>
  <code>document_store = ElasticsearchDocumentStore()</code>
  </pre>
- Note that we also support <a href="https://opendistro.github.io/for-elasticsearch-docs/">Open Distro for Elasticsearch</a>.
- Follow <a href="https://opendistro.github.io/for-elasticsearch-docs/docs/install/">their documentation</a>&nbsp;
- to run it and connect to it using Haystack's `OpenDistroElasticsearchDocumentStore` class.
+ </div>
+ )
+ },
+ {
+ title: "Open Distro for Elasticsearch",
+ content: (
+ <div>
+ Learn how to get started <a href="https://opendistro.github.io/for-elasticsearch-docs/#get-started">here</a>.
+ If you have Docker set up, we recommend pulling the Docker image and running it.
+ <pre>
+ <code>docker pull amazon/opendistro-for-elasticsearch:1.13.2</code>
+ <code>docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" amazon/opendistro-for-elasticsearch:1.13.2</code>
+ </pre>
+ Next you can initialize the Haystack object that will connect to this instance.
+ <pre>
+ <code>from haystack.document_store import OpenDistroElasticsearchDocumentStore</code>
+ <code>document_store = OpenDistroElasticsearchDocumentStore()</code>
+ </pre>
+ </div>
+ )
+ },
+ {
+ title: "OpenSearch",
+ content: (
+ <div>
+ Learn how to get started <a href="https://opensearch.org/docs/#docker-quickstart">here</a>.
+ If you have Docker set up, we recommend pulling the Docker image and running it.
+ <pre>
+ <code>docker pull opensearchproject/opensearch:1.0.1</code>
+ <code>docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.0.1</code>
+ </pre>
+ Next you can initialize the Haystack object that will connect to this instance.
+ <pre>
+ <code>from haystack.document_store import OpenSearchDocumentStore</code>
+ <code>document_store = OpenSearchDocumentStore()</code>
+ </pre>
  </div>
  )
  },
@@ -214,6 +247,39 @@ The Document Stores have different characteristics. You should choose one depend
  </div>
  )
  },
+ {
+ title: "Open Distro for Elasticsearch",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Fully open source (Apache 2.0 license)</li>
+ <li>Essentially the same features as Elasticsearch</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Slow for dense retrieval with more than ~1 million documents</li>
+ </ul>
+ </div>
+ )
+ },
+ {
+ title: "OpenSearch",
+ content: (
+ <div>
+ <strong>Pros:</strong>
+ <ul>
+ <li>Fully open source (Apache 2.0 license)</li>
+ <li>Essentially the same features as Elasticsearch</li>
+ <li>Has more support for vector similarity comparisons and approximate nearest neighbour algorithms</li>
+ </ul>
+ <strong>Cons:</strong>
+ <ul>
+ <li>Not as optimized as dedicated vector similarity options like Milvus and FAISS</li>
+ </ul>
+ </div>
+ )
+ },
  {
  title: "Milvus",
  content: (

diff --git a/docs/v0.10.0/components/classifier.mdx b/docs/v0.10.0/components/classifier.mdx
@@ -0,0 +1,43 @@
+# Classifier
+
+The Classifier Node is a transformer-based classification model used to create predictions that can be attached to retrieved documents as metadata.
+For example, by using a sentiment model, you can label each document as being either positive or negative in sentiment.
+Through a tight integration with the HuggingFace model hub, you can easily load any classification model by simply supplying the model name.
+
+![image](/img/classifier.png)
+
+<div className="max-w-xl bg-yellow-light-theme border-l-8 border-yellow-dark-theme px-6 pt-6 pb-4 my-4 rounded-md dark:bg-yellow-900">
+
+Note that the Classifier is different from the Query Classifier.
+While the Query Classifier categorizes incoming queries in order to route them to different parts of the pipeline,
+the Classifier is used to create classification labels that can be attached to retrieved documents as metadata.
+
+</div>
+
+## Usage
+
+Initialize it as follows:
+
+``` python
+from haystack.classifier import FARMClassifier
+
+classifier_model = 'textattack/bert-base-uncased-imdb'
+classifier = FARMClassifier(model_name_or_path=classifier_model)
+```
+
+It can be slotted into a pipeline as follows:
+
+``` python
+pipeline = Pipeline()
+pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
+pipeline.add_node(component=classifier, name='Classifier', inputs=['Retriever'])
+```
+
+It can also be run in isolation:
+
+``` python
+documents = classifier.predict(
+    query="",
+    documents=[doc1, doc2, doc3, ...]
+)
+```
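
The idea described in the new classifier page, attaching a predicted label to each retrieved document as metadata, can be illustrated without Haystack. The following is a minimal sketch under stated assumptions: `Doc`, `toy_sentiment`, and `classify` are hypothetical names invented for this illustration, the keyword rule merely stands in for a transformer model, and none of this reflects how `FARMClassifier` is implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (not Haystack's Document class)."""
    text: str
    meta: Dict[str, str] = field(default_factory=dict)


def toy_sentiment(text: str) -> str:
    """Toy keyword rule standing in for a real classification model (assumption)."""
    negative_words = {"bad", "boring", "awful"}
    # Normalise tokens: strip common punctuation and lowercase.
    words = {w.strip(".,!?").lower() for w in text.split()}
    return "negative" if words & negative_words else "positive"


def classify(documents: List[Doc], model: Callable[[str], str]) -> List[Doc]:
    """Attach the model's predicted label to each document's metadata."""
    for doc in documents:
        doc.meta["label"] = model(doc.text)
    return documents


docs = classify([Doc("A great movie."), Doc("Boring and awful.")], toy_sentiment)
labels = [d.meta["label"] for d in docs]
```

Downstream components can then filter or display documents by reading `meta["label"]`, which is the role the page describes for the classification metadata.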