This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

New docs version #159

Merged
merged 9 commits into from
Sep 23, 2021
merged master into release branch
PiffPaffM committed Sep 23, 2021
commit ba2d7d103bb268d351c4e1d526aa3617f2b14d8d
72 changes: 69 additions & 3 deletions docs/latest/components/document_store.mdx
@@ -33,9 +33,42 @@ Initialising a new DocumentStore within Haystack is straight forward.
<pre>
<code>document_store = ElasticsearchDocumentStore()</code>
</pre>
Note that we also support <a href="https://opendistro.github.io/for-elasticsearch-docs/">Open Distro for Elasticsearch</a>.
Follow <a href="https://opendistro.github.io/for-elasticsearch-docs/docs/install/">their documentation</a>&nbsp;
to run it and connect to it using Haystack's `OpenDistroElasticsearchDocumentStore` class.
</div>
)
},
{
title: "Open Distro for Elasticsearch",
content: (
<div>
Learn how to get started <a href="https://opendistro.github.io/for-elasticsearch-docs/#get-started">here</a>.
If you have Docker set up, we recommend pulling the Docker image and running it.
<pre>
<code>docker pull amazon/opendistro-for-elasticsearch:1.13.2</code>
<code>docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" amazon/opendistro-for-elasticsearch:1.13.2</code>
</pre>
Next, initialize the Haystack object that will connect to this instance.
<pre>
<code>from haystack.document_store import OpenDistroElasticsearchDocumentStore</code>
<code>document_store = OpenDistroElasticsearchDocumentStore()</code>
</pre>
</div>
)
},
{
title: "OpenSearch",
content: (
<div>
Learn how to get started <a href="https://opensearch.org/docs/#docker-quickstart">here</a>.
If you have Docker set up, we recommend pulling the Docker image and running it.
<pre>
<code>docker pull opensearchproject/opensearch:1.0.1</code>
<code>docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.0.1</code>
</pre>
Next, initialize the Haystack object that will connect to this instance.
<pre>
<code>from haystack.document_store import OpenSearchDocumentStore</code>
<code>document_store = OpenSearchDocumentStore()</code>
</pre>
</div>
)
},
@@ -214,6 +247,39 @@ The Document Stores have different characteristics. You should choose one depend
</div>
)
},
{
title: "Open Distro for Elasticsearch",
content: (
<div>
<strong>Pros:</strong>
<ul>
<li>Fully open source (Apache 2.0 license)</li>
<li>Essentially the same features as Elasticsearch</li>
</ul>
<strong>Cons:</strong>
<ul>
<li>Slow for dense retrieval with more than ~1 million documents</li>
</ul>
</div>
)
},
{
title: "OpenSearch",
content: (
<div>
<strong>Pros:</strong>
<ul>
<li>Fully open source (Apache 2.0 license)</li>
<li>Essentially the same features as Elasticsearch</li>
<li>Better support for vector similarity comparisons and approximate nearest neighbour algorithms</li>
</ul>
<strong>Cons:</strong>
<ul>
<li>Not as optimized as dedicated vector similarity options like Milvus and FAISS</li>
</ul>
</div>
)
},
{
title: "Milvus",
content: (
43 changes: 43 additions & 0 deletions docs/v0.10.0/components/classifier.mdx
@@ -0,0 +1,43 @@
# Classifier

The Classifier Node is a transformer-based classification model used to create predictions that can be attached to retrieved documents as metadata.
For example, by using a sentiment model, you can label each document as being either positive or negative in sentiment.
Through a tight integration with the HuggingFace model hub, you can easily load any classification model by simply supplying the model name.

![image](/img/classifier.png)

<div className="max-w-xl bg-yellow-light-theme border-l-8 border-yellow-dark-theme px-6 pt-6 pb-4 my-4 rounded-md dark:bg-yellow-900">

Note that the Classifier is different from the Query Classifier.
While the Query Classifier categorizes incoming queries in order to route them to different parts of the pipeline,
the Classifier is used to create classification labels that can be attached to retrieved documents as metadata.

</div>

## Usage

Initialize it as follows:

``` python
from haystack.classifier import FARMClassifier

classifier_model = 'textattack/bert-base-uncased-imdb'
classifier = FARMClassifier(model_name_or_path=classifier_model)
```

It can be slotted into a pipeline as follows:

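``` python
from haystack.pipeline import Pipeline

# Assumes `retriever` and `classifier` have already been initialized.
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=classifier, name="Classifier", inputs=["Retriever"])
```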
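
To make the wiring concrete, here is a minimal, self-contained sketch of how `inputs` chains one node's output into the next. `ToyPipeline`, `toy_retriever`, and `toy_classifier` are hypothetical stand-ins written for this illustration; they are not Haystack classes and do not reflect how Haystack's `Pipeline` is implemented.

``` python
from typing import Callable, Dict, List


class ToyPipeline:
    """Hypothetical linear pipeline; not Haystack's Pipeline implementation."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable] = {}
        self.inputs: Dict[str, str] = {}
        self.order: List[str] = []

    def add_node(self, component: Callable, name: str, inputs: List[str]) -> None:
        # In this sketch each node reads from exactly one upstream node.
        self.nodes[name] = component
        self.inputs[name] = inputs[0]
        self.order.append(name)

    def run(self, query: str):
        # "Query" is the root input; every node stores its output under its own name.
        outputs = {"Query": query}
        for name in self.order:
            outputs[name] = self.nodes[name](outputs[self.inputs[name]])
        return outputs[self.order[-1]]


CORPUS = ["Haystack makes search easy", "Search can be slow", "Unrelated text"]


def toy_retriever(query: str) -> List[dict]:
    # Stand-in retriever: case-insensitive substring match over a tiny corpus.
    return [{"text": t, "meta": {}} for t in CORPUS if query.lower() in t.lower()]


def toy_classifier(documents: List[dict]) -> List[dict]:
    # Stand-in classifier: a keyword rule instead of a real sentiment model.
    return [
        {**d, "meta": {"label": "positive" if "easy" in d["text"] else "negative"}}
        for d in documents
    ]


pipeline = ToyPipeline()
pipeline.add_node(component=toy_retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=toy_classifier, name="Classifier", inputs=["Retriever"])
labelled = pipeline.run(query="search")
```

The point of the sketch is only the data flow: the classifier never sees the query directly, it receives whatever the node named in its `inputs` produced.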

It can also be run in isolation:

``` python
# Returns the documents with the classifier's predictions attached as metadata.
documents = classifier.predict(
    query="",
    documents=[doc1, doc2, doc3, ...]
)
```
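
Once labels are attached, downstream code can filter or group documents by them. The sketch below uses plain dictionaries in place of Haystack documents, and the `classification` metadata key with its `label` and `score` fields is an assumption made for illustration; inspect the output of your own classifier for the actual structure.

``` python
from typing import List

# Hypothetical classifier output. The "classification" key and its
# "label"/"score" fields are assumed for this example.
classified_docs = [
    {"text": "A great film.", "meta": {"classification": {"label": "positive", "score": 0.97}}},
    {"text": "Two hours I will never get back.", "meta": {"classification": {"label": "negative", "score": 0.88}}},
    {"text": "It was fine, I suppose.", "meta": {"classification": {"label": "positive", "score": 0.61}}},
]


def filter_by_label(documents: List[dict], label: str, min_score: float = 0.0) -> List[dict]:
    """Keep documents whose predicted label matches and whose score is high enough."""
    kept = []
    for doc in documents:
        prediction = doc["meta"]["classification"]
        if prediction["label"] == label and prediction["score"] >= min_score:
            kept.append(doc)
    return kept


confident_positives = filter_by_label(classified_docs, "positive", min_score=0.9)
```

A score threshold like `min_score` is one simple way to discard low-confidence predictions before acting on the labels.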