This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

Replace FARMRanker with SentenceTransformersRanker #169

Merged
merged 1 commit into from Sep 30, 2021
16 changes: 8 additions & 8 deletions docs/latest/components/ranker.mdx
@@ -17,17 +17,17 @@ Alternatively, [this example](https://github.com/deepset-ai/FARM/blob/master/exa

<div style={{ marginBottom: "3rem" }} />

## FARMRanker
## SentenceTransformersRanker

### Description

The FARMRanker consists of a Transformer-based model for document re-ranking using the TextPairClassifier of [FARM](https://github.com/deepset-ai/FARM).
Given a text pair of query and passage, the TextPairClassifier either predicts label "1" if the pair is similar or label "0" if they are dissimilar (accompanied with a probability).
While the underlying model can vary (BERT, Roberta, DistilBERT, ...), the interface remains the same.
With a FARMRanker, you can:
The SentenceTransformersRanker uses a pre-trained, Sentence-Transformers-based Cross-Encoder model for document re-ranking (https://huggingface.co/cross-encoder).
Re-ranking can be used on top of a retriever to boost the performance of document search. It is particularly useful when the retriever has high recall but performs poorly at sorting the documents by relevance.
The SentenceTransformersRanker handles Cross-Encoder models that output a single logit as a similarity score (https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#usage-with-transformers). This score describes the similarity between the cross-encoded query and document text.
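To make the single-logit similarity score concrete, here is a minimal sketch in plain Python (not Haystack or Sentence-Transformers API) of how such raw logits turn into a document ranking. The logit values are made up for illustration; with a real Cross-Encoder they would come from the model's forward pass, and the sigmoid of each logit can be read as a similarity probability:

```python
import math

def sigmoid(logit: float) -> float:
    # Map a raw cross-encoder logit to a (0, 1) similarity probability.
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical logits, one per (query, document) pair.
logits = {"doc_a": -1.2, "doc_b": 3.4, "doc_c": 0.7}

# Sigmoid is monotonic, so sorting by raw logit and sorting by the
# derived probability yield the same document order.
ranking = sorted(logits, key=logits.get, reverse=True)

print(ranking)  # most similar document first
print({doc: round(sigmoid(score), 3) for doc, score in logits.items()})
```

Because only the ordering matters for re-ranking, the raw logit is sufficient; the sigmoid is useful when a calibrated-looking probability is needed downstream.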

With a SentenceTransformersRanker, you can:

- Directly get predictions (a re-ranked version of the supplied list of Documents) via predict() when supplying a pre-trained model
- Take a plain language model (e.g. `bert-base-cased`) and train it for TextPairClassification via train()

<div style={{ marginBottom: "3rem" }} />

@@ -36,13 +36,13 @@ With a FARMRanker, you can:
```python
from haystack.document_store import ElasticsearchDocumentStore
from haystack.retriever import ElasticsearchRetriever
from haystack.ranker import FARMRanker
from haystack.ranker import SentenceTransformersRanker
from haystack import Pipeline

document_store = ElasticsearchDocumentStore()
...
retriever = ElasticsearchRetriever(document_store)
ranker = FARMRanker(model_name_or_path="saved_models/roberta-base-asnq-binary")
ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")
...
p = Pipeline()
p.add_node(component=retriever, name="ESRetriever", inputs=["Query"])
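# Independent of Haystack, the retrieve-then-rerank step this pipeline sets up
# can be sketched in plain Python. Everything below is a hypothetical,
# self-contained illustration (not Haystack API): the retriever's candidates
# arrive in retrieval order, and the ranker reorders them by a
# cross-encoder-style similarity score.

def mock_rerank(candidates, score_fn):
    # Reorder retrieved candidates by descending similarity score,
    # as the ranker node in the pipeline would.
    return sorted(candidates, key=score_fn, reverse=True)

# Candidates in (hypothetical) retrieval order: good recall, rough ordering.
retrieved = ["doc_1", "doc_2", "doc_3"]

# Made-up logits standing in for cross-encoder predictions.
fake_logits = {"doc_1": 0.3, "doc_2": 2.7, "doc_3": -1.0}

reranked = mock_rerank(retrieved, fake_logits.get)
print(reranked)  # "doc_2" moves to the front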
```