Redesign primitives - Document, Answer, Label (deepset-ai#1398)
* first draft / notes on new primitives

* wip label / feedback refactor

* rename doc.text -> doc.content. add doc.content_type (see the usage sketch after the changed-files summary below)

* add datatype for content

* remove faq_question_field from ES and weaviate. rename text_field -> content_field in docstores. update tutorials for content field

* update converters for the new content field. Add warning for empty content

* rename label.question -> label.query. Allow sorting of Answers.

* WIP primitives

* update ui/reader for new Answer format

* Improve Label. First refactoring of MultiLabel. Adjust eval code

* fixed workflow conflict with introducing new one (deepset-ai#1472)

* Add latest docstring and tutorial changes

* make add_eval_data() work again

* fix reader formats. WIP fix _extract_docs_and_labels_from_dict

* fix test reader

* Add latest docstring and tutorial changes

* fix another test case for reader

* fix mypy in farm reader.eval()

* fix mypy in farm reader.eval()

* WIP ORM refactor

* Add latest docstring and tutorial changes

* fix mypy weaviate

* make label and multilabel dataclasses

* bump mypy env in CI to python 3.8

* WIP refactor Label ORM

* WIP refactor Label ORM

* simplify tests for individual doc stores

* WIP refactoring markers of tests

* test alternative approach for tests with existing parametrization

* WIP refactor ORMs

* fix skip logic of already parametrized tests

* fix weaviate behaviour in tests - not parametrizing it in our general test cases.

* Add latest docstring and tutorial changes

* fix some tests

* remove sql from document_store_types

* fix markers for generator and pipeline test

* remove inmemory marker

* remove unneeded elasticsearch markers

* add dataclasses-json dependency. adjust ORM to just store JSON repr

* ignore type as dataclasses_json seems to miss functionality here

* update readme and contributing.md

* update contributing

* adjust example

* fix duplicate doc handling for custom index

* Add latest docstring and tutorial changes

* fix some ORM issues. fix get_all_labels_aggregated.

* update drop flags where get_all_labels_aggregated() was used before

* Add latest docstring and tutorial changes

* add to_json(). add + fix tests

* fix no_answer handling in label / multilabel

* fix duplicate docs in memory doc store. change primary key for sql doc table

* fix mypy issues

* fix mypy issues

* haystack/retriever/base.py

* fix test_write_document_meta[elastic]

* fix test_elasticsearch_custom_fields

* fix test_labels[elastic]

* fix crawler

* fix converter

* fix docx converter

* fix preprocessor

* fix test_utils

* fix tfidf retriever. fix selection of docstore in tests with multiple fixtures / parameterizations

* Add latest docstring and tutorial changes

* fix crawler test. fix ocrconverter attribute

* fix test_elasticsearch_custom_query

* fix generator pipeline

* fix ocr converter

* fix ragenerator

* Add latest docstring and tutorial changes

* fix test_load_and_save_yaml for elasticsearch

* fixes for pipeline tests

* fix faq pipeline

* fix pipeline tests

* Add latest docstring and tutorial changes

* fix weaviate

* Add latest docstring and tutorial changes

* trigger CI

* satisfy mypy

* Add latest docstring and tutorial changes

* satisfy mypy

* Add latest docstring and tutorial changes

* trigger CI

* fix question generation test

* fix ray. fix Q-generation

* fix translator test

* satisfy mypy

* wip refactor feedback rest api

* fix rest api feedback endpoint

* fix doc classifier

* remove relation of Labels -> Docs in SQL ORM

* fix faiss/milvus tests

* fix doc classifier test

* fix eval test

* fixing eval issues

* Add latest docstring and tutorial changes

* fix mypy

* WIP replace dataclasses-json with manual serialization

* Add latest docstring and tutorial changes

* revert to dataclass-json serialization for now. remove debug prints.

* update docstrings

* fix extractor. fix Answer Span init

* fix api test

* keep meta data of answers in reader.run()

* fix meta handling

* address review feedback

* Add latest docstring and tutorial changes

* make document=None for open domain labels

* add import

* fix print utils

* fix rest api

* address review feedback

* Add latest docstring and tutorial changes

* fix mypy

Co-authored-by: Markus Paff <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
3 people committed Oct 13, 2021
1 parent 9650f7a commit 4a6c930
Showing 88 changed files with 1,530 additions and 1,086 deletions.
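
Taken together, the commits above rename `doc.text` to `doc.content` (with an explicit `content_type`), rename `label.question` to `label.query`, turn `Label` and `MultiLabel` into dataclasses, make `Answer` objects sortable, and serialize everything via dataclasses-json. A minimal usage sketch of the redesigned primitives, assuming the field names given in the commit messages (exact signatures may differ from the final release):

```python
from haystack.schema import Answer, Document, Label, Span

# A Document now carries `content` plus an explicit `content_type`
# instead of the old `text` attribute.
doc = Document(content="Berlin is the capital of Germany.", content_type="text")

# Answers are dataclasses and sortable ("Allow sorting of Answers").
answers = [
    Answer(answer="Berlin", score=0.95, offsets_in_document=[Span(0, 6)]),
    Answer(answer="Germany", score=0.42),
]
best = sorted(answers, reverse=True)[0]  # comparison is score-based

# A Label now references a `query` (formerly `question`) and a Document.
label = Label(
    query="What is the capital of Germany?",
    answer=best,
    document=doc,
    is_correct_answer=True,
    is_correct_document=True,
    origin="gold-label",
)
print(label.to_json())  # JSON round-trip via dataclasses-json ("add to_json()")
```
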
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -13,7 +13,7 @@ jobs:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
- python-version: 3.7
+ python-version: 3.8
- name: Test with mypy
run: |
pip install mypy types-Markdown types-requests types-PyYAML
22 changes: 11 additions & 11 deletions docs/_src/api/api/document_store.md
@@ -63,7 +63,7 @@ Get documents from the document store.
#### get\_all\_labels\_aggregated

```python
- | get_all_labels_aggregated(index: Optional[str] = None, filters: Optional[Dict[str, List[str]]] = None, open_domain: bool = True, aggregate_by_meta: Optional[Union[str, list]] = None) -> List[MultiLabel]
+ | get_all_labels_aggregated(index: Optional[str] = None, filters: Optional[Dict[str, List[str]]] = None, open_domain: bool = True, drop_negative_labels: bool = False, drop_no_answers: bool = False, aggregate_by_meta: Optional[Union[str, list]] = None) -> List[MultiLabel]
```

Return all labels in the DocumentStore, aggregated into MultiLabel objects.
@@ -88,6 +88,7 @@ object, provided that they have the same product_id (to be found in Label.meta["
When False, labels are aggregated in a closed domain fashion based on the question text
and also the id of the document that the label is tied to. In this setting, this function
might return multiple MultiLabel objects with the same question string.
+ - `drop_negative_labels`: When True, labels with incorrect answers and documents are dropped.
+ - `drop_no_answers`: When True, labels with no answers are dropped.
- `aggregate_by_meta`: The names of the Label meta fields by which to aggregate. For example: ["product_id"]
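
For example, the new flags might be used as in this sketch (`document_store` stands for any initialized DocumentStore):

```python
labels = document_store.get_all_labels_aggregated(
    index="label",
    open_domain=True,
    drop_negative_labels=True,  # discard labels marked as incorrect
    drop_no_answers=False,      # keep no-answer labels
    aggregate_by_meta=["product_id"],
)
```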

<a name="base.BaseDocumentStore.add_eval_data"></a>
@@ -131,7 +132,7 @@ class ElasticsearchDocumentStore(BaseDocumentStore)
#### \_\_init\_\_

```python
- | __init__(host: Union[str, List[str]] = "localhost", port: Union[int, List[int]] = 9200, username: str = "", password: str = "", api_key_id: Optional[str] = None, api_key: Optional[str] = None, aws4auth=None, index: str = "document", label_index: str = "label", search_fields: Union[str, list] = "text", text_field: str = "text", name_field: str = "name", embedding_field: str = "embedding", embedding_dim: int = 768, custom_mapping: Optional[dict] = None, excluded_meta_data: Optional[list] = None, faq_question_field: Optional[str] = None, analyzer: str = "standard", scheme: str = "http", ca_certs: Optional[str] = None, verify_certs: bool = True, create_index: bool = True, refresh_type: str = "wait_for", similarity="dot_product", timeout=30, return_embedding: bool = False, duplicate_documents: str = 'overwrite', index_type: str = "flat")
+ | __init__(host: Union[str, List[str]] = "localhost", port: Union[int, List[int]] = 9200, username: str = "", password: str = "", api_key_id: Optional[str] = None, api_key: Optional[str] = None, aws4auth=None, index: str = "document", label_index: str = "label", search_fields: Union[str, list] = "content", content_field: str = "content", name_field: str = "name", embedding_field: str = "embedding", embedding_dim: int = 768, custom_mapping: Optional[dict] = None, excluded_meta_data: Optional[list] = None, analyzer: str = "standard", scheme: str = "http", ca_certs: Optional[str] = None, verify_certs: bool = True, create_index: bool = True, refresh_type: str = "wait_for", similarity="dot_product", timeout=30, return_embedding: bool = False, duplicate_documents: str = 'overwrite', index_type: str = "flat")
```

A DocumentStore using Elasticsearch to store and query the documents for our search.
@@ -152,7 +153,7 @@ A DocumentStore using Elasticsearch to store and query the documents for our sea
- `index`: Name of index in elasticsearch to use for storing the documents that we want to search. If not existing yet, we will create one.
- `label_index`: Name of index in elasticsearch to use for storing labels. If not existing yet, we will create one.
- `search_fields`: Name of fields used by ElasticsearchRetriever to find matches in the docs to our incoming query (using elastic's multi_match query), e.g. ["title", "full_text"]
- - `text_field`: Name of field that might contain the answer and will therefore be passed to the Reader Model (e.g. "full_text").
+ - `content_field`: Name of field that might contain the answer and will therefore be passed to the Reader Model (e.g. "full_text").
If no Reader is used (e.g. in FAQ-Style QA) the plain content of this field will just be returned.
- `name_field`: Name of field that contains the title of the doc
- `embedding_field`: Name of field containing an embedding vector (Only needed when using a dense retriever (e.g. DensePassageRetriever, EmbeddingRetriever) on top)
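
A typical initialization after the rename might look like this sketch (import path assumed per the 0.10.x layout; field values illustrative):

```python
from haystack.document_store import ElasticsearchDocumentStore

document_store = ElasticsearchDocumentStore(
    host="localhost",
    port=9200,
    index="document",
    search_fields=["title", "full_text"],
    content_field="full_text",  # formerly `text_field`
    name_field="title",
    embedding_field="embedding",
    embedding_dim=768,
)
```
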
@@ -239,12 +240,12 @@ they will automatically get UUIDs assigned. See the `Document` class for details
**Arguments**:

- `documents`: a list of Python dictionaries or a list of Haystack Document objects.
- For documents as dictionaries, the format is {"text": "<the-actual-text>"}.
- Optionally: Include meta data via {"text": "<the-actual-text>",
+ For documents as dictionaries, the format is {"content": "<the-actual-text>"}.
+ Optionally: Include meta data via {"content": "<the-actual-text>",
"meta":{"name": "<some-document-name>, "author": "somebody", ...}}
It can be used for filtering and is accessible in the responses of the Finder.
Advanced: If you are using your own Elasticsearch mapping, the key names in the dictionary
- should be changed to what you have set for self.text_field and self.name_field.
+ should be changed to what you have set for self.content_field and self.name_field.
- `index`: Elasticsearch index where the documents should be indexed. If not supplied, self.index will be used.
- `batch_size`: Number of documents that are passed to Elasticsearch's bulk function at a time.
- `duplicate_documents`: Handle duplicate documents based on parameter options.
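
In dictionary form, documents are now written with a `content` key, e.g. (sketch, reusing the store from above):

```python
docs = [
    {
        "content": "Berlin is the capital of Germany.",
        "meta": {"name": "berlin.txt", "author": "somebody"},
    }
]
document_store.write_documents(docs, duplicate_documents="overwrite")
```
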
@@ -845,7 +846,7 @@ Return all labels in the document store
#### write\_documents

```python
- | write_documents(documents: Union[List[dict], List[Document]], index: Optional[str] = None, batch_size: int = 10_000, duplicate_documents: Optional[str] = None)
+ | write_documents(documents: Union[List[dict], List[Document]], index: Optional[str] = None, batch_size: int = 10_000, duplicate_documents: Optional[str] = None) -> None
```

Indexes documents for later queries.
@@ -1037,7 +1038,7 @@ the vector embeddings are indexed in a FAISS Index.
#### write\_documents

```python
- | write_documents(documents: Union[List[dict], List[Document]], index: Optional[str] = None, batch_size: int = 10_000, duplicate_documents: Optional[str] = None)
+ | write_documents(documents: Union[List[dict], List[Document]], index: Optional[str] = None, batch_size: int = 10_000, duplicate_documents: Optional[str] = None) -> None
```

Add new documents to the DocumentStore.
@@ -1561,7 +1562,7 @@ The current implementation is not supporting the storage of labels, so you canno
#### \_\_init\_\_

```python
- | __init__(host: Union[str, List[str]] = "https://localhost", port: Union[int, List[int]] = 8080, timeout_config: tuple = (5, 15), username: str = None, password: str = None, index: str = "Document", embedding_dim: int = 768, text_field: str = "text", name_field: str = "name", faq_question_field="question", similarity: str = "dot_product", index_type: str = "hnsw", custom_schema: Optional[dict] = None, return_embedding: bool = False, embedding_field: str = "embedding", progress_bar: bool = True, duplicate_documents: str = 'overwrite', **kwargs)
+ | __init__(host: Union[str, List[str]] = "https://localhost", port: Union[int, List[int]] = 8080, timeout_config: tuple = (5, 15), username: str = None, password: str = None, index: str = "Document", embedding_dim: int = 768, content_field: str = "content", name_field: str = "name", similarity: str = "dot_product", index_type: str = "hnsw", custom_schema: Optional[dict] = None, return_embedding: bool = False, embedding_field: str = "embedding", progress_bar: bool = True, duplicate_documents: str = 'overwrite', **kwargs)
```

**Arguments**:
@@ -1574,10 +1575,9 @@ The current implementation is not supporting the storage of labels, so you canno
- `password`: password (standard authentication via http_auth)
- `index`: Index name for document text, embedding and metadata (in Weaviate terminology, this is a "Class" in Weaviate schema).
- `embedding_dim`: The embedding vector size. Default: 768.
- - `text_field`: Name of field that might contain the answer and will therefore be passed to the Reader Model (e.g. "full_text").
+ - `content_field`: Name of field that might contain the answer and will therefore be passed to the Reader Model (e.g. "full_text").
If no Reader is used (e.g. in FAQ-Style QA) the plain content of this field will just be returned.
- `name_field`: Name of field that contains the title of the doc
- - `faq_question_field`: Name of field containing the question in case of FAQ-Style QA
- `similarity`: The similarity function used to compare document vectors. 'dot_product' is the default.
- `index_type`: Index type of any vector object defined in weaviate schema. The vector index type is pluggable.
Currently, only HNSW is supported.
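
A Weaviate sketch with the renamed field (illustrative values; note that `faq_question_field` has been removed):

```python
from haystack.document_store import WeaviateDocumentStore

document_store = WeaviateDocumentStore(
    host="http://localhost",
    port=8080,
    index="Document",
    content_field="content",  # formerly `text_field`
    embedding_dim=768,
    similarity="dot_product",
    index_type="hnsw",
)
```
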
2 changes: 1 addition & 1 deletion docs/_src/api/api/evaluation.md
@@ -89,7 +89,7 @@ open vs closed domain eval (https://haystack.deepset.ai/tutorials/evaluation).
#### run

```python
- | run(labels: List[Label], answers: List[dict], correct_retrieval: bool)
+ | run(labels: List[Label], answers: List[Answer], correct_retrieval: bool)
```

Run this node on one sample and its labels
4 changes: 2 additions & 2 deletions docs/_src/api/api/generator.md
@@ -75,7 +75,7 @@ i.e. the model can easily adjust to domain documents even after training has fin
| 'meta': { 'doc_ids': [...],
| 'doc_scores': [80.42758 ...],
| 'doc_probabilities': [40.71379089355469, ...
- | 'texts': ['Albert Einstein was a ...]
+ | 'content': ['Albert Einstein was a ...]
| 'titles': ['"Albert Einstein"', ...]
| }}]}
```
@@ -134,7 +134,7 @@ Generated answers plus additional infos in a dict like this:
| 'meta': { 'doc_ids': [...],
| 'doc_scores': [80.42758 ...],
| 'doc_probabilities': [40.71379089355469, ...
- | 'texts': ['Albert Einstein was a ...]
+ | 'content': ['Albert Einstein was a ...]
| 'titles': ['"Albert Einstein"', ...]
| }}]}
```
2 changes: 1 addition & 1 deletion docs/_src/api/api/reader.md
@@ -241,7 +241,7 @@ Returns a dict containing the following metrics:
#### eval

```python
- | eval(document_store: BaseDocumentStore, device: Optional[str] = None, label_index: str = "label", doc_index: str = "eval_document", label_origin: str = "gold_label", calibrate_conf_scores: bool = False)
+ | eval(document_store: BaseDocumentStore, device: Optional[str] = None, label_index: str = "label", doc_index: str = "eval_document", label_origin: str = "gold-label", calibrate_conf_scores: bool = False)
```

Performs evaluation on evaluation documents in the DocumentStore.
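
A sketch of an eval call with the new default label origin (`reader` and `document_store` assumed initialized):

```python
reader_eval_results = reader.eval(
    document_store=document_store,
    label_index="label",
    doc_index="eval_document",
    label_origin="gold-label",  # note the hyphen; formerly "gold_label"
)
```
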
6 changes: 3 additions & 3 deletions docs/_src/api/api/retriever.md
@@ -39,7 +39,7 @@ Wrapper method used to time functions.
#### eval

```python
- | eval(label_index: str = "label", doc_index: str = "eval_document", label_origin: str = "gold_label", top_k: int = 10, open_domain: bool = False, return_preds: bool = False) -> dict
+ | eval(label_index: str = "label", doc_index: str = "eval_document", label_origin: str = "gold-label", top_k: int = 10, open_domain: bool = False, return_preds: bool = False) -> dict
```

Performs evaluation on the Retriever.
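
Analogously for the Retriever, a sketch:

```python
retriever_eval_results = retriever.eval(
    label_index="label",
    doc_index="eval_document",
    label_origin="gold-label",
    top_k=10,
    open_domain=True,
)
```
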
@@ -105,7 +105,7 @@ class ElasticsearchRetriever(BaseRetriever)
| "should": [{"multi_match": {
| "query": ${query}, // mandatory query placeholder
| "type": "most_fields",
| "fields": ["text", "title"]}}],
| "fields": ["content", "title"]}}],
| "filter": [ // optional custom filters
| {"terms": {"year": ${years}}},
| {"terms": {"quarter": ${quarters}}},
@@ -430,7 +430,7 @@ class EmbeddingRetriever(BaseRetriever)
**Arguments**:

- `document_store`: An instance of DocumentStore from which to retrieve documents.
- - `embedding_model`: Local path or name of model in Hugging Face's model hub such as ``'deepset/sentence_bert'``
+ - `embedding_model`: Local path or name of model in Hugging Face's model hub such as ``'sentence-transformers/all-MiniLM-L6-v2'``
- `model_version`: The version of model to use from the HuggingFace model hub. Can be tag name, branch name, or commit hash.
- `use_gpu`: Whether to use gpu or not
- `model_format`: Name of framework that was used for saving the model. Options:
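
A construction sketch with the updated example model (import path assumed per the 0.10.x layout):

```python
from haystack.retriever.dense import EmbeddingRetriever

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    model_format="sentence_transformers",
)
```
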
8 changes: 4 additions & 4 deletions docs/_src/api/api/translator.md
@@ -15,7 +15,7 @@ Abstract class for a Translator component that translates either a query or a do

```python
| @abstractmethod
- | translate(query: Optional[str] = None, documents: Optional[Union[List[Document], List[str], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None) -> Union[str, List[Document], List[str], List[Dict[str, Any]]]
+ | translate(query: Optional[str] = None, documents: Optional[Union[List[Document], List[Answer], List[str], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None) -> Union[str, List[Document], List[Answer], List[str], List[Dict[str, Any]]]
```

Translate the passed query or a list of documents from language A to B.
@@ -24,7 +24,7 @@ Translate the passed query or a list of documents from language A to B.
#### run

```python
- | run(query: Optional[str] = None, documents: Optional[Union[List[Document], List[str], List[Dict[str, Any]]]] = None, answers: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None)
+ | run(query: Optional[str] = None, documents: Optional[Union[List[Document], List[Answer], List[str], List[Dict[str, Any]]]] = None, answers: Optional[Union[Dict[str, Any], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None)
```

Method that gets executed when this class is used as a Node in a Haystack Pipeline
@@ -89,7 +89,7 @@ They also have a few multilingual models that support multiple languages at once
#### translate

```python
- | translate(query: Optional[str] = None, documents: Optional[Union[List[Document], List[str], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None) -> Union[str, List[Document], List[str], List[Dict[str, Any]]]
+ | translate(query: Optional[str] = None, documents: Optional[Union[List[Document], List[Answer], List[str], List[Dict[str, Any]]]] = None, dict_key: Optional[str] = None) -> Union[str, List[Document], List[Answer], List[str], List[Dict[str, Any]]]
```

Run the actual translation. You can supply a query or a list of documents. Whatever is supplied will be translated.
@@ -98,5 +98,5 @@ Run the actual translation. You can supply a query or a list of documents. Whate

- `query`: The query string to translate
- `documents`: The documents to translate
- - `dict_key`:
+ - `dict_key`: If you pass a dictionary in `documents`, you can specify here the field which shall be translated.
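
For example, translating a list of dictionaries via `dict_key` might look like this sketch (import path and model name illustrative):

```python
from haystack.translator import TransformersTranslator

translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-fr-en")
docs = [{"content": "Berlin est la capitale de l'Allemagne."}]
translated = translator.translate(documents=docs, dict_key="content")
```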

6 changes: 3 additions & 3 deletions docs/_src/tutorials/tutorials/13.md
@@ -72,9 +72,9 @@ text1 = "Python is an interpreted, high-level, general-purpose programming langu
text2 = "Princess Arya Stark is the third child and second daughter of Lord Eddard Stark and his wife, Lady Catelyn Stark. She is the sister of the incumbent Westerosi monarchs, Sansa, Queen in the North, and Brandon, King of the Andals and the First Men. After narrowly escaping the persecution of House Stark by House Lannister, Arya is trained as a Faceless Man at the House of Black and White in Braavos, using her abilities to avenge her family. Upon her return to Westeros, she exacts retribution for the Red Wedding by exterminating the Frey male line."
text3 = "Dry Cleaning are an English post-punk band who formed in South London in 2018.[3] The band is composed of vocalist Florence Shaw, guitarist Tom Dowse, bassist Lewis Maynard and drummer Nick Buxton. They are noted for their use of spoken word primarily in lieu of sung vocals, as well as their unconventional lyrics. Their musical stylings have been compared to Wire, Magazine and Joy Division.[4] The band released their debut single, 'Magic of Meghan' in 2019. Shaw wrote the song after going through a break-up and moving out of her former partner's apartment the same day that Meghan Markle and Prince Harry announced they were engaged.[5] This was followed by the release of two EPs that year: Sweet Princess in August and Boundary Road Snacks and Drinks in October. The band were included as part of the NME 100 of 2020,[6] as well as DIY magazine's Class of 2020.[7] The band signed to 4AD in late 2020 and shared a new single, 'Scratchcard Lanyard'.[8] In February 2021, the band shared details of their debut studio album, New Long Leg. They also shared the single 'Strong Feelings'.[9] The album, which was produced by John Parish, was released on 2 April 2021.[10]"

docs = [{"text": text1},
{"text": text2},
{"text": text3}]
docs = [{"content": text1},
{"content": text2},
{"content": text3}]

# Initialize document store and write in the documents
document_store = ElasticsearchDocumentStore()
4 changes: 2 additions & 2 deletions docs/_src/tutorials/tutorials/5.md
@@ -119,7 +119,7 @@ document_store.add_eval_data(
)

# Let's prepare the labels that we need for the retriever and the reader
- labels = document_store.get_all_labels_aggregated(index=label_index)
+ labels = document_store.get_all_labels_aggregated(index=label_index, drop_negative_labels=True, drop_no_answers=False)
```

## Initialize components of QA-System
@@ -220,7 +220,7 @@ results = []
# This is how to run the pipeline
for l in labels:
res = p.run(
- query=l.question,
+ query=l.query,
labels=l,
params={"index": doc_index, "Retriever": {"top_k": 10}, "Reader": {"top_k": 5}},
)
2 changes: 1 addition & 1 deletion docs/_src/tutorials/tutorials/7.md
@@ -82,7 +82,7 @@ documents: List[Document] = []
for title, text in zip(titles, texts):
documents.append(
Document(
- text=text,
+ content=text,
meta={
"name": title or ""
}
12 changes: 6 additions & 6 deletions docs/_src/tutorials/tutorials/8.md
@@ -19,7 +19,7 @@ Ultimately, Haystack expects data to be provided as a list documents in the foll
``` python
docs = [
{
- 'text': DOCUMENT_TEXT_HERE,
+ 'content': DOCUMENT_TEXT_HERE,
'meta': {'name': DOCUMENT_NAME, ...}
}, ...
]
@@ -144,11 +144,11 @@ preprocessor_nrsb = PreProcessor(split_respect_sentence_boundary=False)
docs_nrsb = preprocessor_nrsb.process(doc_txt)

print("RESPECTING SENTENCE BOUNDARY")
end_text = docs_default[0]["text"][-50:]
end_text = docs_default[0]["content"][-50:]
print("End of document: \"..." + end_text + "\"")
print()
print("NOT RESPECTING SENTENCE BOUNDARY")
end_text_nrsb = docs_nrsb[0]["text"][-50:]
end_text_nrsb = docs_nrsb[0]["content"][-50:]
print("End of document: \"..." + end_text_nrsb + "\"")
```

@@ -173,9 +173,9 @@ preprocessor_sliding_window = PreProcessor(
)
docs_sliding_window = preprocessor_sliding_window.process(doc_txt)

doc1 = docs_sliding_window[0]["text"][:200]
doc2 = docs_sliding_window[1]["text"][:100]
doc3 = docs_sliding_window[2]["text"][:100]
doc1 = docs_sliding_window[0]["content"][:200]
doc2 = docs_sliding_window[1]["content"][:100]
doc3 = docs_sliding_window[2]["content"][:100]

print("Document 1: \"" + doc1 + "...\"")
print("Document 2: \"" + doc2 + "...\"")
3 changes: 1 addition & 2 deletions docs/v0.8.0/_src/tutorials/tutorials/5.md
@@ -92,7 +92,7 @@ document_store = ElasticsearchDocumentStore(host="localhost", username="", passw
embedding_dim=768, excluded_meta_data=["emb"])
```


```python
from haystack.preprocessor import PreProcessor

@@ -118,7 +117,7 @@ document_store.add_eval_data(
# Let's prepare the labels that we need for the retriever and the reader
labels = document_store.get_all_labels_aggregated(index=label_index)
q_to_l_dict = {
- l.question: {
+ l.query: {
"retriever": l,
"reader": l
} for l in labels