Simplify tests & allow running on individual doc stores #1487

tholor · 2021-09-22T07:32:00Z

Proposed changes:
Running the tests locally is currently quite cumbersome because you need to start several external Docker containers (Elasticsearch, Milvus, etc.).

Let's simplify this by extending the option introduced in #1202 so that tests can be run on a subset of document stores or on a single one.
For example, calling `pytest . --doc_store_type="memory"` runs every test that can run with just the InMemoryDocumentStore, i.e.:

All tests that we typically run on the whole "document store grid" are run only for InMemoryDocumentStore.
Any test that is specific to other document stores (e.g. Elasticsearch) and is not supported by the chosen document store is skipped (and marked accordingly in the logs).
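To make the behaviour above concrete, here is a minimal sketch of how a `conftest.py` could register the option and parametrize the `document_store` fixture from it. This is not the code from this PR: the list `ALL_DOC_STORE_TYPES`, the helpers `parse_doc_store_types` and `parametrizes_fixture`, and the comma-separated value format are assumptions for illustration, and a `document_store` fixture that reads `request.param` is assumed to exist elsewhere.

```python
# conftest.py (sketch, not the implementation in this PR)

# Assumption: the stores that make up the "document store grid".
ALL_DOC_STORE_TYPES = ["elasticsearch", "faiss", "memory", "milvus"]


def parse_doc_store_types(raw):
    """Turn a raw CLI value such as "memory" or "elasticsearch, faiss" into a list."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def parametrizes_fixture(metafunc, fixture_name):
    """True if the test already carries its own parametrization of the fixture."""
    for mark in metafunc.definition.iter_markers("parametrize"):
        argnames = mark.args[0] if mark.args else mark.kwargs.get("argnames", "")
        if isinstance(argnames, str):
            argnames = [name.strip() for name in argnames.split(",")]
        if fixture_name in argnames:
            return True
    return False


def pytest_addoption(parser):
    # Without the flag, the whole grid is selected.
    parser.addoption(
        "--doc_store_type",
        action="store",
        default=",".join(ALL_DOC_STORE_TYPES),
        help="Comma-separated document stores to run the tests on",
    )


def pytest_generate_tests(metafunc):
    # Only touch tests that request the fixture and did not parametrize it themselves.
    if "document_store" not in metafunc.fixturenames:
        return
    if parametrizes_fixture(metafunc, "document_store"):
        return
    selected = parse_doc_store_types(metafunc.config.getoption("--doc_store_type"))
    # indirect=True hands each value to the fixture as request.param.
    metafunc.parametrize("document_store", selected, indirect=True)
```

With a hook like this, a test that simply requests `document_store` is collected once per selected store, so `--doc_store_type="memory"` yields a single in-memory run.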

New conventions for creating new tests:

Test should run on all document stores / those supplied in the CLI arg `--doc_store_type`:

Use one of the fixtures `document_store`, `document_store_with_docs`, or `document_store_type`.
Do not parametrize it yourself.

Example:

def test_write_with_duplicate_doc_ids(document_store):
    ...

Test is only compatible with certain document stores:

Some tests should not run on all possible document stores, either because the test is specific to one or a few document stores, or because it is not really document store related and running it on a single document store is enough and speeds up execution.

Example:

# update_document_meta() is currently not implemented for InMemoryDocumentStore,
# so "memory" is not listed here as an option.
@pytest.mark.parametrize("document_store", ["elasticsearch", "faiss"], indirect=True)
def test_update_meta(document_store):
    ...
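When a test brings its own parametrization like this, the stores it lists still have to be reconciled with the ones selected via `--doc_store_type`. The sketch below shows one way to express that decision. The helper names `skip_reason` and `plan` are hypothetical and do not come from this PR.

```python
def skip_reason(store, selected):
    """Return a skip message if `store` was not selected on the CLI, else None."""
    if store in selected:
        return None
    return f"{store} not in --doc_store_type ({', '.join(selected)})"


def plan(parametrized, selected):
    """Map each store in a test's own parametrization to "run" or "skip"."""
    return {
        store: "run" if skip_reason(store, selected) is None else "skip"
        for store in parametrized
    }


# A test parametrized with elasticsearch and faiss, invoked with
# --doc_store_type="faiss", would run once and be skipped once.
print(plan(["elasticsearch", "faiss"], ["faiss"]))
```

A `document_store` fixture could call `pytest.skip(reason)` whenever `skip_reason(request.param, selected)` returns a message, which keeps the skipped variants visible in the test report.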

Test does not use a `document_store` fixture, but still has a hard requirement for a certain document store:

Example:

@pytest.mark.elasticsearch
def test_elasticsearch_custom_fields(elasticsearch_fixture):
    client = Elasticsearch()
    client.indices.delete(index='haystack_test_custom', ignore=[404])
    document_store = ElasticsearchDocumentStore(index="haystack_test_custom", text_field="custom_text_field",
                                                embedding_field="custom_embedding_field")
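A marker such as `@pytest.mark.elasticsearch` only has an effect if something in `conftest.py` reacts to it. The following is a rough sketch of such a hook, not the logic from this PR: the set `DOC_STORE_MARKERS` and the helper `missing_stores` are assumptions, and the `--doc_store_type` option is assumed to be registered elsewhere with a default value.

```python
# Assumption: one marker per external document store, named after the store.
DOC_STORE_MARKERS = {"elasticsearch", "faiss", "milvus", "weaviate"}


def missing_stores(marker_names, selected):
    """Return the stores a test is marked for that were not selected, sorted."""
    return sorted((set(marker_names) & DOC_STORE_MARKERS) - set(selected))


def pytest_collection_modifyitems(config, items):
    import pytest  # imported here so the helper above can be used without pytest

    raw = config.getoption("--doc_store_type")
    selected = [name.strip() for name in raw.split(",") if name.strip()]
    for item in items:
        missing = missing_stores([mark.name for mark in item.iter_markers()], selected)
        if missing:
            # Skipped tests stay in the report, together with the reason.
            item.add_marker(pytest.mark.skip(reason=f"requires {', '.join(missing)}"))
```

Markers other than the document store ones (for example `parametrize`) are ignored by the helper, so unmarked tests are never skipped by this hook.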

Limitations / future work:

As of now, you still need to launch the external document stores yourself. In a separate PR we could automate this as well: we can probably get rid of the document store fixtures in conftest.py and move their launch logic into get_document_store().
We should make Weaviate fully compliant so that we can get rid of the "special tests" in test_weaviate.py and instead add it as another parametrization value in test_document_store.py (see comment).
Revisit which tests really need the option to run on the whole document store grid, and where we can reduce them to InMemoryDocumentStore to speed up the CI.
We currently have quite a lot of integration tests; more unit tests would be helpful and could speed things up.
A MockRetriever was recently introduced in test_faiss_cosine_similarity. We can check whether it can be used in more places and, in general, mock more of the expensive components.
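To illustrate the last point about mocking expensive components, here is a small stand-in for an embedding model that returns deterministic vectors without loading any weights. It is not the MockRetriever from the repository: the class `FakeEmbedder` and its `embed` method are hypothetical names, and a real test double would need to match the interface of the component it replaces.

```python
import hashlib


class FakeEmbedder:
    """Hypothetical stand-in for an expensive embedding model."""

    def __init__(self, dim=8):
        # A SHA-256 digest has 32 bytes, which bounds the vector size here.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def embed(self, texts):
        """Return one deterministic vector of length `dim` per input text."""
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dim` bytes to floats in [0, 1].
            vectors.append([byte / 255 for byte in digest[: self.dim]])
        return vectors
```

Because the vectors depend only on the text, assertions on retrieval order stay stable across runs while the test avoids downloading or running a model.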

Status (please check what you already did):

First draft (up for discussions & feedback)
Final code
Added tests
Updated documentation

tholor and others added 12 commits September 22, 2021 09:23

simplify tests for individual doc stores

f7c8dfd

WIP refactoring markers of tests

d2c4fd4

test alternative approach for tests with existing parametrization

625d4db

fix skip logic of already parametrized tests

f02f44a

fix weaviate behaviour in tests - not parametrizing it in our general test cases.

b7c6e88

Add latest docstring and tutorial changes

c442cb7

fix some tests

9b6b3e5

Merge branch 'simplify_tests' of github.com:deepset-ai/haystack into simplify_tests

81be94e

remove sql from document_store_types

535d7b8

fix markers for generator and pipeline test

a95523c

remove inmemory marker

8449517

remove unneeded elasticsearch markers

18b74ad

tholor requested a review from oryx1729 September 22, 2021 15:30

tholor added 2 commits September 23, 2021 08:52

update readme and contributing.md

ea841a4

update contributing

ce42682

tholor added the topic:tests label Sep 23, 2021

lalitpagaria reviewed Sep 23, 2021

test/conftest.py

adjust example

1b3b899

oryx1729 reviewed Sep 27, 2021

test/test_generator.py

oryx1729 approved these changes Sep 27, 2021

tholor merged commit 183fd5a into master Sep 27, 2021

tholor deleted the simplify_tests branch September 27, 2021 08:52

This was referenced Sep 28, 2021

Simplify setup of local tests & update contributor docs #1353

Closed

Fix document_store_type flag for tests with multiple fixtures #1526

Merged
