fix: warning if doc store similarity function is incompatible with Sentence Transformers model #3455

anakin87 · 2022-10-23T21:16:18Z

Related Issues

fixes Update warning for dot_product when using sentence transformers #3436

Proposed Changes:

Check that the document store uses a similarity function compatible with the Sentence Transformers embedding model (we can deduce this from the model name).
If the similarity function is incompatible, show a warning.
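
A minimal sketch of the check described above, for illustration only. It is not the code added in this PR: the function names `infer_model_similarity` and `warn_if_incompatible` are hypothetical, and the document store's similarity is assumed to be available as a plain string such as "cosine" or "dot_product".

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def infer_model_similarity(model_name: str) -> Optional[str]:
    """Return the similarity hinted at by the model name, or None if the name gives no hint."""
    name = model_name.lower()
    if "-cos-" in name:
        return "cosine"
    if "-dot-" in name:
        return "dot_product"
    return None


def warn_if_incompatible(model_name: str, store_similarity: str) -> bool:
    """Log a warning and return True only when the name clearly states a different similarity."""
    expected = infer_model_similarity(model_name)
    if expected is not None and expected != store_similarity:
        logger.warning(
            "Model '%s' suggests '%s' similarity, but the document store uses '%s'.",
            model_name,
            expected,
            store_similarity,
        )
        return True
    # No hint in the name, or the hint matches the store: stay silent.
    return False
```

Model names without "-cos-" or "-dot-" never trigger a warning in this sketch, which keeps the check conservative.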

How did you test it?

Some manual tests

Notes for the reviewer

The Sentence Transformers naming convention is a bit tricky.
If there is "-cos-" in the model name, we can safely say that the model is compatible with cosine similarity, but it might also be compatible with dot_product.
The solution I propose seems to me a good compromise, whereas the current warning is misleading.

Checklist

I have read the contributors guidelines and the code of conduct
I have updated the related issue with new insights and changes
I added tests that demonstrate the correct behavior of the change
I've used the conventional commit convention for my PR title
I documented my code
I ran pre-commit hooks and fixed any issue

ZanSara · 2022-10-24T13:52:17Z

In general this change is fine, but I wonder about the value of this check. If there's no established nomenclature for the recommended similarity metric in the model name, isn't this system prone to false positives and similar spurious warnings? I'd rather have no warning at all than one that risks misleading users. @tholor what do you think?

tholor

Thanks for working on this @anakin87 !

I think you made a pragmatic choice here of only raising the warning when the similarity metric is clearly mentioned in the model name.

@ZanSara I really like that you think about user value here. I would argue, though, that with the current implementation we won't see many false positives. The models that state -cos- or -dot- are pretty clearly using those similarity functions. For all other models (probably the majority) we won't raise any warning.
Raising the warning in those few cases is still pretty valuable: this mismatch was a big source of user errors in the past, and as a user you won't really notice it because it only fails silently (= lower performance).
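
To illustrate the silent failure, here is a small sketch with made-up two-dimensional vectors (not real model embeddings). When embeddings are not normalized, dot product and cosine similarity can rank the same documents differently, so a mismatched document store returns worse results without raising any error. The names `long_doc` and `short_doc` are hypothetical.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector norms."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


query = [1.0, 0.0]
docs = {
    "long_doc": [3.0, 4.0],   # large norm, points partly away from the query
    "short_doc": [0.9, 0.1],  # small norm, points almost exactly at the query
}


def rank(score_fn):
    """Return document names sorted from highest to lowest score."""
    return sorted(docs, key=lambda name: score_fn(query, docs[name]), reverse=True)


rank_by_dot = rank(dot)
rank_by_cos = rank(cosine)
print(rank_by_dot, rank_by_cos)
```

Here the dot product favors the vector with the larger norm, while cosine similarity favors the one best aligned with the query, so the two rankings are reversed.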

ZanSara · 2022-10-25T15:00:28Z

@tholor sounds good, thank you for the input 😊 Merging!


anakin87 added 2 commits October 23, 2022 23:03

check_docstore_similarity_function

f0407c6

remove import

3955010

anakin87 requested a review from a team as a code owner October 23, 2022 21:16

anakin87 requested review from masci and removed request for a team October 23, 2022 21:16

ZanSara added the type:feature (New feature or request), topic:modeling, type:refactor (Not necessarily visible to the users), and topic:retriever labels Oct 24, 2022

ZanSara requested review from ZanSara and removed request for masci October 24, 2022 13:52

tholor approved these changes Oct 25, 2022


ZanSara merged commit a2d459d into deepset-ai:main Oct 25, 2022

ZanSara removed their request for review October 25, 2022 15:01

anakin87 deleted the better_sentencetransformers_similarity_warning branch October 25, 2022 15:05

sjrl pushed a commit that referenced this pull request Oct 25, 2022

fix: warning if doc store similarity function is incompatible with Se…

84ca0ca

…ntence Transformers model (#3455) * check_docstore_similarity_function * remove import

ZanSara mentioned this pull request Oct 26, 2022

feat: add document_store to all BaseRetriever.retrieve() and BaseRetriever.retrieve_batch() implementations #3379

Merged

6 tasks
