Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weaviate: enable BM25 queries with filters #3553

Closed
masci opened this issue Nov 11, 2022 · 9 comments · Fixed by #3628
Closed

Weaviate: enable BM25 queries with filters #3553

masci opened this issue Nov 11, 2022 · 9 comments · Fixed by #3628
Assignees

Comments

@masci
Copy link
Contributor

masci commented Nov 11, 2022

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
When BM25 support was firstly introduced in 1.14, Weaviate didn't support passing filters along with the query and Haystack raises an exception when you call query() with both a query text and some filters.

Describe the solution you'd like
This is now supported in 1.16 and we should amend this part of the code

# Once Weaviate starts supporting filters with BM25:
# filter_dict = LogicalFilterClause.parse(filters).convert_to_weaviate()
# gql_query = weaviate.gql.get.GetBuilder(class_name=index,
# properties=properties,
# connection=self.weaviate_client) \
# .with_near_vector({'vector': [0, 0]}) \
# .with_where(filter_dict) \
# .with_limit(top_k) \
# .build()

@zoltan-fedor
Copy link
Contributor

zoltan-fedor commented Nov 29, 2022

Hi @masci,
I was the one who originally wrote the Haystack code for the BM25 Weaviate support and added that exception about not having filter support with BM25 in Weaviate.

Unfortunately the filter support with BM25 is still NOT in Weaviate today (v1.16.5), as it was confirmed by testing and the Weaviate core devs.

In July 2022 I have raised a Weaviate issue for this being added (see), but that just got re-raised by a Weaviate core dev today: weaviate/weaviate#2393

Please vote for this issue in Weaviate, so hopefully they can add the filter support for BM25 in Weaviate soon:
weaviate/weaviate#2393

Once that is added, then Haystack can be made to support that, but obv. not before that unfortunately.

@ZanSara
Copy link
Contributor

ZanSara commented Nov 30, 2022

Hey @zoltan-fedor ! Thank you for the heads-up! There must be an issue with our test suite then. I'll perform some manual tests to see what's happening.

@ZanSara ZanSara reopened this Nov 30, 2022
@zoltan-fedor
Copy link
Contributor

zoltan-fedor commented Nov 30, 2022

Hey @ZanSara,
I have already looked into that and explained at #3628 (comment)

But in summary there were two issues:

  1. This line wasn't put under the else: of the uncommented if -- which resulted in the query WITHOUT the filter being executed instead of the filtered one.
  2. Even if the filtered query would have been executed, this test would not have caught it, as it is NOT checking for the number of returned records nor whether Weaviate has returned an error.
    With the filtered query it currently returns an error from Weaviate with zero results, but that Weaviate errors returned do not result in Haystack errors being thrown, so without checking the number of results returned the error is basically silently ignored.
    One enhancement of the Weaviate integration could be to always check whether the Weaviate response contains an error and throw an exception if it does - that would have prevented this silent error issue here. I can make a PR for that if you want.

@zoltan-fedor
Copy link
Contributor

FYI - thanks to support shown by the community the Weaviate dev team has just scheduled the delivery of this functionality (filter support for BM25 search) for v1.18 - so likely around mid-to-late Q1 2023.
See weaviate/weaviate#2393 (comment)

@masci
Copy link
Contributor Author

masci commented Dec 1, 2022

@zoltan-fedor thanks a lot for reaching out with all these detailed reports! Together with @ZanSara we traced back the problem, a bit crazy how this happened with several tiny mistakes lining up perfectly to cause a bigger one.

On a positive note, it's great to see the Haystack community in action 🙌

@krlng
Copy link

krlng commented Apr 5, 2023

As 1.18 is out, I wanted to give this some activity. Would this work now @zoltan-fedor ?

@krlng
Copy link

krlng commented Apr 5, 2023

@zoltan-fedor
Copy link
Contributor

@krlng,
that's correct, this has already been solved by that PR you have linked.

@anakin87
Copy link
Member

done in #4385

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants