feat: Update searchapi format, default to Google, allow search engine selection #7453

Merged
merged 4 commits into from
Apr 3, 2024

Conversation

vblagoje
Member

@vblagoje vblagoje commented Apr 2, 2024

Why:

This PR updates the SearchApi web search component to the new request format and makes it more flexible: users can now specify their preferred search engine when running web searches through the API.

What:

  • Extended the documentation to inform users they can now specify a search engine via the search_params argument, with Google being the default.
  • Altered the payload creation process by:
    • Checking if the engine parameter is not present in search_params and setting it to "google" by default.
    • Removing the JSON serialization process for the payload variable.
    • Adjusting the way the API key is included in the request, moving it from the request headers to the payload.
    • Simplifying the request by eliminating custom headers, relying instead on default headers and passing the payload as parameters.
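The payload changes listed above can be sketched roughly as follows. This is a minimal illustration based only on the description in this PR, not the actual diff: `build_search_url` is a hypothetical helper, the endpoint URL is an assumption, and the standard library's `urlencode` stands in for whatever HTTP client the component uses.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; check the SearchApi docs for the real one.
SEARCH_URL = "https://www.searchapi.io/api/v1/search"


def build_search_url(query, api_key, search_params=None):
    """Hypothetical helper: build the GET URL the way the bullets above describe."""
    # Copy so the caller's dict is never mutated.
    params = dict(search_params or {})
    # Default to Google only when the caller did not choose an engine.
    params.setdefault("engine", "google")
    # The API key travels with the other parameters instead of a custom header.
    payload = {"q": query, "api_key": api_key, **params}
    # Plain query parameters, with no JSON serialization step.
    return f"{SEARCH_URL}?{urlencode(payload)}"


default_url = build_search_url("capital of Germany", "YOUR-API-KEY")
bing_url = build_search_url("capital of Germany", "YOUR-API-KEY", {"engine": "bing"})
print(default_url)
print(bing_url)
```

Copying `search_params` before applying the default keeps the caller's dictionary untouched, which is one way to avoid surprising side effects.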

How can it be used:

  • To specify a search engine other than Google, users can now adjust the search_params like so:
# Only SearchApiWebSearch is needed for this example.
from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

# Choose a non-default engine through search_params; omit "engine" to use Google.
web_search = SearchApiWebSearch(api_key=Secret.from_token("YOUR-API-KEY"), search_params={"engine": "bing"})
query = "What is the capital of Germany?"

response = web_search.run(query)
print(response)
  • This update maintains backward compatibility while allowing for new feature usage without additional overhead for users preferring the default settings.

How did you test it:

  • Manually tested the updated component with different search engine parameters to ensure the correct engine is utilized.
  • Conducted unit tests to verify that the default search engine is Google when no engine is specified.
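A unit test for the default engine might look something like the sketch below. It does not use Haystack or the network: `StandInWebSearch` is a hypothetical stand-in for the component, and its injected `http_get` callable is an assumption made so the request can be replaced with a `Mock`.

```python
from unittest.mock import Mock


class StandInWebSearch:
    """Hypothetical stand-in for the component; not Haystack's class."""

    def __init__(self, api_key, search_params=None, http_get=None):
        self.api_key = api_key
        self.search_params = dict(search_params or {})
        self.http_get = http_get

    def run(self, query):
        params = {"q": query, "api_key": self.api_key, **self.search_params}
        # Fall back to Google when no engine was requested.
        params.setdefault("engine", "google")
        return self.http_get(params=params)


def sent_engine(search_params=None):
    """Run one search against a mocked HTTP call and return the engine it received."""
    fake_get = Mock(return_value={"organic_results": []})
    StandInWebSearch("test-key", search_params, http_get=fake_get).run("query")
    return fake_get.call_args.kwargs["params"]["engine"]


def test_default_engine_is_google():
    assert sent_engine() == "google"


def test_explicit_engine_is_kept():
    assert sent_engine({"engine": "bing"}) == "bing"


test_default_engine_is_google()
test_explicit_engine_is_kept()
```

Against the real component, the same idea would presumably mean patching the HTTP call it makes and inspecting the parameters it was called with.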

Notes for the reviewer:

  • Review the updated documentation to ensure it clearly communicates the new feature and its default behavior.
  • Verify that the removal of custom headers does not impact the API's authentication process or response handling.

@vblagoje vblagoje requested a review from a team as a code owner April 2, 2024 13:22
@vblagoje vblagoje requested review from anakin87 and removed request for a team April 2, 2024 13:22
@github-actions github-actions bot added 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Apr 2, 2024
@vblagoje vblagoje requested a review from a team as a code owner April 2, 2024 13:23
@vblagoje vblagoje requested review from dfokina and removed request for a team April 2, 2024 13:23
@coveralls
Collaborator

coveralls commented Apr 2, 2024

Pull Request Test Coverage Report for Build 8535435967

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.002%) to 89.48%

Files with Coverage Reduction | New Missed Lines | %
components/websearch/searchapi.py | 2 | 96.36%

Totals
Change from base Build 8522427635: 0.002%
Covered Lines: 5571
Relevant Lines: 6226

💛 - Coveralls

Member

@anakin87 anakin87 left a comment

The approach looks correct.
I left some comments.

Comment on lines 104 to 105
if "engine" not in self.search_params:
    self.search_params["engine"] = "google"

I would prefer setting this in __init__. This would also ensure that this default is properly serialized.
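To illustrate the serialization point, here is a minimal sketch assuming a simplified `to_dict`/`from_dict` pair. `SerializableSearch` is a hypothetical class, and the dictionary shape is a simplification rather than Haystack's actual serialization format.

```python
class SerializableSearch:
    """Hypothetical component that applies the engine default at construction time."""

    def __init__(self, search_params=None):
        # Copy the caller's dict, then record the default so it becomes part of the state.
        self.search_params = dict(search_params or {})
        self.search_params.setdefault("engine", "google")

    def to_dict(self):
        # Simplified shape; the default engine is visible in the serialized form.
        return {"init_parameters": {"search_params": dict(self.search_params)}}

    @classmethod
    def from_dict(cls, data):
        return cls(**data["init_parameters"])


component = SerializableSearch()
serialized = component.to_dict()
restored = SerializableSearch.from_dict(serialized)
print(serialized)
```

If the default were applied only when a search runs, a component serialized before its first run would not record the engine at all.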

Comment on lines 23 to 24
See the [SearchApi website](https://www.searchapi.io/) for more details. The default search engine is Google,
however, users can change it by setting the `engine` parameter in the `search_params`.

Let's add this information to the __init__ docstring.

@Croccodoyle

Working for me now. Thanks

@vblagoje
Member Author

vblagoje commented Apr 3, 2024

Thanks for the feedback, @anakin87. I've confirmed that everything also works with your suggestions applied. Please have another look.

Member

@anakin87 anakin87 left a comment

👍

@vblagoje vblagoje merged commit d83af92 into main Apr 3, 2024
21 checks passed
@vblagoje vblagoje deleted the searchapi_fix branch April 3, 2024 08:48
Labels
2.x Related to Haystack v2.0 topic:tests type:documentation Improvements on the docs
4 participants