feat: Update searchapi format, default to Google, allow search engine selection #7453

Merged on Apr 3, 2024 (4 commits)
Changes from 3 commits
12 changes: 6 additions & 6 deletions haystack/components/websearch/searchapi.py
@@ -1,4 +1,3 @@
-import json
 from typing import Any, Dict, List, Optional, Union

 import requests
@@ -21,8 +20,6 @@ class SearchApiWebSearch:
     """
     Uses [SearchApi](https://www.searchapi.io/) to search the web for relevant documents.

-    See the [SearchApi website](https://www.searchapi.io/) for more details.
-
     Usage example:
     ```python
     from haystack.components.websearch import SearchApiWebSearch
@@ -50,12 +47,17 @@ def __init__(
         :param search_params: Additional parameters passed to the SearchApi API.
             For example, you can set 'num' to 100 to increase the number of search results.
             See the [SearchApi website](https://www.searchapi.io/) for more details.
+
+            The default search engine is Google, however, users can change it by setting the `engine`
+            parameter in the `search_params`.
         """

         self.api_key = api_key
         self.top_k = top_k
         self.allowed_domains = allowed_domains
         self.search_params = search_params or {}
+        if "engine" not in self.search_params:
+            self.search_params["engine"] = "google"

         # Ensure that the API key is resolved.
         _ = self.api_key.resolve_value()
@@ -101,10 +103,8 @@ def run(self, query: str) -> Dict[str, Union[List[Document], List[str]]]:
         :raises SearchApiError: If an error occurs while querying the SearchApi API.
         """
         query_prepend = "OR ".join(f"site:{domain} " for domain in self.allowed_domains) if self.allowed_domains else ""
-
-        payload = json.dumps({"q": query_prepend + " " + query, **self.search_params})
+        payload = {"q": query_prepend + " " + query, **self.search_params}
         headers = {"Authorization": f"Bearer {self.api_key.resolve_value()}", "X-SearchApi-Source": "Haystack"}
-
         try:
             response = requests.get(SEARCHAPI_BASE_URL, headers=headers, params=payload, timeout=90)
             response.raise_for_status()  # Will raise an HTTPError for bad responses
@@ -0,0 +1,5 @@
+---
+fixes:
+  - |
+    Updated the SearchApiWebSearch component with new search format and allowed users to specify the search engine via the `engine`
+    parameter in `search_params`. The default search engine is Google, making it easier for users to tailor their web searches.
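
For readers who want to try the parameter handling outside Haystack, here is a minimal sketch of what the diff changes. `build_search_params` is a hypothetical helper, not part of Haystack or SearchApi. It reproduces the `engine` default and the `q` construction from the diff, and uses the standard library's `urlencode` only to illustrate how a dict payload turns into query-string parameters. The `bing` engine value is an illustrative assumption; check the SearchApi documentation for the engines it accepts.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode


def build_search_params(
    query: str,
    search_params: Optional[Dict[str, Any]] = None,
    allowed_domains: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper that mirrors the parameter handling shown in the diff."""
    # Copy the caller's dict so that applying the default does not mutate it
    # (a choice made in this sketch, not something the component does).
    params = dict(search_params or {})
    # Same effect as the `if "engine" not in ...` check in the diff.
    params.setdefault("engine", "google")
    # Same site-filter prefix as in `run`: "site:a OR site:b " followed by the query.
    query_prepend = "OR ".join(f"site:{domain} " for domain in allowed_domains) if allowed_domains else ""
    # Plain dict rather than a JSON string, so each entry can become its own query parameter.
    return {"q": query_prepend + " " + query, **params}


# No engine given: the default is filled in.
default_params = build_search_params("haystack")

# Engine given explicitly: the caller's choice is kept ("bing" is an assumed value).
custom_params = build_search_params("haystack", {"engine": "bing", "num": 5})

print(default_params["engine"], custom_params["engine"])
print(urlencode(custom_params))
```

Copying `search_params` before applying the default is specific to this sketch. The component in the diff stores the dictionary it receives (or a new empty one) and sets `engine` on that object.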