SearchApiWebSearch in websearch is not passing query correctly as per v2 example documentation. https://docs.haystack.deepset.ai/docs/searchapiwebsearch #7447

Closed
1 task
Croccodoyle opened this issue Apr 1, 2024 · 7 comments · Fixed by #7453
Assignees
Labels
2.x Related to Haystack v2.0 P1 High priority, add to the next sprint

Comments

@Croccodoyle

Describe the bug
I am running the code from the example URL and am hitting an error that seems to indicate that the SearchApiWebSearch call is being passed the query formatted as JSON, while the API expects plain text. The example code is provided here: https://docs.haystack.deepset.ai/reference/websearch-api

Error message
Traceback (most recent call last):
File "/Users/xxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/haystack/components/websearch/searchapi.py", line 110, in run
response.raise_for_status() # Will raise an HTTPError for bad responses
File "/Users/xxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://www.searchapi.io/api/v1/search?%7B%22q%22:%20%22%20What%20is%20the%20most%20famous%20landmark%20in%20Berlin?%22%7D

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/xxxxxxxx/Scratch/Streamlit/websearch.py", line 40, in <module>
pipe.run(data={"search":{"query": query_text}, "prompt_builder":{"query": query_text}})
File "/Users/xxxxxxxxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/haystack/core/pipeline/pipeline.py", line 823, in run
res = comp.run(**last_inputs[name])
File "/Users/xxxxxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/haystack/components/websearch/searchapi.py", line 115, in run
raise SearchApiError(f"An error occurred while querying {self.__class__.__name__}. Error: {e}") from e
haystack.components.websearch.searchapi.SearchApiError: An error occurred while querying SearchApiWebSearch. Error: 400 Client Error: Bad Request for url: https://www.searchapi.io/api/v1/search?%7B%22q%22:%20%22%20What%20is%20the%20most%20famous%20landmark%20in%20Berlin?%22%7D

Expected behavior

Expected the correct response to be returned from the web search API.


Additional context
From ChatGPT:
The error you're encountering is a clear indication that there's a formatting issue with how the query is being passed to the SearchApiWebSearch component in Haystack. The URL suggests that the query parameters are not being properly formatted for a standard HTTP GET request. The API is receiving the query wrapped in URL-encoded JSON (%7B and %7D are the encoded braces { and }), which is likely not supported by the searchapi.io API for query parameters.

The root cause seems to be in how the search component of your pipeline or the SearchApiWebSearch component itself formats the outgoing HTTP request. The searchapi.io service expects query parameters to be in a standard key=value format, appended to the query string of the URL, not as a JSON object.
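To make the difference concrete, here is a minimal sketch that builds both URL shapes with the Python standard library only. It does not reproduce the component's internals, which are not shown in this thread; it assumes only that a JSON string ended up as the raw query string. The `safe=':?'` argument is chosen to mimic the characters left unescaped in the URL from the traceback, and any API key handling is omitted.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://www.searchapi.io/api/v1/search"
query = "What is the capital of Germany?"

# Failure mode (assumed): the parameters are serialized to a JSON string
# and that string is used verbatim as the URL query string.
json_payload = json.dumps({"q": query})
json_style_url = f"{BASE_URL}?{quote(json_payload, safe=':?')}"

# Conventional form: each parameter is encoded as key=value.
key_value_url = f"{BASE_URL}?{urlencode({'q': query})}"

print(json_style_url)
print(key_value_url)
```

The first URL contains the encoded braces and quotes (%7B, %22, %7D) seen in the error message, while the second carries an ordinary `q=` parameter.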

To Reproduce
Run the example using haystack-ai.

FAQ Check

System:

  • OS: Mac Sonoma 14.3.1
  • GPU/CPU: M1 Pro
  • Haystack version (commit or version number): 2.0.0
  • DocumentStore: N/A
  • Reader: N/A
  • Retriever: N/A
@julian-risch julian-risch added P1 High priority, add to the next sprint 2.x Related to Haystack v2.0 labels Apr 2, 2024
@vblagoje
Member

vblagoje commented Apr 2, 2024

@Croccodoyle this must have been a transient error. SerperDev is a Google search engine bridge, and it can become unavailable. I've just tried this example and it works fine. Also, make sure to use a valid API key.

@vblagoje vblagoje closed this as completed Apr 2, 2024
@Croccodoyle
Author

Croccodoyle commented Apr 2, 2024

Did you use haystack-ai? The error is definitely there. My code:

from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

web_search = SearchApiWebSearch(api_key=Secret.from_token("xxxxxxxxxxxxxx"))
query = "What is the capital of Germany?"

response = web_search.run(query)

ERROR:

Traceback (most recent call last):
File "/Users/xxxxxxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/haystack/components/websearch/searchapi.py", line 110, in run
response.raise_for_status() # Will raise an HTTPError for bad responses
File "/Users/xxxxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://www.searchapi.io/api/v1/search?%7B%22q%22:%20%22%20What%20is%20the%20capital%20of%20Germany?%22%7D

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/xxxxxx/Scratch/Streamlit/websearch.py", line 18, in <module>
response = web_search.run(query)
File "/Users/xxxxxxxx/Scratch/Streamlit/haystack_env/lib/python3.9/site-packages/haystack/components/websearch/searchapi.py", line 115, in run
raise SearchApiError(f"An error occurred while querying {self.__class__.__name__}. Error: {e}") from e
haystack.components.websearch.searchapi.SearchApiError: An error occurred while querying SearchApiWebSearch. Error: 400 Client Error: Bad Request for url: https://www.searchapi.io/api/v1/search?%7B%22q%22:%20%22%20What%20is%20the%20capital%20of%20Germany?%22%7D
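The URL in this traceback can be decoded to see what was actually sent. The sketch below uses only the Python standard library and the URL copied from the error above. It shows that the query string is a percent-encoded JSON object, and that a standard key=value parser finds no `q` parameter in it. How searchapi.io itself parses the request is not shown in this thread, so the last step is only an approximation of the server's view.

```python
import json
from urllib.parse import parse_qs, unquote, urlsplit

# URL copied from the traceback above.
failing_url = (
    "https://www.searchapi.io/api/v1/search"
    "?%7B%22q%22:%20%22%20What%20is%20the%20capital%20of%20Germany?%22%7D"
)

# Everything after the first "?" is the raw query string.
raw_query = urlsplit(failing_url).query

# Percent-decoding reveals a JSON object rather than key=value pairs.
decoded = unquote(raw_query)
payload = json.loads(decoded)

# A standard query-string parser finds no parameters at all.
params = parse_qs(raw_query)

print(decoded)
print(payload)
print(params)
```

The decoded payload also shows a leading space before the question text, which matches the extra %20 after the opening quote in the URL.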

@vblagoje vblagoje reopened this Apr 2, 2024
@vblagoje
Member

vblagoje commented Apr 2, 2024

You are right, @Croccodoyle. Upon closer inspection, something is off with SerpAPI; it seems they changed the endpoint for their service. I was checking SerperDev by mistake. Looking into this one now 🙏

@Croccodoyle
Author

Thank you, much appreciated

@vblagoje
Member

vblagoje commented Apr 2, 2024

OK @Croccodoyle, searchapi.io has changed the format of their programmatic queries. Watch this issue for upcoming fixes.

@vblagoje
Member

vblagoje commented Apr 2, 2024

@Croccodoyle if you are eager to try out the fix, try it from #7453 and see if it works for you.

@Croccodoyle
Author

Confirmed working and many thanks!!!

3 participants