
Dense Passage Retrieval (ConnectionTimeoutError) #644

Closed
Freemanlabs opened this issue Dec 1, 2020 · 17 comments

Labels
type:bug Something isn't working

Comments

@Freemanlabs

With the same settings for Elasticsearch, I can successfully retrieve prediction answers with BM25 and TFIDF. However, when I try with DPR, I get a ConnectionTimeoutError.

How do I resolve this?

@Freemanlabs Freemanlabs added the type:bug Something isn't working label Dec 1, 2020
@Freemanlabs
Author

First I get RequestError: 400...; if I try it again, I then see ConnectionTimeoutError.

@tholor
Member

tholor commented Dec 2, 2020

Please try increasing the timeout for Elasticsearch via:

es_store = ElasticsearchDocumentStore(..., timeout=3000)

Are you running this on Colab? Elasticsearch might be very slow there as Colab only provides a single CPU core...
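For reference, a minimal sketch of what that call could look like (host, port, and index here are just the documented defaults, not values specific to this issue):

```python
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore

# Sketch: raise the client-side read timeout from the default of 30s.
es_store = ElasticsearchDocumentStore(
    host="localhost",   # placeholder -- use your own host
    port=9200,
    index="document",
    timeout=3000,
)
```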

@Freemanlabs
Author

I see. I will try that. Although my runtime type is GPU, meaning I am using a GPU on Colab...

Also, do you have any idea why I get this RequestError: 400... initially?

@tholor
Member

tholor commented Dec 2, 2020

Although my runtime type is GPU

Elasticsearch cannot benefit from a GPU and will always run on CPU only.

do you have any idea why I get this RequestError: 400...

Can you please provide more context / a script to reproduce this error? My first intuition would be that Elasticsearch is probably still starting, or still busy indexing the added documents/embeddings.
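One quick way to rule that out, as a sketch (assuming Elasticsearch runs locally on the default port, and reusing es_store from the snippet above):

```python
import requests

# Confirm Elasticsearch is up and the cluster is ready before querying.
health = requests.get("http://localhost:9200/_cluster/health").json()
print(health["status"])  # "green" or "yellow" once the cluster is ready

# Confirm indexing has finished: the count should match your corpus size.
print(es_store.get_document_count())
```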

@Freemanlabs
Author

This is the line causing issues:
prediction = finder.get_answers(question=que["question"], top_k_retriever=10, top_k_reader=5)

This is the error I get when I run the above line for the first time after starting and connecting to my server:
RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')

Then after I get this error, I simply run the cell again, and I get this:
ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))

I did as you directed (timeout=3000); still no luck.

@lalitpagaria
Contributor

There is a possibility that any one of the following may cause it (elastic/elasticsearch#8084):

  • Low timeout
  • Low disk space (a quick check is sketched below)
  • Two fields with the same name
  • Querying with null in a filter (this is unlikely, as the call to get_answers doesn't have any filter parameter)

Could you please share end-to-end logs so we can debug further?
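For the disk-space item, a quick check could look like this (a sketch against Elasticsearch's standard cat API, assuming a local instance on port 9200):

```python
import requests

# Elasticsearch starts refusing work once disk usage crosses its
# watermarks; this prints per-node disk allocation.
print(requests.get("http://localhost:9200/_cat/allocation?v").text)
```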

@Freemanlabs
Author

  • My timeout was increased from 30 (the default) to 3000.
  • I currently have 37/68 showing in my Colab environment.
  • Two fields with the same name: I don't know how to debug this, and I don't see how it applies to my problem.

Logs: this is all I see (from Colab):

```
timeout Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
420 # Otherwise it looks like a bug in the code.
--> 421 six.raise_from(e, None)
422 except (SocketTimeout, BaseSSLError, SocketError) as e:

21 frames
timeout: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError Traceback (most recent call last)
ReadTimeoutError: HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300)

During handling of the above exception, another exception occurred:

ConnectionTimeout Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py in perform_request(self, method, url, params, body, timeout, ignore, headers)
255 raise SSLError("N/A", str(e), e)
256 if isinstance(e, ReadTimeoutError):
--> 257 raise ConnectionTimeout("TIMEOUT", str(e), e)
258 raise ConnectionError("N/A", str(e), e)
259

ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))
```

@lalitpagaria
Contributor

@Freemanlabs Thank you for the update.
Could you please expand those 21 frames and share the full stack trace?

ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))

Also, in some code path your timeout is still 300, not 3000.

@lalitpagaria
Contributor

Also be aware that Google Colab has a disk limitation of 108 GB, of which only about 75 GB is available to the user.
https://neptune.ai/blog/google-colab-dealing-with-files#:~:text=Also%2C%20Colab%20has%20a%20disk,like%20image%20or%20video%20data.
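To check the remaining space from inside the notebook, a standard-library sketch:

```python
import shutil

# Report total and free disk space on the Colab filesystem.
total, used, free = shutil.disk_usage("/")
print(f"free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")
```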

@Freemanlabs
Author

Could you please expand those 21 frames and share the full stack trace?

Expanded frames:

```
/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py in raise_from(value, from_value)

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
415 try:
--> 416 httplib_response = conn.getresponse()
417 except BaseException as e:

/usr/lib/python3.6/http/client.py in getresponse(self)
1372 try:
-> 1373 response.begin()
1374 except ConnectionError:

/usr/lib/python3.6/http/client.py in begin(self)
310 while True:
--> 311 version, status, reason = self._read_status()
312 if status != CONTINUE:

/usr/lib/python3.6/http/client.py in _read_status(self)
271 def _read_status(self):
--> 272 line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
273 if len(line) > _MAXLINE:

/usr/lib/python3.6/socket.py in readinto(self, b)
585 try:
--> 586 return self._sock.recv_into(b)
587 except timeout:

timeout: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py in perform_request(self, method, url, params, body, timeout, ignore, headers)
245 response = self.pool.urlopen(
--> 246 method, url, body, retries=Retry(False), headers=request_headers, **kw
247 )

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
719 retries = retries.increment(
--> 720 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
721 )

/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
375 # Disabled, indicate to re-raise the error.
--> 376 raise six.reraise(type(error), error, _stacktrace)
377

/usr/local/lib/python3.6/dist-packages/urllib3/packages/six.py in reraise(tp, value, tb)
734 raise value.with_traceback(tb)
--> 735 raise value
736 finally:

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
671 headers=headers,
--> 672 chunked=chunked,
673 )

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
422 except (SocketTimeout, BaseSSLError, SocketError) as e:
--> 423 self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
424 raise

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _raise_timeout(self, err, url, timeout_value)
330 raise ReadTimeoutError(
--> 331 self, url, "Read timed out. (read timeout=%s)" % timeout_value
332 )

ReadTimeoutError: HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300)

During handling of the above exception, another exception occurred:

ConnectionTimeout Traceback (most recent call last)
in ()
----> 1 prediction = dpr_finder.get_answers(question="who is who?", top_k_retriever=10, top_k_reader=5)

/usr/local/lib/python3.6/dist-packages/haystack/finder.py in get_answers(self, question, top_k_reader, top_k_retriever, filters, index)
67
68 # 1) Apply retriever(with optional filters) to get fast candidate documents
---> 69 documents = self.retriever.retrieve(question, filters=filters, top_k=top_k_retriever, index=index)
70 logger.info(f"Got {len(documents)} candidates from retriever")
71 logger.debug(f"Retrieved document IDs: {[doc.id for doc in documents]}")

/usr/local/lib/python3.6/dist-packages/haystack/retriever/dense.py in retrieve(self, query, filters, top_k, index)
140 index = self.document_store.index
141 query_emb = self.embed_queries(texts=[query])
--> 142 documents = self.document_store.query_by_embedding(query_emb=query_emb[0], top_k=top_k, filters=filters, index=index)
143 return documents
144

/usr/local/lib/python3.6/dist-packages/haystack/document_store/elasticsearch.py in query_by_embedding(self, query_emb, filters, top_k, index, return_embedding)
572
573 logger.debug(f"Retriever query: {body}")
--> 574 result = self.client.search(index=index, body=body, request_timeout=300)["hits"]["hits"]
575
576 documents = [self._convert_es_hit_to_document(hit, adapt_score_for_embedding=True) for hit in result]

/usr/local/lib/python3.6/dist-packages/elasticsearch/client/utils.py in _wrapped(*args, **kwargs)
150 if p in kwargs:
151 params[p] = kwargs.pop(p)
--> 152 return func(*args, params=params, headers=headers, **kwargs)
153
154 return _wrapped

/usr/local/lib/python3.6/dist-packages/elasticsearch/client/__init__.py in search(self, body, index, doc_type, params, headers)
1661 params=params,
1662 headers=headers,
-> 1663 body=body,
1664 )
1665

/usr/local/lib/python3.6/dist-packages/elasticsearch/transport.py in perform_request(self, method, url, headers, params, body)
390 raise e
391 else:
--> 392 raise e
393
394 else:

/usr/local/lib/python3.6/dist-packages/elasticsearch/transport.py in perform_request(self, method, url, headers, params, body)
363 headers=headers,
364 ignore=ignore,
--> 365 timeout=timeout,
366 )
367

/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py in perform_request(self, method, url, params, body, timeout, ignore, headers)
255 raise SSLError("N/A", str(e), e)
256 if isinstance(e, ReadTimeoutError):
--> 257 raise ConnectionTimeout("TIMEOUT", str(e), e)
258 raise ConnectionError("N/A", str(e), e)
259

ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=300))
```

Also, in some code path your timeout is still 300, not 3000.

I don't understand why either.
From the documentation, timeout=30 is the default, so I do not know where the 300 comes from:

```python
__init__(host: str = "localhost", port: int = 9200, username: str = "", password: str = "",
         index: str = "document", label_index: str = "label",
         search_fields: Union[str, list] = "text", text_field: str = "text",
         name_field: str = "name", embedding_field: str = "embedding", embedding_dim: int = 768,
         custom_mapping: Optional[dict] = None, excluded_meta_data: Optional[list] = None,
         faq_question_field: Optional[str] = None, analyzer: str = "standard",
         scheme: str = "http", ca_certs: bool = False, verify_certs: bool = True,
         create_index: bool = True, update_existing_documents: bool = False,
         refresh_type: str = "wait_for", similarity="dot_product", timeout=30,
         return_embedding: Optional[bool] = True)
```

@Freemanlabs
Author

Another question I have is: why is DPR giving issues when BM25 and TFIDF work okay?

@lalitpagaria
Contributor

Well, BM25 and TFIDF do not use a script_score query, but DPR needs query_by_embedding, which uses the similarity function to compare document vectors.
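For illustration, the embedding query is roughly a script_score request of this shape (a sketch only; the exact script Haystack generates may differ, and query_emb is a placeholder for the real DPR query embedding):

```python
query_emb = [0.1] * 768  # placeholder for the 768-dim DPR query embedding

body = {
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                # dotProduct / cosineSimilarity are built-in Painless functions
                # for dense_vector fields in Elasticsearch 7.x; the offset keeps
                # the score non-negative, as script_score requires.
                "source": "dotProduct(params.query_vector, 'embedding') + 1000",
                "params": {"query_vector": query_emb},
            },
        }
    },
    "size": 10,  # top_k
}
```

Scoring every document against the query vector like this is much heavier than an inverted-index BM25 lookup, which is why only DPR hits the timeout.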

Can you manually increase the timeout at this place https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574 and test with that code?
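Concretely, the hardcoded request_timeout in query_by_embedding (the call is visible in the traceback above) would change along these lines:

```python
# haystack/document_store/elasticsearch.py, inside query_by_embedding()
# before (hardcoded, as seen in the traceback):
result = self.client.search(index=index, body=body, request_timeout=300)["hits"]["hits"]

# after (sketch): raise the per-request timeout
result = self.client.search(index=index, body=body, request_timeout=3000)["hits"]["hits"]
```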

If increasing the timeout also does not solve your issue, then you can try DPR with the FAISS document store.
Otherwise, try to use a custom plugin, but in order to use it you would need to make a few changes to elasticsearch.py in Haystack.
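A minimal FAISS flow might look like this (a sketch; dicts is a placeholder for your own documents, and the write/update steps must happen before retrieval):

```python
from haystack.document_store.faiss import FAISSDocumentStore
from haystack.retriever.dense import DensePassageRetriever

# Placeholder documents -- substitute your own corpus.
dicts = [{"text": "Some passage text.", "meta": {"name": "doc1"}}]

document_store = FAISSDocumentStore()
document_store.write_documents(dicts)

retriever = DensePassageRetriever(document_store=document_store)

# Without this step the store holds documents but no embeddings,
# so retrieval returns an empty list.
document_store.update_embeddings(retriever)

docs = retriever.retrieve(query="who is who?", top_k=10)
```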

@tholor I think we need to add timeout customisation at the following place as well, instead of hardcoding it:
https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574

@Freemanlabs
Author

Thank you for your response...

Can you manually increase the timeout at this place https://github.com/deepset-ai/haystack/blob/master/haystack/document_store/elasticsearch.py#L574 and test with that code?

I do not know how to locate this file to change it manually. Please assist.

@Freemanlabs
Author

If increasing the timeout also does not solve your issue, then you can try DPR with the FAISS document store.

I am trying this FAISSDocumentStore. If I do faiss_document_store.get_document_count(), I see 874 (which is the total number of documents I have). But when I pass it to the retriever like so, dpr_retriever = DensePassageRetriever(document_store=faiss_document_store), and try to get answers from the finder, I get this INFO:

12/03/2020 08:26:05 - INFO - haystack.finder - Got 0 candidates from retriever
12/03/2020 08:26:05 - INFO - haystack.finder - Retriever did not return any documents. Skipping reader ...

When I inspect the retriever like so:
dpr_retriever.retrieve(query="I would expect the remaining TFC protection to remain protected in both")

all I see is an empty array:

Creating Embeddings: 100%|██████████| 1/1 [00:00<00:00, 4.84 Batches/s] []

Too many frustrations as a first-time user of Haystack, I must say. My supervisor feels I am not doing anything, because what he should be getting is results, not issues.

@tholor
Member

tholor commented Dec 3, 2020

Sorry to hear that you're having trouble. From what I see, the main issues were around your usage on Colab (mounting problem, connection timeout, ...). We will do our best to simplify the experience there, but as mentioned above, Colab has some severe resource limitations when running heavy external services like Elasticsearch or FAISS.

All I see is an empty array

Did you call document_store.update_embeddings(retriever) as described in this tutorial?

You can also jump on a quick call with one of our engineers if you need further help or share your colab notebook here.

@Freemanlabs
Author

Thanks for pointing out that resource. I had just been following the Haystack documentation.
All seems well now.

@tholor
Member

tholor commented Dec 7, 2020

Great! Closing this now. Feel free to re-open if the problem comes up again...

@tholor tholor closed this as completed Dec 7, 2020