Could you help resolve this error from finder while inferencing (trying out your tutorial six) #218

Closed
usaraj opened this issue Jul 10, 2020 · 3 comments

usaraj commented Jul 10, 2020

Hi! I am a newbie trying out your tutorial 6 with a dense retriever and a custom fine-tuned roberta2-squad2 reader model.
When I first installed it from git via pip two days ago, it was working fine.
However, since this morning I have been getting an error during prediction. Here is the traceback:

TypeError Traceback (most recent call last)
in ()
2 q = input('Question: ')
3 #prediction = finder.get_answers_via_similar_questions(question=q, top_k_retriever=5)
----> 4 prediction = finder.get_answers(question=q, top_k_retriever=10, top_k_reader=5)
5 print('query: {}'.format(q))
6 print_answers(prediction, details="minimal")

/content/drive/My Drive/haystack/haystack/reader/farm.py in predict(self, question, documents, top_k)
243 # get answers from QA model
244 predictions = self.inferencer.inference_from_dicts(
--> 245 dicts=input_dicts, return_json=True, multiprocessing_chunksize=1
246 )
247 # assemble answers from all the different documents & format them.

TypeError: inference_from_dicts() got an unexpected keyword argument 'return_json'
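
An "unexpected keyword argument" error like this one means the installed function does not declare the keyword that the caller passes. One way to confirm that on a given install is to inspect the function's signature. The sketch below is only illustrative: `accepts_keyword` is a hypothetical helper, and `stand_in_inference` is a made-up stand-in rather than FARM's real method, so the real `inference_from_dicts` would need to be passed in instead.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func's signature would accept `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in, NOT FARM's real signature; replace with the real method.
def stand_in_inference(dicts, multiprocessing_chunksize=None):
    return dicts


if __name__ == "__main__":
    for keyword in ("return_json", "multiprocessing_chunksize"):
        print(keyword, accepts_keyword(stand_in_inference, keyword))
```

If the check comes back False for `return_json` on the real method, the installed FARM does not match what the installed Haystack expects.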

Any help in resolving this issue would be greatly appreciated.

@brandenchan
Contributor

Hi usaraj, that error makes me think that you might be using a later version of Haystack but an earlier and incompatible version of FARM. Could you check which versions you have?
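
A minimal sketch for checking the installed versions from inside the notebook. It assumes Python 3.8+ (for `importlib.metadata`) and assumes the packages are registered under the distribution names `farm-haystack` and `farm`. Those names are assumptions, and an install from a git clone may register under a different name, in which case it will show up as not installed. The helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names below are assumptions; adjust them to match your install.
    for name, ver in installed_versions(["farm-haystack", "farm", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```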

If this is the case, you can solve this by installing the latest master version of Haystack from GitHub (which will in turn install the right version of FARM). You can do this with something like:
git clone https://github.com/deepset-ai/haystack.git
cd haystack
pip install -e .

Let me know how this goes!

Author

usaraj commented Jul 13, 2020

Hi brandenchan, thanks for your response. I think that when I first installed it, it was from a specific branch of haystack given in the tutorial. I am using Colab and have to reinstall it every time I restart the notebook. I probably should have frozen the branch that I originally worked on.

In Colab, another issue I am facing is that the version of torch required by FARM uninstalls the preinstalled version, and Colab then complains that the runtime has changed from GPU to standard. Right now I am reinstalling torch 1.5.0+cu101 to get the GPU to work.
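
After torch gets swapped out like this, it can help to confirm what the runtime actually sees before loading a model. This is a minimal sketch, and the helper name `describe_torch_runtime` is hypothetical. It reports the torch version, the CUDA version the build was compiled against (None for CPU-only builds), and whether a GPU is usable, and it degrades gracefully if torch is not importable.

```python
def describe_torch_runtime():
    """Summarize the torch install; returns {'installed': False} if torch is missing."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_runtime().items():
        print(f"{key}: {value}")
```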

I tried your suggestion. After pip install, I get the following error when I import the reader. Importing the finder fails too. Here is the traceback:

from haystack.reader.farm import FARMReader


ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 from haystack.reader.farm import FARMReader

ModuleNotFoundError: No module named 'haystack.reader'

Author

usaraj commented Jul 13, 2020

Hi brandenchan, the error in my earlier post occurred when I cloned haystack to my Google Drive and, from within Colab, installed it from the haystack directory on the mounted drive.

When that did not work, I followed your suggestion and installed haystack directly into the Colab content directory (a non-persistent environment). With that setup I was able to train a model and save it.

But when I use the saved model to evaluate directly on a small SQuAD-formatted dataset file, I get a KeyError.

Here are the relevant commands and the traceback from Colab:

# Evaluation of Reader can also be done directly on a SQuAD-formatted file

reader_eval_results = new_reader.eval_on_file("/content/drive/My Drive/data", "cdqa-v9.json", device=device)
print("Reader Top-N-Recall:", reader_eval_results["top_n_recall"])
print("Reader Exact Match:", reader_eval_results["EM"])
print("Reader F1-Score:", reader_eval_results["f1"])

Preprocessing Dataset /content/drive/My Drive/data/cdqa-v9.json: 100%|██████████| 18/18 [00:01<00:00, 16.38 Dicts/s]
Evaluating: 100%|██████████| 3/3 [00:04<00:00, 1.44s/it]

KeyError Traceback (most recent call last)
in ()
4 # Evaluation of Reader can also be done directly on a SQuAD-formatted file
5 # without passing the data to Elasticsearch
----> 6 reader_eval_results = new_reader.eval_on_file("/content/drive/My Drive/data", "cdqa-v9.json", device=device)
7
8 print(reader_eval_results)

/content/haystack/haystack/reader/farm.py in eval_on_file(self, data_dir, test_filename, device)
330 "EM": eval_results[0]["EM"],
331 "f1": eval_results[0]["f1"],
--> 332 "top_n_recall": eval_results[0]["top_n_recall"]
333 }
334 return results

KeyError: 'top_n_recall'

==========================
I was able to train and evaluate custom models until I started having issues, probably caused by updated code in haystack / FARM. For your information, the FARM version is 0.4.5 and all models are as per your requirements.txt.
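
Since the traceback shows `eval_results[0]` being indexed by metric name, one way to narrow this down is to check which keys the result actually contains before indexing it. This is a minimal sketch that uses a made-up dict in place of the real `eval_results[0]`. The sample values and the helper name `pick_metrics` are hypothetical, not output from the run above.

```python
def pick_metrics(result, wanted=("EM", "f1", "top_n_recall")):
    """Split the wanted metric names into those present in result and those missing."""
    present = {key: result[key] for key in wanted if key in result}
    missing = [key for key in wanted if key not in result]
    return present, missing


if __name__ == "__main__":
    # Made-up placeholder standing in for eval_results[0]; not real evaluation output.
    sample_result = {"EM": 0.5, "f1": 0.6}
    present, missing = pick_metrics(sample_result)
    print("present:", present)
    print("missing:", missing)
    print("all keys:", sorted(sample_result))
```

Printing all keys of the real result would show whether the metric is absent or simply reported under a different name by the installed FARM version.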

Your help is greatly appreciated.

usaraj closed this as completed Jul 14, 2020