Farm top_k reader error #34

Closed
aadil-srivastava01 opened this issue Mar 2, 2020 · 2 comments

@aadil-srivastava01
Contributor

When using finder.get_answers() with the TF-IDF retriever and the FARM reader, the top_k_reader parameter has no effect on the number of answers returned. It always returns the top 3 answers, because FarmReader's top_k_per_candidate is set to 3 by default. So to get n answers, we have to initialize FarmReader with top_k_per_candidate set to n.

@tholor
Member

tholor commented Mar 2, 2020

Hey @aadil-srivastava01,

We made some changes to these parameters last week. Can you please check how many candidate docs your retriever returns? If you have, e.g., only 1 candidate here, that would explain the observed behavior. Otherwise, I am happy to dig deeper.

To give you some background: there are a couple of different aggregation steps involved. This is how it should work:

  • top_k_per_sample: the model creates this many predictions for one text passage that it can process at once (bounded by the max_sequence_length of the model, e.g. 256 tokens)
  • top_k_per_candidate: the predictions per sample are aggregated to this many answers per "candidate" (i.e. for one document that the retriever sends to the reader, which, depending on its length, consists of one or more samples)
  • top_k_reader: the predictions per candidate are aggregated again to yield the final answers that are returned by the Finder

This means you can return at most min(top_k_per_candidate * candidates, top_k_reader) answers, where candidates is the number of documents the retriever passes to the reader.
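The three aggregation steps above can be sketched in plain Python. This is only an illustration of the described behavior, not Haystack's or FARM's actual code: the `aggregate` and `top_k` helpers and the toy `(answer, score)` data are hypothetical, and only the three parameter names come from this thread.

```python
from typing import List, Tuple

Answer = Tuple[str, float]  # (answer text, score); a made-up toy representation


def top_k(answers: List[Answer], k: int) -> List[Answer]:
    """Keep the k highest-scoring answers."""
    return sorted(answers, key=lambda a: a[1], reverse=True)[:k]


def aggregate(
    candidates: List[List[List[Answer]]],
    top_k_per_sample: int,
    top_k_per_candidate: int,
    top_k_reader: int,
) -> List[Answer]:
    """candidates -> samples -> raw predictions, reduced in three stages."""
    per_candidate: List[Answer] = []
    for samples in candidates:
        sample_preds: List[Answer] = []
        for preds in samples:
            # Stage 1: keep top_k_per_sample predictions for each passage.
            sample_preds.extend(top_k(preds, top_k_per_sample))
        # Stage 2: reduce one document's predictions to top_k_per_candidate.
        per_candidate.extend(top_k(sample_preds, top_k_per_candidate))
    # Stage 3: reduce across all documents to top_k_reader final answers.
    return top_k(per_candidate, top_k_reader)


# One candidate document split into two samples with four predictions each.
one_doc = [[
    [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6)],
    [("e", 0.5), ("f", 0.4), ("g", 0.3), ("h", 0.2)],
]]

answers = aggregate(one_doc, top_k_per_sample=4, top_k_per_candidate=3, top_k_reader=5)
print(len(answers))  # 3: capped by top_k_per_candidate * 1 candidate
```

With a single candidate, asking for 5 answers still yields 3, which matches the behavior reported above. Passing two candidates (`one_doc * 2`) with the same settings yields 5.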

In any case, we should try to make this more user-friendly in the future, either with an info message or with some automated parameter adjustments for these special cases.
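As a rough sketch of what such an info message or automatic adjustment could look like, the bound above can be checked before the reader runs. Both helper functions below are hypothetical and not part of the library; they only apply the min(...) formula from this comment.

```python
import math
import warnings


def required_top_k_per_candidate(top_k_reader: int, n_candidates: int) -> int:
    """Smallest top_k_per_candidate that lets top_k_reader answers through."""
    if n_candidates < 1:
        raise ValueError("need at least one candidate document")
    return math.ceil(top_k_reader / n_candidates)


def reachable_answers(top_k_reader: int, top_k_per_candidate: int, n_candidates: int) -> int:
    """Upper bound on returned answers; warns when it is below top_k_reader."""
    reachable = min(top_k_per_candidate * n_candidates, top_k_reader)
    if reachable < top_k_reader:
        needed = required_top_k_per_candidate(top_k_reader, n_candidates)
        warnings.warn(
            f"top_k_reader={top_k_reader} cannot be reached with "
            f"top_k_per_candidate={top_k_per_candidate} and {n_candidates} "
            f"candidate(s); at most {reachable} answers will be returned. "
            f"Consider top_k_per_candidate >= {needed}."
        )
    return reachable


# The case from this issue: 5 answers requested, default of 3 per candidate,
# and a single candidate document from the retriever.
print(reachable_answers(top_k_reader=5, top_k_per_candidate=3, n_candidates=1))
```

The check only covers the candidate-level bound. The number of samples per document and top_k_per_sample can lower the real count further.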

tholor self-assigned this Mar 2, 2020
@aadil-srivastava01
Contributor Author

aadil-srivastava01 commented Mar 5, 2020

[Attached images: farm_issue2, farm_issue3]

Please refer to the highlighted sections in the attached images. I think the problem is still there. Only when you initialize the farm reader with the desired number of answers (the top_k_reader value) does it return that many answers. Otherwise it returns the default of 3 from its initialization, so the top_k_reader argument passed to finder.get_answers has no effect on the output.

masci pushed a commit that referenced this issue Nov 27, 2023
Handle components with partial input