Training of FARMReader uses too many and potentially wrong no answer labels due to bug in SquadProcessor #2771

Closed
mathislucka opened this issue Jul 6, 2022 · 3 comments
Labels
P3 (Low priority, leave it in the backlog) · topic:reader · topic:train · type:documentation (Improvements on the docs) · type:feature (New feature or request) · wontfix (This will not be worked on)

Comments

@mathislucka
Member

Describe the bug

When training a QA model with the FARMReader.train() method, the SquadProcessor is used to convert the SQuAD-style JSON file into training samples.

The context for a question might be longer than the model's token limit, so the processor splits the full context into smaller passages. It then checks whether the original answer is present in a passage using its character positions. If the answer is not present in the passage, it automatically uses the sample as a no-answer sample.

The code is here:

if passage_len_t > answer_start_t >= 0 and passage_len_t >= answer_end_t >= 0:
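For illustration, here is a simplified sketch of how that condition ends up producing no-answer samples. The variable names follow the snippet above, but the surrounding logic is an assumption, not FARM's actual SquadProcessor implementation:

```python
# Simplified sketch of the behaviour described above; the surrounding logic is
# illustrative and does not mirror FARM's SquadProcessor internals exactly.
def label_passage(answer_start_t: int, answer_end_t: int, passage_len_t: int):
    """Return the answer's token span relative to the passage, or a no-answer
    span if the labelled answer falls outside this passage window."""
    if passage_len_t > answer_start_t >= 0 and passage_len_t >= answer_end_t >= 0:
        # The labelled answer lies inside this passage: keep it as a positive sample.
        return (answer_start_t, answer_end_t)
    # The labelled answer lies outside this passage: the sample silently becomes
    # a no-answer sample, even if the answer text happens to re-occur in it.
    return (0, 0)  # span pointing at the CLS token, conventionally "no answer"
```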

This creates multiple issues:

  1. the user is not aware of this behaviour
  2. for long documents, there are too many no-answer samples
  3. the answer might be present in the passage, but it was not labeled

Expected behavior

  1. Give the user a parameter max_no_answer_per_context so they can decide how many no-answer samples should be created.
  2. Check whether the actual answer has a string match in that passage, and never use a sample as a no-answer sample if there is a string match/overlap (see the sketch after this list).
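A rough sketch of how these two suggestions could fit together is below. Only the parameter name max_no_answer_per_context comes from this issue; the function, field names, and offsets are hypothetical:

```python
# Rough sketch combining the two suggestions above. Only max_no_answer_per_context
# is proposed in this issue; all other names and structures are hypothetical.
def build_samples(passages, answer_text, answer_start_c, max_no_answer_per_context=1):
    samples, no_answer_count = [], 0
    for passage in passages:
        start = answer_start_c - passage["offset_c"]  # answer start relative to this passage
        if 0 <= start and start + len(answer_text) <= len(passage["text"]):
            # Labelled answer span lies inside this passage: positive sample.
            samples.append({"text": passage["text"], "answer": (start, start + len(answer_text))})
        elif answer_text in passage["text"]:
            # Suggestion 2: the answer string occurs here but was not labelled at
            # this position, so never turn this passage into a no-answer sample.
            continue
        elif no_answer_count < max_no_answer_per_context:
            # Suggestion 1: cap how many no-answer samples a single context produces.
            samples.append({"text": passage["text"], "answer": None})
            no_answer_count += 1
    return samples
```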
@sjrl
Contributor

sjrl commented Aug 3, 2022

Hi @julian-risch, it sounds like we agree that we would like control over how many no-answer labels are created when training a FARMReader model, i.e. implementing something like max_no_answer_per_context. As for the second suggestion, it sounds like we thought it would be best to perhaps print a warning message instead of removing the no-answer sample.
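A minimal sketch of what such a warning could look like; this is not existing Haystack/FARM code, just an illustration of the idea:

```python
import logging

logger = logging.getLogger(__name__)

# Minimal sketch of the warning idea discussed above (not existing Haystack/FARM code):
# warn when the answer string occurs in a passage that is about to become a
# no-answer sample because the labelled character span lies outside it.
def warn_if_answer_unlabelled(passage_text: str, answer_text: str) -> None:
    if answer_text in passage_text:
        logger.warning(
            "Answer '%s' occurs in a passage that will be used as a no-answer sample "
            "because its labelled span lies outside this passage.",
            answer_text,
        )
```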

In regards to evaluation, the issue outlined here does not affect evaluation because it was determined that evaluation does aggregate results per file which is why Michel closed the issue #2622.

@sjrl sjrl added type:feature New feature or request and removed type:bug Something isn't working labels Aug 3, 2022
@sjrl
Contributor

sjrl commented Aug 3, 2022

Additionally, we agreed that adding documentation to the FARMReader training docs that explains how no-answer labels are automatically generated would be very helpful.

@julian-risch
Member

Alright, thank you. I'll tag @brandenchan and @agnieszka-m so that they also learn about the needed documentation update.

@sjrl sjrl added the type:documentation Improvements on the docs label Aug 3, 2022
@masci masci added the P2 Medium priority, add to the next sprint if no P1 available label Nov 24, 2022
@masci masci added P3 Low priority, leave it in the backlog and removed P2 Medium priority, add to the next sprint if no P1 available labels Jan 25, 2023
@masci masci added the wontfix This will not be worked on label Feb 26, 2024
@masci masci closed this as completed Feb 26, 2024