
Incorrect generation of training data in GPL training #2751

Closed

aditya-malte opened this issue Jul 1, 2022 · 5 comments

aditya-malte commented Jul 1, 2022

Describe the bug
Hi,
I followed this tutorial to train my own GPL model.
On closer observation, I noticed two things:

  1. "pos" and "neg" are sometimes switched; this is especially evident when the (margin) score is negative. Also, why is the score negative? Shouldn't it always (or at least mostly) be positive, since CE(query, pos) > CE(query, neg) the vast majority of the time? (See the margin sketch after this list.)
  2. The generated questions are sometimes completely incorrect, in the sense that they clearly appear to have been generated from one of the many documents but match neither the neg nor the pos passage.
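For reference on point 1: GPL's margin score is just the difference between two cross-encoder relevance scores, so a negative value means the cross-encoder ranked the supposed negative above the supposed positive. A minimal sketch of the computation, assuming the cross-encoder from the tutorial (cross-encoder/ms-marco-MiniLM-L-6-v2):

```python
from sentence_transformers import CrossEncoder

# The tutorial's default cross-encoder (an assumption; swap in whichever
# model your PseudoLabelGenerator actually used).
ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def margin_score(question: str, pos_doc: str, neg_doc: str) -> float:
    """GPL margin = CE(question, pos_doc) - CE(question, neg_doc).
    Positive when the cross-encoder agrees that pos_doc is the better passage."""
    pos_score, neg_score = ce.predict([(question, pos_doc), (question, neg_doc)])
    return float(pos_score - neg_score)
```

If I understand GPL correctly, occasional negative margins are expected by design: the mined hard negative can genuinely be more relevant to the generated question than the passage the question was generated from, and the MarginMSE loss consumes the signed margin as-is. Systematically negative scores, though, would point at swapped pos/neg fields.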

Expected behavior
"pos" and "neg" to not be switched at some places
AND
labels to be more accurate


To Reproduce
Steps to reproduce the behavior

FAQ Check

System:

  • OS: Ubuntu
  • GPU/CPU: A6000
  • Haystack version (commit or version number): 1.5.1rc0
  • DocumentStore: Elastic
  • Reader: ..
  • Retriever: EmbeddingRetriever("sentence-transformers/msmarco-distilbert-base-tas-b")
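To make the reproduction steps above concrete, here is a minimal sketch of the setup, following the Haystack GPL tutorial (the model names and defaults below are the tutorial's, not necessarily my exact notebook):

```python
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import EmbeddingRetriever, PseudoLabelGenerator, QuestionGenerator

# Assumes a local Elasticsearch instance with documents already indexed.
document_store = ElasticsearchDocumentStore()

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/msmarco-distilbert-base-tas-b",
    model_format="sentence_transformers",
)
document_store.update_embeddings(retriever)

# Question generator as in the tutorial (a doc2query model).
question_generator = QuestionGenerator(model_name_or_path="doc2query/msmarco-t5-base-v1")

# PseudoLabelGenerator generates questions, mines hard negatives with the
# retriever, and scores (question, pos, neg) triples with a cross-encoder.
psg = PseudoLabelGenerator(question_generator, retriever)
output, _ = psg.run(documents=document_store.get_all_documents())

# Each label holds a question, a positive passage, a mined negative passage,
# and a margin score; these labels are what look wrong to me.
gpl_labels = output["gpl_labels"]
retriever.train(gpl_labels)  # GPL training step from the tutorial
```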
aditya-malte (Author) commented

I have a feeling that the pseudo_label_generator.py file in Haystack may have issues when generating the training data.
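One way to check this independently would be to re-score a sample of the generated triples with a standalone cross-encoder and count how often the "pos" passage loses to the "neg" passage. A sketch, assuming the labels carry the question/pos_doc/neg_doc keys that the tutorial's output shows:

```python
from typing import Dict, List

from sentence_transformers import CrossEncoder

ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def count_swapped(gpl_labels: List[Dict]) -> int:
    """Count labels where the cross-encoder scores neg_doc above pos_doc."""
    swapped = 0
    for label in gpl_labels:
        pos_score, neg_score = ce.predict(
            [(label["question"], label["pos_doc"]),
             (label["question"], label["neg_doc"])]
        )
        if neg_score > pos_score:
            swapped += 1
    return swapped
```

If that count is high, the pos/neg assignment would be the likely culprit rather than the scoring itself.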

julian-risch (Member) commented
Hi @aditya-malte, we compared the results generated by Haystack's implementation of GPL with those generated by the reference implementation and didn't find any differences. Which models did you use for the question generator and the cross-encoder: the ones from the tutorial, or did you change them? Did you make any other changes to the tutorial, for example, did you use different data? Maybe @vblagoje can help here?

vblagoje self-assigned this Jul 4, 2022
vblagoje (Member) commented Jul 4, 2022

@aditya-malte, thanks for your report. The questions Julian posted are what I would have asked. But maybe it would be simpler if you shared your notebook so we can take a look?

vblagoje (Member) commented

Ping @aditya-malte, any updates? Have you noticed these issues in the GPL tutorial? Would love to hear back from you on this one.

vblagoje (Member) commented Sep 1, 2022

I am closing this issue due to a lack of response from the reporter. We'll reopen it if a unit test or other clear proof reveals an issue with GPL.

vblagoje closed this as completed Sep 1, 2022