Add Generative QA Models like RAG #443

tholor · 2020-09-29T06:57:19Z

What?
So far, most generative QA models have not been very useful in practice, because they can only answer fairly generic questions that are covered by their "wiki + web" training corpora. For industry use cases, we mostly want to:

ask domain-specific questions
get the answer from a specified, reliable corpus
have a possibility to check the answer (e.g. by inspecting the document / context where it came from)

Pure generative models don't fulfill these requirements. However, recent retrieval-augmented approaches could be interesting to test.

How?

The latest transformers release comes with the RAG model from Facebook (https://ai.facebook.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models). We could add a "TransformersGenerator" class in Haystack that gets documents from the retriever and generates the answer conditioned on those.
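A rough sketch of what such a class could look like, purely as an illustration. The name TransformersGenerator comes from the proposal above, but the predict method, the Document type, and the generate_fn callable are hypothetical and not an existing Haystack or transformers API. The point is only the data flow: retrieved documents go in, and the answer comes back together with the ids of the documents it was conditioned on, so it can be checked.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a document returned by a retriever."""
    id: str
    text: str


class TransformersGenerator:
    """Sketch: generate an answer conditioned on retrieved documents.

    generate_fn is a placeholder for the real model call (e.g. a RAG model);
    it maps (question, context texts) to an answer string.
    """

    def __init__(self, generate_fn: Callable[[str, List[str]], str], top_k: int = 3):
        self.generate_fn = generate_fn
        self.top_k = top_k

    def predict(self, question: str, documents: List[Document]) -> Dict:
        # Keep only the first top_k documents, assuming they arrive sorted by relevance.
        used = documents[: self.top_k]
        answer = self.generate_fn(question, [d.text for d in used])
        # Return the supporting document ids so the answer can be inspected.
        return {
            "question": question,
            "answer": answer,
            "document_ids": [d.id for d in used],
        }


def toy_generate(question: str, contexts: List[str]) -> str:
    # Toy stand-in for a model: echoes the first context, or "" if there is none.
    return contexts[0] if contexts else ""


docs = [
    Document("d1", "Berlin is the capital of Germany."),
    Document("d2", "Paris is the capital of France."),
]
generator = TransformersGenerator(generate_fn=toy_generate, top_k=1)
result = generator.predict("What is the capital of Germany?", docs)
```

Keeping the model call behind a plain callable is only a convenience for the sketch; a real integration would load the model and tokenizer inside the class.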

Weilin37 · 2020-09-29T21:35:41Z

This would be really cool for domain-specific literature review

nsankar · 2020-09-30T15:33:48Z

@tholor As RAG stands now, is it correct to pass the Haystack retriever output as input to RAG, as in the snippet below (adapted from the RAG reference code)?

# retriever_outputs: placeholder for the formatted Haystack retriever outputs
input_dict = tokenizer.prepare_seq2seq_batch(retriever_outputs, return_tensors="pt")
input_ids = input_dict["input_ids"]
outputs = model(input_ids=input_ids, labels=input_dict["labels"])

tholor · 2020-10-05T11:19:11Z

@nsankar I didn't have time yet to look into this, but I believe something like this should be possible with the RagTokenForGeneration class. Your "haystack retriever outputs" would be pretty much standard strings that we can pass there. Did you try it already?
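To make "standard strings" a bit more concrete, here is a minimal sketch of turning retrieved documents into one input string per document. The "title / text // question" layout and the two separators are assumptions modeled on how RAG-style models join a passage with the question; the helper name format_contexts and the dict keys title and text are hypothetical and would have to match the actual retriever output.

```python
from typing import Dict, List

TITLE_SEP = " / "   # assumed separator between title and text
DOC_SEP = " // "    # assumed separator between passage and question


def format_contexts(question: str, docs: List[Dict[str, str]]) -> List[str]:
    """Build one 'title / text // question' string per retrieved document."""
    formatted = []
    for doc in docs:
        # Missing keys fall back to an empty string instead of raising.
        title = doc.get("title", "").strip()
        text = doc.get("text", "").strip()
        formatted.append(f"{title}{TITLE_SEP}{text}{DOC_SEP}{question}")
    return formatted


retrieved = [
    {"title": "Haystack", "text": "Haystack is a framework for question answering."},
]
contexts = format_contexts("What is Haystack?", retrieved)
print(contexts[0])
```

The resulting strings would still need to be tokenized before they reach the model; how exactly depends on the transformers version in use.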

nsankar · 2020-10-06T03:56:44Z

@tholor I have yet to try this. I'll report back once I do. Thank you for the input.

Weilin37 · 2020-10-23T16:49:37Z

I'm here to cheerlead you guys on! Can't wait for this

tholor · 2020-10-23T16:58:01Z

Thanks to @lalitpagaria, the RAG integration in #484 is almost done!
We just need to do a FARM release before we can merge. The current plan is to do the release next week.

If you are very eager to try, you could already test it on the branch itself. There's already a small tutorial notebook!
Just be sure to run pip install -e . once to install the latest FARM version from master, which is required for RAG.

Weilin37 · 2020-10-23T17:03:58Z

@tholor I am definitely eager to try! Where is the tutorial notebook? I'll try it on the branch

tholor · 2020-10-23T17:11:33Z

@Weilin37 You can find the preliminary version here: https://github.com/lalitpagaria/haystack/blob/implement_RAG/tutorials/Tutorial7_RAG_Generator.ipynb

lalitpagaria · 2020-10-23T17:40:52Z

@Weilin37 In this notebook you need to use a different Haystack version.
Change:

!pip install git+https://github.com/deepset-ai/haystack/
!pip install urllib3==1.25.4

to:

!pip install git+https://github.com/lalitpagaria/haystack.git@implement_RAG
!pip install urllib3==1.25.4

Weilin37 · 2020-10-23T17:47:37Z

Thank you!

I am getting a "No module named 'haystack.generator'" error.
I guess this is the pip install -e . thing? However I seem to get an error when I run that command. Something about setup.py not found and directory cannot be installed in editable mode.

lalitpagaria · 2020-10-23T17:53:09Z

Please try this script https://github.com/lalitpagaria/haystack/blob/implement_RAG/tutorials/Tutorial7_RAG_Generator.py

Install it with pip install git+https://github.com/lalitpagaria/haystack.git@implement_RAG in a separate virtualenv.

tholor · 2020-11-02T12:52:11Z

Fixed by #484

tholor added the type:feature label Sep 29, 2020

nsankar mentioned this issue Sep 30, 2020

RAG Language model / Retriever #452

Closed

tholor mentioned this issue Oct 11, 2020

RAG - how to precompute custom document index? huggingface/transformers#7462

Closed

This was referenced Oct 13, 2020

[RAG] Bumping up transformers version to 3.3.x deepset-ai/FARM#579

Merged

[RAG] Integrate "Retrieval-Augmented Generation" with Haystack #484

Merged

Adding RAG to text-generation pipeline huggingface/transformers#7777

Closed

tholor closed this as completed Nov 2, 2020
