End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Sachan, Devendra Singh; Patwary, Mostofa; Shoeybi, Mohammad; Kant, Neel; Ping, Wei; Hamilton, William L; Catanzaro, Bryan

arXiv:2101.00408 (cs)

[Submitted on 2 Jan 2021 (v1), last revised 2 Jun 2021 (this version, v2)]

Abstract: Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in the top-20 retrieval accuracy on Natural Questions and TriviaQA datasets.
We also explore two approaches for end-to-end supervised training of the reader and retriever components in OpenQA models. In the first approach, the reader considers each retrieved document separately while in the second approach, the reader considers all the retrieved documents together. Our experiments demonstrate the effectiveness of these approaches as we obtain new state-of-the-art results. On the Natural Questions dataset, we obtain a top-20 retrieval accuracy of 84, an improvement of 5 points over the recent DPR model. In addition, we achieve good results on answer extraction, outperforming recent models like REALM and RAG by 3+ points. We further scale up end-to-end training to large models and show consistent gains in performance over smaller models.

Comments:	ACL 2021
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2101.00408 [cs.CL]
	(or arXiv:2101.00408v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2101.00408

Submission history

From: Devendra Singh Sachan
[v1] Sat, 2 Jan 2021 09:05:34 UTC (7,560 KB)
[v2] Wed, 2 Jun 2021 02:46:38 UTC (7,644 KB)
