Deep Shallow Fusion for RNN-T Personalization

Le, Duc; Keren, Gil; Chan, Julian; Mahadeokar, Jay; Fuegen, Christian; Seltzer, Michael L.

arXiv:2011.07754 (cs)

[Submitted on 16 Nov 2020]

Abstract: End-to-end models in general, and Recurrent Neural Network Transducer (RNN-T) in particular, have gained significant traction in the automatic speech recognition community in the last few years due to their simplicity, compactness, and excellent performance on generic transcription tasks. However, these models are more challenging to personalize compared to traditional hybrid systems due to the lack of external language models and difficulties in recognizing rare long-tail words, specifically entity names. In this work, we present novel techniques to improve RNN-T's ability to model rare WordPieces, infuse extra information into the encoder, enable the use of alternative graphemic pronunciations, and perform deep fusion with personalized language models for more robust biasing. We show that these combined techniques result in 15.4%-34.5% relative Word Error Rate improvement compared to a strong RNN-T baseline which uses shallow fusion and text-to-speech augmentation. Our work helps push the boundary of RNN-T personalization and close the gap with hybrid systems on use cases where biasing and entity recognition are crucial.

Comments:	To appear at SLT 2021
Subjects:	Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2011.07754 [cs.CL]
	(or arXiv:2011.07754v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.07754

Submission history

From: Duc Le
[v1] Mon, 16 Nov 2020 07:13:58 UTC (162 KB)
