A context encoder for audio inpainting

Marafioti, Andrés; Perraudin, Nathanaël; Holighaus, Nicki; Majdak, Piotr

arXiv:1810.12138v1 (cs)

[Submitted on 29 Oct 2018 (this version), latest version 18 Feb 2022 (v3)]

Title: A context encoder for audio inpainting

Authors: Andrés Marafioti, Nathanaël Perraudin, Nicki Holighaus, Piotr Majdak

Abstract: We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting. We focused on gaps in the range of tens of milliseconds, a condition that has received little attention so far. The proposed DNN structure was trained separately on audio signals containing music and on signals containing musical instruments, in both cases with 64-ms long gaps. The input to the DNN was the context, i.e., the signal surrounding the gap, transformed into time-frequency (TF) coefficients. Two networks based on the same architecture were analyzed: one producing complex-valued TF coefficients as output and one producing magnitude TF coefficients. We found significant differences in the inpainting results between the two DNNs. In particular, we discuss the observation that the complex-valued DNN fails to produce reliable results outside the low-frequency range. Further, our results were compared to those obtained from a reference method based on linear predictive coding (LPC). For instruments, our DNNs were not able to match the performance of the reference method, although the magnitude network also provided good results. For music, however, our magnitude DNN significantly outperformed the reference method, demonstrating that the proposed DNN structure is generally well suited to inpainting complex audio signals like music. This paves the way for future, more sophisticated audio inpainting approaches based on DNNs.
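To make the setup in the abstract concrete, the following is a minimal sketch of how a 64-ms gap and its surrounding context could be cut from a signal and how the context could be turned into TF coefficients. It is not the authors' implementation. The sample rate, context length, window length, hop size, and the helper names `split_context` and `stft` are all assumptions chosen for illustration; the abstract specifies only the gap duration and that the context is represented by TF coefficients.

```python
import numpy as np


def split_context(signal, gap_start, gap_len):
    """Split a signal into (context before, gap, context after)."""
    before = signal[:gap_start]
    gap = signal[gap_start:gap_start + gap_len]
    after = signal[gap_start + gap_len:]
    return before, gap, after


def stft(x, win_len=512, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins).

    win_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


sample_rate = 16000                  # assumed sample rate in Hz
gap_len = sample_rate * 64 // 1000   # 64-ms gap expressed in samples
context_len = 4096                   # assumed context length per side

# Random noise stands in for a real audio excerpt.
rng = np.random.default_rng(0)
signal = rng.standard_normal(2 * context_len + gap_len)

before, gap, after = split_context(signal, context_len, gap_len)

# TF coefficients of the context on each side of the gap (network input).
coeffs_before = stft(before)
coeffs_after = stft(after)

# A complex-valued target keeps magnitude and phase of the gap coefficients;
# a magnitude target discards the phase.
gap_complex = stft(gap)
gap_magnitude = np.abs(gap_complex)
```

The last two lines mirror the distinction between the two networks described above: one target retains phase, the other keeps only magnitudes, in which case the phase would have to be recovered by some separate step before a time-domain signal can be synthesized.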

Comments:	12 pages, 3 tables, 8 images, 7 figures
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1810.12138 [cs.SD]
	(or arXiv:1810.12138v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1810.12138

Submission history

From: Andrés Marafioti MSc
[v1] Mon, 29 Oct 2018 14:15:30 UTC (1,864 KB)
[v2] Thu, 10 Oct 2019 12:50:22 UTC (1,693 KB)
[v3] Fri, 18 Feb 2022 15:42:37 UTC (1,175 KB)
