Learning to generate images with perceptual similarity metrics

Ridgeway, Karl; Snell, Jake; Roads, Brett; Zemel, Richard; Mozer, Michael

arXiv:1511.06409v1 (cs)

[Submitted on 19 Nov 2015 (this version), latest version 24 Jan 2017 (v3)]

Abstract: Deep networks are increasingly being applied to problems involving image synthesis, e.g., generating images from textual descriptions, or generating reconstructions of an input image in an autoencoder architecture. Supervised training of image-synthesis networks typically uses a pixel-wise squared error (SE) loss to indicate the mismatch between a generated image and its corresponding target image. We propose to instead use a loss function that is better calibrated to human perceptual judgments of image quality: the structural-similarity (SSIM) score of Wang, Bovik, Sheikh, and Simoncelli (2004). Because the SSIM score is differentiable, it is easily incorporated into gradient-descent learning. We compare the consequences of using SSIM versus SE loss on representations formed in deep autoencoder and recurrent neural network architectures. SSIM-optimized representations yield a superior basis for image classification compared to SE-optimized representations. Further, human observers prefer images generated by the SSIM-optimized networks by nearly a 7:1 ratio. Just as computer vision has advanced through the use of convolutional architectures that mimic the structure of the mammalian visual system, we argue that significant additional advances can be made in modeling images through the use of training objectives that are well aligned to characteristics of human perception.
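As a rough illustration of the contrast the abstract draws between SE and SSIM, the following is a minimal NumPy sketch, not the authors' implementation. It uses the standard SSIM formula from Wang et al. (2004) with the commonly used constants K1 = 0.01 and K2 = 0.03, but it is simplified by assumption: statistics are taken over non-overlapping 8x8 tiles with uniform weighting rather than a sliding Gaussian window, and the function names (ssim_patch, ssim_loss, se_loss) and the test images are hypothetical. The example builds two distortions with the same squared error, a uniform brightness shift and a high-frequency checkerboard perturbation, and scores each under both losses.

```python
import numpy as np


def ssim_patch(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """SSIM of two equally sized patches, using whole-patch statistics."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


def ssim_loss(target, output, patch=8):
    """1 - mean SSIM over non-overlapping tiles (a simplifying assumption)."""
    h, w = target.shape
    scores = [
        ssim_patch(target[i:i + patch, j:j + patch],
                   output[i:i + patch, j:j + patch])
        for i in range(0, h - patch + 1, patch)
        for j in range(0, w - patch + 1, patch)
    ]
    return 1.0 - float(np.mean(scores))


def se_loss(target, output):
    """Pixel-wise squared error summed over the image."""
    return float(np.sum((target - output) ** 2))


# Hypothetical 32x32 grayscale target with intensities in [0.2, 0.8].
rng = np.random.default_rng(0)
target = rng.uniform(0.2, 0.8, size=(32, 32))

# Distortion 1: uniform brightness shift (structure is preserved).
shifted = target + 0.05

# Distortion 2: +/-0.05 checkerboard pattern (structure is perturbed).
rows, cols = np.indices(target.shape)
noisy = target + 0.05 * np.where((rows + cols) % 2 == 0, 1.0, -1.0)

for name, img in [("shifted", shifted), ("noisy", noisy)]:
    print(name, "SE:", se_loss(target, img), "SSIM loss:", ssim_loss(target, img))
```

Both distortions change every pixel by 0.05 in magnitude, so SE cannot tell them apart, while the SSIM-based loss penalizes the checkerboard perturbation more heavily than the brightness shift. The sketch only evaluates the losses; because the formula is composed of differentiable operations, training with it would amount to writing the same arithmetic in an automatic-differentiation framework.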

Comments:	Submitted to ICLR 2016
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1511.06409 [cs.LG]
	(or arXiv:1511.06409v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1511.06409

Submission history

From: Karl Ridgeway
[v1] Thu, 19 Nov 2015 21:57:46 UTC (405 KB)
[v2] Thu, 17 Mar 2016 17:21:56 UTC (652 KB)
[v3] Tue, 24 Jan 2017 02:03:41 UTC (4,999 KB)
