Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

Martin, Charles H.; Mahoney, Michael W.

arXiv:1710.09553 (cs)

[Submitted on 26 Oct 2017 (v1), last revised 17 Feb 2019 (this version, v2)]

Abstract: We describe an approach to understand the peculiar and counterintuitive generalization properties of deep neural networks. The approach involves going beyond worst-case theoretical capacity control frameworks that have been popular in machine learning in recent years to revisit old ideas in the statistical mechanics of neural networks. Within this approach, we present a prototypical Very Simple Deep Learning (VSDL) model, whose behavior is controlled by two control parameters, one describing an effective amount of data, or load, on the network (that decreases when noise is added to the input), and one with an effective temperature interpretation (that increases when algorithms are early stopped). Using this model, we describe how a very simple application of ideas from the statistical mechanics theory of generalization provides a strong qualitative description of recently-observed empirical results regarding the inability of deep neural networks not to overfit training data, discontinuous learning and sharp transitions in the generalization properties of learning algorithms, etc.

Comments:	31 pages; added brief discussion of recent papers that use/extend these ideas
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1710.09553 [cs.LG]
	(or arXiv:1710.09553v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1710.09553

Submission history

From: Michael Mahoney
[v1] Thu, 26 Oct 2017 06:08:39 UTC (407 KB)
[v2] Sun, 17 Feb 2019 05:57:09 UTC (2,044 KB)
