Learning Dynamics and Generalization in Reinforcement Learning

Lyle, Clare; Rowland, Mark; Dabney, Will; Kwiatkowska, Marta; Gal, Yarin

arXiv:2206.02126 (cs)

[Submitted on 5 Jun 2022]

Abstract: Solving a reinforcement learning (RL) problem poses two competing challenges: fitting a potentially discontinuous value function, and generalizing well to new observations. In this paper, we analyze the learning dynamics of temporal difference algorithms to gain novel insight into the tension between these two objectives. We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training, and at the same time induces the second-order effect of discouraging generalization. We corroborate these findings in deep RL agents trained on a range of environments, finding that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods. Finally, we investigate how post-training policy distillation may avoid this pitfall, and show that this approach improves generalization to novel environments in the ProcGen suite and improves robustness to input perturbations.
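
As a toy illustration of what "generalization between states" can mean in this setting, the sketch below applies one semi-gradient TD(0) update with linear value features and measures how much the value estimate of each state moves. It is not the paper's method or experimental setup: the three states, their feature vectors, the step size, the discount factor, and the names `value` and `td0_update` are all assumptions chosen for illustration.

```python
# Hypothetical illustration: semi-gradient TD(0) with a linear value function.
# States, features, alpha, and gamma are assumptions, not taken from the paper.

def value(weights, phi):
    """Linear value estimate V(s) = w . phi(s)."""
    return sum(w * x for w, x in zip(weights, phi))

def td0_update(weights, phi_s, reward, phi_next, alpha=0.1, gamma=0.9):
    """One semi-gradient TD(0) step; returns the new weights and the TD error."""
    td_error = reward + gamma * value(weights, phi_next) - value(weights, phi_s)
    new_weights = [w + alpha * td_error * x for w, x in zip(weights, phi_s)]
    return new_weights, td_error

# Three states: B shares a feature with A, while C is orthogonal to A.
features = {
    "A": [1.0, 0.0],
    "B": [0.5, 0.5],
    "C": [0.0, 1.0],
}

weights = [0.0, 0.0]
before = {s: value(weights, phi) for s, phi in features.items()}

# A single observed transition A -> C with reward 1.
weights, delta = td0_update(weights, features["A"], 1.0, features["C"])
after = {s: value(weights, phi) for s, phi in features.items()}

# Change in every state's value caused by an update made at state A only.
change = {s: after[s] - before[s] for s in features}
print(change)
```

In this sketch the update at state A also shifts the estimate for B, which shares a feature with A, and leaves C untouched. The size of that spillover is one simple way to think about how strongly an update generalizes from one state to another.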

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2206.02126 [cs.LG]
	(or arXiv:2206.02126v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.02126

Submission history

From: Clare Lyle
[v1] Sun, 5 Jun 2022 08:49:16 UTC (4,444 KB)
