On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

Ye, Haotian; Chen, Xiaoyu; Wang, Liwei; Du, Simon S.

arXiv:2210.10464 (cs)

[Submitted on 19 Oct 2022 (v1), last revised 29 Jun 2023 (this version, v2)]

Abstract: Generalization in Reinforcement Learning (RL) aims to learn an agent during training that generalizes to the target environment. This paper studies RL generalization from a theoretical aspect: how much can we expect pre-training over training environments to be helpful? When the interaction with the target environment is not allowed, we certify that the best we can obtain is a near-optimal policy in an average sense, and we design an algorithm that achieves this goal. Furthermore, when the agent is allowed to interact with the target environment, we give a surprising result showing that asymptotically, the improvement from pre-training is at most a constant factor. On the other hand, in the non-asymptotic regime, we design an efficient algorithm and prove a distribution-based regret bound in the target environment that is independent of the state-action space.

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2210.10464 [cs.LG]
	(or arXiv:2210.10464v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.10464

Submission history

From: Haotian Ye
[v1] Wed, 19 Oct 2022 10:58:24 UTC (69 KB)
[v2] Thu, 29 Jun 2023 03:26:39 UTC (80 KB)
