Deep Transformer Q-Networks for Partially Observable Reinforcement Learning

Esslinger, Kevin; Platt, Robert; Amato, Christopher

arXiv:2206.01078 (cs)

[Submitted on 2 Jun 2022 (v1), last revised 10 Nov 2022 (this version, v2)]

Abstract: Real-world reinforcement learning tasks often involve some form of partial observability, in which the observations give only a partial or noisy view of the true state of the world. Such tasks typically require some form of memory, where the agent has access to multiple past observations, in order to perform well. One popular way to incorporate memory is to use a recurrent neural network to access the agent's history. However, recurrent neural networks in reinforcement learning are often fragile and difficult to train, are susceptible to catastrophic forgetting, and sometimes fail completely as a result. In this work, we propose Deep Transformer Q-Networks (DTQN), a novel architecture utilizing transformers and self-attention to encode an agent's history. DTQN is designed modularly, and we compare results against several modifications to our base model. Our experiments demonstrate that the transformer can solve partially observable tasks faster and more stably than previous recurrent approaches.
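
To make the core idea concrete, the following is a minimal sketch of how self-attention over an observation history can produce Q-values. It is not the authors' implementation: the abstract does not specify layer counts, embedding sizes, positional encodings, or training details, so the class name TinyHistoryQNet, the helper functions, the single attention head, the residual connection, and all dimensions are assumptions chosen for illustration. Positional encodings and training are omitted entirely.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialization only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyHistoryQNet:
    """Hypothetical toy model: one causal self-attention layer plus a linear Q-head.

    This is an assumption-laden sketch, not the DTQN architecture itself.
    """

    def __init__(self, obs_dim, embed_dim, num_actions, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.W_embed = rand_matrix(embed_dim, obs_dim, rng)    # observation -> token
        self.W_q = rand_matrix(embed_dim, embed_dim, rng)      # query projection
        self.W_k = rand_matrix(embed_dim, embed_dim, rng)      # key projection
        self.W_v = rand_matrix(embed_dim, embed_dim, rng)      # value projection
        self.W_out = rand_matrix(num_actions, embed_dim, rng)  # token -> Q-values

    def q_values(self, history):
        """Return one vector of Q-values per timestep of the observation history."""
        tokens = [matvec(self.W_embed, obs) for obs in history]
        queries = [matvec(self.W_q, tok) for tok in tokens]
        keys = [matvec(self.W_k, tok) for tok in tokens]
        values = [matvec(self.W_v, tok) for tok in tokens]
        scale = math.sqrt(self.embed_dim)

        outputs = []
        for t, query in enumerate(queries):
            # Causal mask: timestep t attends only to timesteps 0..t.
            scores = [
                sum(a * b for a, b in zip(query, keys[s])) / scale
                for s in range(t + 1)
            ]
            weights = softmax(scores)
            context = [
                sum(w * values[s][d] for s, w in enumerate(weights))
                for d in range(self.embed_dim)
            ]
            # Residual connection around the attention output.
            hidden = [c + x for c, x in zip(context, tokens[t])]
            outputs.append(matvec(self.W_out, hidden))
        return outputs


def greedy_action(q_row):
    """Index of the largest Q-value in one row."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


if __name__ == "__main__":
    net = TinyHistoryQNet(obs_dim=3, embed_dim=4, num_actions=2)
    history = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    q = net.q_values(history)
    # Act greedily on the Q-values of the most recent timestep.
    print("chosen action:", greedy_action(q[-1]))
```

In this sketch the agent would act on the Q-values of the most recent timestep, since that position has attended to the whole history so far. The causal mask ensures that Q-values at earlier timesteps do not depend on later observations.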

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2206.01078 [cs.LG]
	(or arXiv:2206.01078v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.01078

Submission history

From: Kevin Esslinger
[v1] Thu, 2 Jun 2022 15:04:18 UTC (2,346 KB)
[v2] Thu, 10 Nov 2022 15:04:36 UTC (2,915 KB)
