In this paper we explore how actor-critic methods in deep reinforcement learning, in particular Asynchronous Advantage Actor-Critic (A3C), can be extended with agent modeling. Inspired by recent works on representation learning and multiagent deep reinforcement learning, we propose two architectures to perform agent modeling: the first one based on parameter sharing, and the second one based on agent policy features. Both architectures aim to learn other agents’ policies as auxiliary tasks, besides the standard actor (policy) and critic (values). We performed experiments in both cooperative and competitive domains. The former is a problem of coordinated multiagent object transportation and the latter is a two-player mini version of the Pommerman game. Our results show that the proposed architectures stabilize learning and outperform the standard A3C architecture when learning a best response in terms of expected rewards.
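To make the idea concrete, below is a minimal sketch (assuming PyTorch) of the parameter-sharing variant described above: a single shared body feeds the standard actor and critic heads plus an auxiliary head trained to predict the other agent's policy. Layer sizes, the auxiliary-loss weight `beta_aux`, and all helper names are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class A3CWithAgentModeling(nn.Module):
    """Actor-critic network with an auxiliary opponent-policy head.

    Sketch of the parameter-sharing architecture: the body parameters are
    shared between the actor, the critic, and the agent-modeling head.
    """

    def __init__(self, obs_dim, n_actions, n_opponent_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, n_actions)              # actor
        self.value_head = nn.Linear(hidden, 1)                       # critic
        self.opponent_head = nn.Linear(hidden, n_opponent_actions)   # auxiliary task

    def forward(self, obs):
        h = self.body(obs)
        return (
            F.log_softmax(self.policy_head(h), dim=-1),    # log pi(a|s)
            self.value_head(h).squeeze(-1),                # V(s)
            F.log_softmax(self.opponent_head(h), dim=-1),  # predicted opponent policy
        )

def a3c_aux_loss(model, obs, actions, returns, opponent_actions, beta_aux=0.1):
    """Standard A3C policy/value losses plus a cross-entropy auxiliary loss
    on the observed opponent actions (beta_aux is an assumed weight)."""
    log_pi, value, log_opp = model(obs)
    advantage = returns - value.detach()
    policy_loss = -(log_pi.gather(1, actions.unsqueeze(1)).squeeze(1) * advantage).mean()
    value_loss = F.mse_loss(value, returns)
    aux_loss = F.nll_loss(log_opp, opponent_actions)
    return policy_loss + 0.5 * value_loss + beta_aux * aux_loss
```

The second architecture described in the abstract differs mainly in how the opponent-policy features are used (they condition the actor-critic heads rather than only sharing a body); the same auxiliary cross-entropy loss applies.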
BibTeX
@inproceedings{hern2019agent,
title={{Agent Modeling as Auxiliary Task for Deep Reinforcement Learning}},
author={Pablo Hernandez-Leal and Bilal Kartal and Matthew E. Taylor},
year={2019},
booktitle={AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment},
}