Although Reinforcement Learning (RL) has been one of the most successful approaches for learning in sequential decision-making problems, the sample complexity of RL techniques still represents a major challenge for practical applications. To combat this challenge, whenever a competent policy (e.g., a legacy system or a human demonstrator) is available, the agent could leverage samples from this policy (advice) to improve sample efficiency. However, advice is normally limited, so it should ideally be directed to states where the agent is uncertain about the best action to execute. In this work, we propose Requesting Confidence-Moderated Policy advice (RCMP), an action-advising framework where the agent asks for advice when its epistemic uncertainty is high for a certain state. RCMP takes into account that the advice is limited and might be suboptimal. We also describe a technique to estimate the agent's uncertainty with minor modifications to standard value-function-based RL methods. Our empirical evaluations show that RCMP outperforms Importance Advising, receiving no advice, and receiving advice at random states in Gridworld and Atari Pong scenarios.
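The advice-request rule described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is computed, so this sketch assumes the agent keeps several independent Q-value estimates ("heads") per state and treats their disagreement (variance) as a proxy for epistemic uncertainty. The function names, the `threshold` parameter, and the `demonstrator` callable are all hypothetical and chosen only for illustration.

```python
from statistics import fmean, pvariance


def epistemic_uncertainty(head_q_values):
    """Assumed proxy: variance of Q-estimates across heads, averaged over actions.

    head_q_values[h][a] is the Q-value that head h assigns to action a
    in the current state.
    """
    n_actions = len(head_q_values[0])
    per_action_var = [
        pvariance([head[a] for head in head_q_values])
        for a in range(n_actions)
    ]
    return fmean(per_action_var)


def greedy_action(head_q_values):
    """Pick the action with the highest Q-value averaged over all heads."""
    n_actions = len(head_q_values[0])
    mean_q = [
        fmean([head[a] for head in head_q_values])
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=mean_q.__getitem__)


def choose_action(head_q_values, demonstrator, state, budget, threshold):
    """Return (action, remaining_budget, advised).

    Advice is requested only while budget remains and the uncertainty
    estimate exceeds the (hypothetical) threshold; otherwise the agent
    acts greedily on its own estimates.
    """
    if budget > 0 and epistemic_uncertainty(head_q_values) > threshold:
        return demonstrator(state), budget - 1, True
    return greedy_action(head_q_values), budget, False


if __name__ == "__main__":
    # Three heads that disagree strongly about a two-action state.
    disagreeing_heads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 2.0]]
    action, budget, advised = choose_action(
        disagreeing_heads,
        demonstrator=lambda s: 0,  # stand-in for a legacy system or human
        state=None,
        budget=1,
        threshold=0.5,
    )
    print(action, budget, advised)
```

Because the budget is decremented only when advice is actually taken, a limited amount of advice is spent on the states where the heads disagree most, which is the behaviour the abstract motivates. How the threshold is set and how the heads are trained are left open here.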
BibTeX
@inproceedings{da2020uncertainty,
title={Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents},
author={Da Silva, Felipe Leno and Hernandez-Leal, Pablo and Kartal, Bilal and Taylor, Matthew E},
booktitle={Thirty-Fourth AAAI Conference on Artificial Intelligence},
year={2020}
}
Related Research
-
Our NeurIPS 2021 Reading List
Y. Cao, K. Y. C. Lui, T. Durand, J. He, P. Xu, N. Mehrasa, A. Radovic, A. Lehrmann, R. Deng, A. Abdi, M. Schlegel, and S. Liu.
Computer Vision; Data Visualization; Graph Representation Learning; Learning and Generalization; Natural Language Processing; Optimization; Reinforcement Learning; Time Series Modelling; Unsupervised Learning
Research
-
Heterogeneous Multi-task Learning with Expert Diversity
G. Oliveira and F. Tung.
Computer Vision; Natural Language Processing; Reinforcement Learning
Research
-
Desired characteristics for real-world RL agents
P. Hernandez-Leal and Y. Gao.
Research