Generative models can be used in planning to propose targets corresponding to states that agents deem either likely or advantageous to experience. However, imperfections, which are common in learned models, lead to infeasible hallucinated targets, and these can cause delusional behaviors and thus raise safety concerns. This work first categorizes several kinds of infeasible targets and investigates their properties. We then devise a strategy to reject infeasible targets with a generic target evaluator, which trains alongside planning agents as an add-on and requires no change to the behavior or architecture of the agent (or of the generative model) it is attached to. We highlight that, without proper design, the evaluator can itself produce delusional estimates, rendering the strategy futile. Thus, to learn correct evaluations of infeasible targets, we propose a combination of a learning rule, an architecture, and two assistive hindsight relabeling strategies. Our experiments show significant reductions in delusional behaviors and performance improvements for several kinds of existing planning agents.
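As a rough illustration of the rejection step described above, the following minimal sketch filters proposed targets through an evaluator before planning. It is not the paper's implementation: the names `reject_infeasible` and `toy_evaluator`, the score range, and the `threshold` value are all assumptions made for this example, and the hand-written evaluator merely stands in for a learned one.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical target type for a toy grid world: an (x, y) cell.
Target = Tuple[int, int]


def reject_infeasible(
    candidates: Sequence[Target],
    evaluator: Callable[[Target], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Target]:
    """Keep only the targets whose estimated feasibility meets the threshold."""
    return [t for t in candidates if evaluator(t) >= threshold]


def toy_evaluator(target: Target, size: int = 5) -> float:
    """Stand-in for a learned evaluator: 1.0 inside a size x size grid, else 0.0."""
    x, y = target
    return 1.0 if 0 <= x < size and 0 <= y < size else 0.0


# Targets proposed by a (hypothetical) generative model; (7, 2) lies off the grid.
proposed = [(1, 1), (7, 2), (4, 0)]
feasible = reject_infeasible(proposed, toy_evaluator)
```

Because the filter sits between the generator and the planner, neither has to be modified, which matches the add-on framing above. The quality of the filter depends entirely on the evaluator's estimates, which is why the abstract stresses that the evaluator itself must be designed carefully.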
BibTeX
@misc{zhao2025rejectinghallucinatedstatetargets,
title={Rejecting Hallucinated State Targets during Planning},
author={Mingde Zhao and Tristan Sylvain and Romain Laroche and Doina Precup and Yoshua Bengio},
year={2025},
eprint={2410.07096},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.07096},
}
Related Research
- Scalable Temporal Domain Generalization via Prompting
  S. Hosseini, M. Zhai, H. Hajimirsadeghi, and F. Tung. Workshop at International Conference on Machine Learning (ICML)
- Accurate Parameter-Efficient Test-Time Adaptation for Time Series Forecasting
  H. R. Medeiros, H. Sharifi, G. Oliveira, and S. Irandoust. Workshop at International Conference on Machine Learning (ICML)
- TabReason: A Reinforcement Learning-Enhanced LLM for Accurate and Explainable Tabular Data Prediction
  *T. Xu, *Z. Zhang, *X. Sun, *L. K. Zung, *H. Hajimirsadeghi, and G. Mori. Workshop at International Conference on Machine Learning (ICML)