Contextual bandits have proven effective for building personalized recommender systems, yet they suffer from the cold-start problem when little user interaction data is available. Recent work has shown that Large Language Models (LLMs) can help address this by simulating user preferences to warm-start bandits, a method known as Contextual Bandits with LLM Initialization (CBLI). While CBLI reduces early regret, it is unclear how robust the approach is to inaccuracies in LLM-generated preferences. In this paper, we extend the CBLI framework to systematically evaluate its sensitivity to noisy LLM priors. We inject both random and label-flipping noise into the synthetic training data and measure how these perturbations affect cumulative regret across three tasks generated from conjoint-survey datasets. Our results show that CBLI is robust to random corruption but exhibits clear breakdown thresholds under preference flipping: warm-starting remains effective up to 30% corruption, loses its advantage around 40%, and degrades performance beyond 50%. We further observe diminishing returns with larger synthetic datasets: beyond a point, more data can reinforce bias rather than improve performance under noisy conditions. These findings offer practical insights for deploying LLM-assisted decision systems in real-world recommendation scenarios.
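To make the label-flipping noise model concrete, the sketch below shows one plausible way to corrupt a fraction of binary synthetic preference labels at a controlled rate. The function name, the binary-label encoding, and the fixed-count flipping scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def flip_preferences(labels, corruption_rate, seed=None):
    """Flip a fixed fraction of binary preference labels (0/1) to
    simulate label-flipping noise in LLM-generated warm-start data.
    Illustrative sketch; names and scheme are assumptions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    # Choose exactly round(rate * n) distinct positions to corrupt.
    n_flip = int(round(corruption_rate * len(labels)))
    idx = rng.choice(len(labels), size=n_flip, replace=False)
    labels[idx] = 1 - labels[idx]
    return labels

# Example: corrupt 30% of 1,000 synthetic preference labels.
clean = np.ones(1000, dtype=int)
noisy = flip_preferences(clean, 0.30, seed=0)
print((noisy != clean).mean())  # fraction flipped: 0.3
```

Random noise can be modeled analogously by replacing chosen labels with independent coin flips rather than deterministic inversions, which is why the two corruption types stress the warm-start data differently.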
Related Research

- Detecting Mule Account Fraud with Federated Learning
- Scalable Temporal Domain Generalization via Prompting. S. Hosseini, M. Zhai, H. Hajimirsadeghi, and F. Tung. Workshop at International Conference on Machine Learning (ICML).
- Accurate Parameter-Efficient Test-Time Adaptation for Time Series Forecasting. H. R. Medeiros, H. Sharifi, G. Oliveira, and S. Irandoust. Workshop at International Conference on Machine Learning (ICML).