Contextual bandits have proven effective for building personalized recommender systems, yet they suffer from the cold-start problem when little user interaction data is available. Recent work has shown that Large Language Models (LLMs) can help address this by simulating user preferences to warm-start bandits, a method known as Contextual Bandits with LLM Initialization (CBLI). While CBLI reduces early regret, it is unclear how robust the approach is to inaccuracies in LLM-generated preferences. In this paper, we extend the CBLI framework to systematically evaluate its sensitivity to noisy LLM priors. We inject both random and label-flipping noise into the synthetic training data and measure how these perturbations affect cumulative regret across three tasks generated from conjoint-survey datasets. Our results show that CBLI is robust to random corruption but exhibits clear breakdown thresholds under preference flipping: warm-starting remains effective up to 30% corruption, loses its advantage around 40%, and degrades performance beyond 50%. We further observe diminishing returns with larger synthetic datasets: beyond a point, more data can reinforce bias rather than improve performance under noisy conditions. These findings offer practical insights for deploying LLM-assisted decision systems in real-world recommendation scenarios.
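To make the label-flipping noise model concrete, the sketch below shows one plausible way to corrupt a fraction of binary synthetic preference labels at a controlled rate. The function name, the binary-label encoding, and the fixed-count flipping scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def flip_preferences(labels, corruption_rate, seed=None):
    """Flip a fixed fraction of binary preference labels (0/1) to
    simulate label-flipping noise in LLM-generated warm-start data.
    Illustrative sketch; names and scheme are assumptions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    # Choose exactly round(rate * n) distinct positions to corrupt.
    n_flip = int(round(corruption_rate * len(labels)))
    idx = rng.choice(len(labels), size=n_flip, replace=False)
    labels[idx] = 1 - labels[idx]
    return labels

# Example: corrupt 30% of 1,000 synthetic preference labels.
clean = np.ones(1000, dtype=int)
noisy = flip_preferences(clean, 0.30, seed=0)
print((noisy != clean).mean())  # fraction flipped: 0.3
```

Random noise can be modeled analogously by replacing chosen labels with independent coin flips rather than deterministic inversions, which is why the two corruption types stress the warm-start data differently.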
Related Research

- Detecting Mule Account Fraud with Federated Learning
- Scalable Temporal Domain Generalization via Prompting. S. Hosseini, M. Zhai, H. Hajimirsadeghi, and F. Tung. Workshop at International Conference on Machine Learning (ICML).
- Accurate Parameter-Efficient Test-Time Adaptation for Time Series Forecasting. H. R. Medeiros, H. Sharifi, G. Oliveira, and S. Irandoust. Workshop at International Conference on Machine Learning (ICML).