2605.30219v1 May 28, 2026 cs.AI

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Zongrui Li
Mengru Wang
Yunzhi Yao
Zhejiang University;Shandong University
Haoming Xu
Weihong Xu
Chiyu Wu
Jingbo Shang
Yujia Gong
Shumin Deng

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve it, and what to ignore. We study this challenge as Contextual Belief Management (CBM): maintaining a predicted belief state aligned with formal evidence while isolating task-irrelevant noise. To make CBM measurable, we introduce BeliefTrack, a closed-world benchmark spanning Rule Discovery and Circuit Diagnosis, where a finite belief space and symbolic verifiers enable exact turn-level evaluation. BeliefTrack diagnoses three failures: Failed Stay, Failed Update, and Failed Isolation. Across multiple LLMs, vanilla models exhibit severe CBM failures, while explicit belief-tracking prompts provide limited gains. In contrast, reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average. Further probing reveals latent belief-state dynamics behind these failures, and representation-level steering reduces failure rates by 46.1% across two tasks. Code is coming soon at https://github.com/zjunlp/CBM.
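The abstract names three turn-level failures but does not define them. As an illustration only, the sketch below shows one plausible way exact turn-level evaluation over a finite belief space could label a turn. Everything in it is an assumption rather than BeliefTrack's actual interface: the `TurnKind` and `Failure` enums, the `classify_turn` function, the representation of a belief state as a set of remaining candidate hypotheses, and the mapping of each failure name to a condition.

```python
from enum import Enum
from typing import FrozenSet, Optional

# Assumption: a belief state is the set of hypotheses still consistent
# with the evidence seen so far (the abstract only says the space is finite).
Belief = FrozenSet[str]


class TurnKind(Enum):
    EVIDENCE = "evidence"  # turn carries task-relevant evidence
    NOISE = "noise"        # turn carries task-irrelevant content


class Failure(Enum):
    FAILED_STAY = "Failed Stay"
    FAILED_UPDATE = "Failed Update"
    FAILED_ISOLATION = "Failed Isolation"


def classify_turn(
    kind: TurnKind,
    prev_gold: Belief,
    gold: Belief,
    predicted: Belief,
) -> Optional[Failure]:
    """Label one turn under an assumed reading of the three failure types.

    prev_gold: verifier's belief state before this turn
    gold:      verifier's belief state after this turn
    predicted: model's belief state after this turn
    """
    if predicted == gold:
        return None  # model matches the verifier: no failure
    if kind is TurnKind.NOISE:
        # Assumed reading: the model let irrelevant content move its belief.
        return Failure.FAILED_ISOLATION
    if gold == prev_gold:
        # Assumed reading: evidence warranted no change, but the model drifted.
        return Failure.FAILED_STAY
    # Assumed reading: evidence warranted a change the model did not make correctly.
    return Failure.FAILED_UPDATE


if __name__ == "__main__":
    before = frozenset({"rule_A", "rule_B"})
    after = frozenset({"rule_A"})
    # Evidence eliminates rule_B, but the model keeps both hypotheses.
    print(classify_turn(TurnKind.EVIDENCE, before, after, before))
```

Under these assumptions, each turn receives at most one label, so per-failure rates can be computed by counting labels over all turns of an interaction.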
