2605.26778v1 May 26, 2026 cs.AI

The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context

Yujia Liu
Wenpeng Xing
Chenchen Ye
Zhengtao Yu
Meng Han
Yunzhao Wei
Gaolei Li

Retrieval-augmented generation promises to ground language model outputs in external evidence, yet the field has no reliable way to verify whether retrieved context actually governs generation -- a prerequisite for any high-stakes deployment. The standard assumption, that context-consistent output implies context-governed output, breaks when the retrieved document overlaps with the model's pretraining data: the model can produce faithful-looking text entirely from parametric memory, and both pathways yield indistinguishable output. We name this failure the attribution blind spot and introduce Computational Reality Monitoring (CRM) to address it. CRM operationalizes a principle adapted from cognitive science's reality monitoring framework: comparing internal representations with and without context reveals membership-conditioned representational divergence that output-level monitors systematically miss. CRM does not certify which source an individual generation used; it detects whether pretraining exposure leaves a measurable internal trajectory signature, establishing a necessary substrate for source attribution. Across nine model variants spanning three families, this divergence concentrates in architecture-specific layer patterns, receives converging support from block-level noise intervention, and generalizes across tasks and datasets while collapsing on domain-confounded benchmarks. The attribution blind spot is measurable and partially addressable: internal representations carry a diagnostic signal invisible at the output level, establishing a foundation for systems whose internal awareness of evidence provenance governs their external behavior.
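The core comparison described in the abstract, contrasting internal representations with and without retrieved context, can be illustrated with a small sketch. This is not the authors' CRM implementation: the abstract does not specify the divergence measure, so cosine distance is an assumption here, the function names `cosine_distance` and `layerwise_divergence` are hypothetical, and the vectors are toy values rather than real model activations.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def layerwise_divergence(
    with_context: Sequence[Vector],
    without_context: Sequence[Vector],
) -> List[float]:
    """Compare one hidden-state vector per layer across the two runs.

    Assumption: each run has already been reduced to a single vector per
    layer (for example, the state at the final token position).
    """
    if len(with_context) != len(without_context):
        raise ValueError("both runs must cover the same number of layers")
    return [cosine_distance(a, b) for a, b in zip(with_context, without_context)]


# Toy hidden states: 3 layers, 2 dimensions each (illustrative values only).
with_ctx = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
without_ctx = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

profile = layerwise_divergence(with_ctx, without_ctx)

# Index of the layer where the two runs differ most.
peak_layer = max(range(len(profile)), key=profile.__getitem__)
```

In this toy case the divergence grows with depth and peaks at the last layer. With a real model, a per-layer profile of this kind would be one way to look for the architecture-specific layer patterns the abstract mentions, though the paper's actual measure and aggregation may differ.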


