2605.28044v1 May 27, 2026 cs.AI

Relevant Is Not Warranted: Evidence-Force Calibration for Cited RAG

Yihang Chen
Xinpeng Wei
Sipeng Zhang
Pinyan Qian
Shuhua Lin
Su Wang
Xiaoyuan Wang
Wenxuan Xu
Qi Yu
Junxian You

Cited RAG evaluation often treats visible sources as a grounding signal, but a real, topically relevant citation can still under-warrant the wording attached to it. We study this diagnostic failure as citation laundering: a related source is presented as warrant for an over-strong claim. We introduce FORCEBENCH, a contrastive stress test for evidence-force calibration. Each item holds a cited passage fixed and pairs an evidence-calibrated claim with a localized force-raised variant along one of five operational axes: relation, modality, scope, temporal validity, and numeric specificity. A calibrated evaluator should score the evidence-calibrated claim higher than its force-raised variant. Headline experiments use a fixed, locality-filtered evaluation set of 198 pairs. A citation-presence sanity check is uninformative by design, and token- and entity-overlap baselines still violate monotonicity on 32.8% to 36.4% of pairs. Across four reported model judges, standard generic support prompting is insufficient for this force-calibration stress test (aggregate monotonicity violation rate, MVR, of 47.2%), while explicit warrant-strength prompting lowers MVR to 24.5% but remains imperfect. We release the benchmark, prompts, outputs, and plug-in pipeline so that citation evaluators can report monotonicity violation rate and force sensitivity alongside conventional support metrics.
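
The monotonicity check described in the abstract can be illustrated with a minimal Python sketch. It is not the released pipeline: the names `ClaimPair`, `token_overlap`, and `monotonicity_violation_rate` are hypothetical, the two example pairs are invented for illustration and are not FORCEBENCH items, and counting ties as violations is an assumption, since the abstract does not say how equal scores are handled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ClaimPair:
    """One contrastive item: a fixed passage and two claims citing it."""
    passage: str        # cited passage, held fixed across both claims
    calibrated: str     # evidence-calibrated claim
    force_raised: str   # localized over-strong variant


# An evaluator maps (passage, claim) to a support score; higher = better supported.
Scorer = Callable[[str, str], float]


def token_overlap(passage: str, claim: str) -> float:
    """Toy baseline: fraction of claim tokens that also occur in the passage."""
    passage_tokens = set(passage.lower().split())
    claim_tokens = claim.lower().split()
    if not claim_tokens:
        return 0.0
    hits = sum(token in passage_tokens for token in claim_tokens)
    return hits / len(claim_tokens)


def monotonicity_violation_rate(
    pairs: Iterable[ClaimPair],
    scorer: Scorer,
    ties_violate: bool = True,  # assumption: a tie fails to prefer the calibrated claim
) -> float:
    """Share of pairs where the calibrated claim is NOT scored higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    violations = 0
    for pair in pairs:
        s_cal = scorer(pair.passage, pair.calibrated)
        s_raised = scorer(pair.passage, pair.force_raised)
        if s_cal < s_raised or (ties_violate and s_cal == s_raised):
            violations += 1
    return violations / len(pairs)


# Invented examples (not from the benchmark).
example_pairs = [
    # Relation axis: "associated with" raised to "causes".
    ClaimPair(
        passage="the drug was associated with lower mortality in one trial",
        calibrated="the drug was associated with lower mortality",
        force_raised="the drug causes lower mortality",
    ),
    # Scope axis: dropping the hedge "some" leaves overlap unchanged.
    ClaimPair(
        passage="some users reported faster load times",
        calibrated="some users reported faster load times",
        force_raised="users reported faster load times",
    ),
]

mvr = monotonicity_violation_rate(example_pairs, token_overlap)
```

The second pair shows why a lexical-overlap scorer can fail this kind of test: removing a hedge raises the force of the claim without introducing any token that is absent from the passage, so both claims receive the same score. A model judge could be plugged in through the same `Scorer` interface.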

