2604.04418v1 Apr 06, 2026 cs.HC

Justified or Just Convincing? Error Verifiability as a Dimension of LLM Quality

Bang Liu, Xiaoyuan Zhu, K. Truong, Riccardo Fogliato, Gokul Swamy, Weijia Zhang, Minglai Yang (University of Arizona), Minghao Liu, Andrew Ilyas, Steven Wu, Long Ye

Abstract

As LLMs are deployed in high-stakes settings, users must judge the correctness of individual responses, often relying on model-generated justifications such as reasoning chains or explanations. Yet no standard measure exists for whether these justifications help users distinguish correct answers from incorrect ones. We formalize this idea as error verifiability and propose $v_{\text{bal}}$, a balanced metric that measures whether justifications enable raters to accurately assess answer correctness, validated against human raters who show high agreement. We find that neither common approaches, such as post-training and model scaling, nor the more targeted interventions that have been recommended improve verifiability. We introduce two methods that succeed at improving verifiability: reflect-and-rephrase (RR) for mathematical reasoning and oracle-rephrase (OR) for factual QA, both of which improve verifiability by incorporating domain-appropriate external information. Together, our results establish error verifiability as a distinct dimension of response quality that does not emerge from accuracy improvements alone and requires dedicated, domain-aware methods to address.
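The abstract does not spell out how $v_{\text{bal}}$ is computed. As a rough illustration only, the sketch below assumes that "balanced" means averaging two rates: how often a rater accepts answers that are actually correct, and how often the rater rejects answers that are actually incorrect. The names `Judgment` and `balanced_verifiability` are hypothetical and do not come from the paper, and the paper's actual definition may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One rater verdict on one model response (hypothetical record type)."""
    answer_is_correct: bool   # ground-truth correctness of the model's answer
    rater_says_correct: bool  # rater's verdict after reading the justification


def balanced_verifiability(judgments: list[Judgment]) -> float:
    """Average of the rater's accuracy on correct and on incorrect answers.

    ASSUMPTION: this balanced-accuracy form is an illustration, not the
    paper's stated definition of v_bal.
    """
    # Rater verdicts that were right, split by the true correctness of the answer.
    on_correct = [j.rater_says_correct for j in judgments if j.answer_is_correct]
    on_incorrect = [not j.rater_says_correct for j in judgments if not j.answer_is_correct]
    if not on_correct or not on_incorrect:
        raise ValueError("need at least one correct and one incorrect answer")
    accept_rate = sum(on_correct) / len(on_correct)      # correct answers accepted
    reject_rate = sum(on_incorrect) / len(on_incorrect)  # incorrect answers rejected
    return 0.5 * (accept_rate + reject_rate)


# A rater who is persuaded by every justification, on 3 correct and 1 incorrect answer.
always_convinced = [
    Judgment(True, True),
    Judgment(True, True),
    Judgment(True, True),
    Judgment(False, True),
]
print(balanced_verifiability(always_convinced))  # 0.5
```

Under this assumed form, a rater who accepts every answer scores 0.5 no matter how accurate the model is, which is one way to see why a balanced measure can separate verifiability from accuracy.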
