2603.05471v1 Mar 05, 2026 cs.CL

Fact Checking without Retrieval Using LLM Parametric Knowledge

Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Artem Vazhentsev, Independent Researcher (citations: 528, h-index: 9)
Maria Marina (citations: 26, h-index: 3)
Daniil Moskovskiy (citations: 124, h-index: 5)
Sergey Pletenev (citations: 79, h-index: 5)
Mikhail Seleznyov (citations: 64, h-index: 4)
M. Salnikov (citations: 266, h-index: 6)
Elena Tutubalina (citations: 62, h-index: 5)
Vasily Konovalov (citations: 67, h-index: 4)
Irina Nikishina (citations: 25, h-index: 3)
Alexander Panchenko (citations: 19, h-index: 2)
Viktor Moskvoretskii (citations: 24, h-index: 3)

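Trustworthiness is a core research challenge for agentic AI systems built on large language models (LLMs). To improve trust, natural language claims drawn from diverse sources, such as human-written text, web content, and model outputs, are commonly fact-checked by retrieving external knowledge and using an LLM to verify how faithful each claim is to the retrieved evidence. However, such methods are constrained by retrieval errors and by the availability of external data, and they do not fully exploit the model's intrinsic fact-verification capabilities. We propose fact-checking without retrieval, focusing on verifying arbitrary natural language claims regardless of their source. To study this setting, we introduce a comprehensive evaluation framework that assesses generalization by testing robustness to (i) long-tail knowledge, (ii) variation in claim sources, (iii) multilinguality, and (iv) long-form generation. Experiments across 9 datasets, 18 methods, and 3 models show that logit-based approaches often underperform those that leverage internal model representations. Building on this result, we propose INTRA, a method that exploits interactions between internal representations and achieves state-of-the-art results with strong generalization. More broadly, this work shows the potential of retrieval-free fact-checking as a research direction that can complement retrieval-based frameworks, improve scalability, and allow such systems to serve as reward signals during training or as components integrated into the generation process.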

Original Abstract

Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). To enhance trust, natural language claims from diverse sources, including human-written text, web content, and model outputs, are commonly checked for factuality by retrieving external knowledge and using an LLM to verify the faithfulness of claims to the retrieved evidence. As a result, such methods are constrained by retrieval errors and external data availability, while leaving the models' intrinsic fact-verification capabilities largely unused. We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source. To study this setting, we introduce a comprehensive evaluation framework focused on generalization, testing robustness to (i) long-tail knowledge, (ii) variation in claim sources, (iii) multilinguality, and (iv) long-form generation. Across 9 datasets, 18 methods and 3 models, our experiments indicate that logit-based approaches often underperform compared to those that leverage internal model representations. Building on this finding, we introduce INTRA, a method that exploits interactions between internal representations and achieves state-of-the-art performance with strong generalization. More broadly, our work establishes fact-checking without retrieval as a promising research direction that can complement retrieval-based frameworks, improve scalability, and enable the use of such systems as reward signals during training or as components integrated into the generation process.
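
The abstract contrasts two families of retrieval-free scorers: logit-based approaches and approaches that read internal model representations. The sketch below illustrates that distinction only. It is not INTRA and not one of the 18 evaluated methods, whose details this page does not give. The function names, the two-token "True"/"False" setup, and the probe weights are all hypothetical, and the numbers are toy inputs rather than real model outputs.

```python
import math
from typing import Sequence


def logit_based_score(true_logit: float, false_logit: float) -> float:
    """Logit-based scoring (assumed setup): probability of the "True" answer
    token under a two-way softmax over the "True"/"False" token logits."""
    m = max(true_logit, false_logit)  # subtract the max for numerical stability
    e_true = math.exp(true_logit - m)
    e_false = math.exp(false_logit - m)
    return e_true / (e_true + e_false)


def probe_score(hidden: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Representation-based scoring (assumed setup): a logistic probe applied
    to one hidden-state vector. The weights would normally be fit on labeled
    claims; here they are placeholders supplied by the caller."""
    if len(hidden) != len(weights):
        raise ValueError("hidden state and probe weights must have equal length")
    z = sum(h * w for h, w in zip(hidden, weights)) + bias
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Toy values standing in for quantities extracted from an LLM.
    print(logit_based_score(true_logit=2.0, false_logit=0.0))
    print(probe_score(hidden=[0.3, -1.2, 0.8], weights=[1.0, 0.5, -0.25], bias=0.1))
```

Both functions return a value in [0, 1] that can be thresholded to accept or reject a claim. The difference lies in where the signal comes from: the final output distribution in the first case, an intermediate activation vector in the second.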

1 Citation

0 Influential

4.5 Altmetric

23.5 Score
