2604.25313v1 Apr 28, 2026 cs.CL

Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models

Li Ju

Junzhe Wang

Qi Zhang

Original Abstract

Retrieval-Augmented Generation (RAG) models frequently produce answers grounded in parametric memory rather than the retrieved context, undermining the core promise of retrieval augmentation. A fundamental obstacle to fixing this unfaithfulness is the lack of training data that explicitly requires models to prefer context over internal knowledge. We introduce Faithfulness-QA, a large-scale dataset of 99,094 samples constructed through counterfactual entity substitution. Starting from two established extractive QA benchmarks, SQuAD and TriviaQA, we automatically identify answer-bearing named entities in each context, replace them with type-consistent alternatives drawn from a curated bank of 76,953 entities, and thereby manufacture controlled knowledge conflicts between context and parametric memory. Rigorous quality filtering yields 100% pass rates on four automated checks in audits of 200 randomly drawn samples. We release the full dataset, the construction pipeline, and a typed entity bank covering eight named entity categories. Faithfulness-QA is designed as a training resource for attention-based faithfulness objectives and as an evaluation benchmark for measuring context-grounding behavior in RAG systems. Data and code are available at https://github.com/qzhangFDU/faithfulness-qa-dataset.
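
To make the idea of counterfactual entity substitution concrete, the following is a minimal Python sketch and not the authors' released pipeline. The tiny `ENTITY_BANK`, the function names `substitute_answer` and `passes_checks`, and the four checks shown are all illustrative assumptions; the abstract does not specify which four automated checks the real pipeline runs, nor how entity types are detected (here the type is simply passed in).

```python
import random

# Hypothetical miniature entity bank keyed by entity type.
# The real bank described in the abstract holds 76,953 entities in eight categories.
ENTITY_BANK = {
    "PERSON": ["Marie Curie", "Alan Turing", "Ada Lovelace"],
    "GPE": ["Lisbon", "Nairobi", "Oslo"],
}


def substitute_answer(context, answer, entity_type, rng):
    """Swap the answer entity for a type-consistent alternative.

    Returns (new_context, new_answer), or None when no substitution is possible.
    """
    if answer not in context:
        return None  # extractive QA: the answer must appear in the context
    candidates = [
        e for e in ENTITY_BANK.get(entity_type, [])
        if e.lower() != answer.lower() and e not in context
    ]
    if not candidates:
        return None
    new_answer = rng.choice(candidates)
    # Naive string replacement; a real pipeline would likely use entity spans.
    new_context = context.replace(answer, new_answer)
    return new_context, new_answer


def passes_checks(context, answer, new_context, new_answer):
    """Four illustrative quality checks (assumed, not taken from the paper)."""
    return (
        new_answer != answer             # the answer actually changed
        and new_answer in new_context    # the new answer is extractable
        and answer not in new_context    # the original answer no longer leaks
        and new_context != context       # the context was actually edited
    )


if __name__ == "__main__":
    context = "Paris is the capital of France. Paris lies on the Seine."
    result = substitute_answer(context, "Paris", "GPE", random.Random(0))
    if result is not None:
        new_context, new_answer = result
        print(new_context)
        print(new_answer, passes_checks(context, "Paris", new_context, new_answer))
```

The rewritten context now contradicts what a model is likely to have memorized, so a context-faithful model should answer the question with the substituted entity rather than "Paris".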
