2604.04565v1 Apr 06, 2026 cs.CL

PassiveQA: A Three-Action Framework for Epistemically Calibrated Question Answering via Supervised Finetuning

Original Abstract

Large Language Models (LLMs) have achieved strong performance in question answering and retrieval-augmented generation (RAG), yet they implicitly assume that user queries are fully specified and answerable. In real-world settings, queries are often incomplete, ambiguous, or missing critical variables, leading models to produce overconfident or hallucinated responses. In this work, we study decision-aware query resolution under incomplete information, where a model must determine whether to Answer, Ask for clarification, or Abstain. We show that standard and enhanced RAG systems do not reliably exhibit such epistemic awareness, defaulting to answer generation even when information is insufficient. To address this, we propose PassiveQA, a three-action framework that aligns model behaviour with information sufficiency through supervised finetuning. Our approach integrates structured information-state representations, knowledge graph-grounded context, and a finetuned planner that explicitly models missing variables and decision reasoning. Experiments across multiple QA datasets show that the finetuned planner achieves significant improvements in macro F1 and abstention recall while reducing hallucination rates, under a compute-constrained training regime. These results provide strong empirical evidence that epistemic decision-making must be learned during training rather than imposed at inference time.
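The abstract reports macro F1 and abstention recall over the three actions (Answer, Ask, Abstain). As a minimal sketch of how those two metrics are commonly computed for a three-class decision task, the Python below may help; it is not the authors' evaluation code. The label strings, the function names `per_class_scores` and `macro_f1`, and the toy `gold`/`pred` lists are all assumptions made for illustration and do not come from the paper.

```python
from collections import Counter

# Hypothetical label names for the three actions described in the abstract.
ACTIONS = ("answer", "ask", "abstain")


def per_class_scores(gold, pred):
    """Return {action: (precision, recall, f1)} for each of the three actions."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted action p, but it was wrong
            fn[g] += 1  # missed the gold action g
    scores = {}
    for a in ACTIONS:
        predicted = tp[a] + fp[a]
        actual = tp[a] + fn[a]
        precision = tp[a] / predicted if predicted else 0.0
        recall = tp[a] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[a] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare actions count as much as common ones."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels for illustration only; these are not results from the paper.
gold = ["answer", "ask", "abstain", "abstain", "answer", "ask"]
pred = ["answer", "answer", "abstain", "answer", "answer", "ask"]

scores = per_class_scores(gold, pred)
abstention_recall = scores["abstain"][1]
print(f"macro F1 = {macro_f1(scores):.3f}, abstention recall = {abstention_recall:.3f}")
```

In this toy example the predictor over-uses the answer action, which is the failure mode the abstract attributes to RAG systems: recall on the abstain class drops even though every gold answer case is handled correctly. Because macro averaging weights each action equally, that drop is visible in the aggregate score even when abstain cases are rare.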
