2604.15741v1 Apr 17, 2026 cs.CL

Learning Uncertainty from Sequential Internal Dispersion in Large Language Models

A. Luu

Citations: 5,973

h-index: 34

Ponhvoan Srey

Citations: 4

h-index: 1

Xiaobao Wu

Shanghai Jiao Tong University

Citations: 1,110

h-index: 18

Cong-Duy Nguyen

Citations: 300

h-index: 10

Abstract

Uncertainty estimation is a promising approach to detect hallucinations in large language models (LLMs). Recent approaches commonly depend on a model's internal states to estimate uncertainty. However, they suffer from strict assumptions on how hidden states should evolve across layers, and from information loss caused by focusing solely on the last token or the mean over tokens. To address these issues, we present Sequential Internal Variance Representation (SIVR), a supervised hallucination detection framework that leverages token-wise, layer-wise features derived from hidden states. Rather than relying on specific assumptions about how hidden states evolve, SIVR adopts a more basic assumption that uncertainty manifests in the degree of dispersion or variance of internal representations across layers, which makes the method model- and task-agnostic. It additionally aggregates the full sequence of per-token variance features, learning temporal patterns indicative of factual errors and thereby preventing information loss. Experimental results demonstrate that SIVR consistently outperforms strong baselines. Most importantly, SIVR enjoys stronger generalisation and avoids relying on large training sets, highlighting the potential for practical deployment. Our code repository is available online at https://github.com/ponhvoan/internal-variance.
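
To make the central idea concrete, the following is a minimal sketch of per-token, cross-layer dispersion features. It is not taken from the authors' repository. The function name `token_dispersion_features`, the input layout `(num_layers, num_tokens, hidden_dim)`, and the two statistics chosen (variance across layers, and mean size of the layer-to-layer change) are all assumptions made for illustration; the abstract does not specify the exact features SIVR uses or the sequence model that consumes them.

```python
import numpy as np


def token_dispersion_features(hidden_states: np.ndarray) -> np.ndarray:
    """Illustrative per-token dispersion features (hypothetical, not the paper's exact features).

    hidden_states: array of shape (num_layers, num_tokens, hidden_dim),
        assumed to hold one hidden vector per layer and token.
    Returns an array of shape (num_tokens, 2), one feature row per token:
        column 0: variance across layers, averaged over hidden dimensions
        column 1: mean L2 norm of the change between consecutive layers
    """
    if hidden_states.ndim != 3:
        raise ValueError("expected shape (num_layers, num_tokens, hidden_dim)")
    if hidden_states.shape[0] < 2:
        raise ValueError("need at least two layers to measure dispersion")

    # Population variance over the layer axis, then mean over hidden dimensions.
    layer_variance = hidden_states.var(axis=0).mean(axis=-1)  # (num_tokens,)

    # Size of each layer-to-layer step, averaged over the steps.
    steps = np.diff(hidden_states, axis=0)  # (num_layers - 1, num_tokens, hidden_dim)
    mean_step_norm = np.linalg.norm(steps, axis=-1).mean(axis=0)  # (num_tokens,)

    # Keep the token axis so a downstream sequence model can see the full sequence.
    return np.stack([layer_variance, mean_step_norm], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_hidden = rng.normal(size=(4, 6, 8))  # 4 layers, 6 tokens, 8 dims (synthetic data)
    features = token_dispersion_features(fake_hidden)
    print(features.shape)
```

The sketch returns one feature row per token instead of pooling to a single vector, which mirrors the abstract's point about avoiding the information loss of last-token or mean-token summaries. In a supervised setup such as the one described, these per-token rows would be fed to a sequence classifier trained on hallucination labels; that classifier is omitted here because the abstract does not describe its architecture.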

0 Citations

0 Influential

37 Altmetric

185.0 Score
