2604.17884v1 Apr 20, 2026 cs.AI

SPREG: Structured Plan Repair with Entropy-Guided Test-Time Intervention for Large Language Model Reasoning

Wei Lin (Citations: 195, h-index: 7)

Shuaiting Chen (Citations: 161, h-index: 6)

Yu Ming (Citations: 0, h-index: 0)

Xinyue Yu (Citations: 9, h-index: 1)

Wenjie Wang (Citations: 15, h-index: 3)

Xuan Wang (Citations: 42, h-index: 2)

Xinhao Zhong (Citations: 51, h-index: 5)

Original Abstract

Large Language Models (LLMs) are prone to logical hallucinations and stochastic drifts during long-chain reasoning. While Classifier-Free Guidance (CFG) can improve instruction adherence, standard static implementations often cause semantic dilution and linguistic degradation. We propose SPREG (Structured Plan-guided Real-time Entropy Gating), a lightweight inference-time framework for surgical error rectification. SPREG employs an adaptive dual-threshold mechanism to monitor real-time entropy, identifying sudden "entropy spikes" as reliable indicators of logical failure. Upon detection, it triggers a dynamic repair by replacing uninformative null-priors with reference distributions synthesized from historical high-confidence states. By modulating guidance intensity according to structured reasoning stages (e.g., Action, Observation), SPREG steers the model back to a stable manifold without compromising fluency. Our experiments demonstrate significant gains, notably a 20.0% absolute accuracy improvement on AIME25, while effectively suppressing uncontrolled entropy drift in complex tasks.
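The abstract does not give formulas, so the following is only a rough sketch of how entropy-gated guidance of this kind might look, not the authors' implementation. It assumes that the "dual threshold" means an absolute entropy bound combined with a jump over a running baseline, and that the repair step is the usual CFG extrapolation with the null prior swapped for a reference distribution. All names (`EntropySpikeDetector`, `guided_logits`, `STAGE_WEIGHTS`) and all numeric values are hypothetical.

```python
import math
from collections import deque


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class EntropySpikeDetector:
    """Hypothetical dual-threshold gate (an assumption, not from the paper).

    A step counts as a spike only if its entropy exceeds an absolute bound
    AND rises above the running mean of recent non-spike steps by a margin.
    """

    def __init__(self, abs_threshold=1.0, jump_threshold=0.5, window=8):
        self.abs_threshold = abs_threshold
        self.jump_threshold = jump_threshold
        self.history = deque(maxlen=window)

    def update(self, h):
        # With no history yet, the baseline is the current value (no jump).
        baseline = sum(self.history) / len(self.history) if self.history else h
        spike = h > self.abs_threshold and (h - baseline) > self.jump_threshold
        if not spike:
            # Only confident steps feed the baseline.
            self.history.append(h)
        return spike


# Hypothetical per-stage guidance weights; the values are illustrative only.
STAGE_WEIGHTS = {"Action": 2.0, "Observation": 1.2}


def guided_logits(cond_logits, ref_logits, weight):
    """CFG-style extrapolation away from a reference distribution.

    Standard CFG uses an unconditional (null-prompt) branch here; this sketch
    substitutes reference logits, assumed to come from earlier
    high-confidence states.
    """
    return [r + weight * (c - r) for c, r in zip(cond_logits, ref_logits)]
```

In a decoding loop, one would call `detector.update(entropy(softmax(logits)))` at each step and apply `guided_logits(..., STAGE_WEIGHTS[stage])` only on steps where it returns `True`, leaving confident steps untouched.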
