2604.19149v1 Apr 21, 2026 cs.CL

How Do Answer Tokens Read Reasoning Traces? Self-Reading Patterns in Thinking LLMs for Quantitative Reasoning

Hao-Yuan Chen (Citations: 6, h-index: 1)

Yi Liu (Citations: 110, h-index: 4)

Tao Zhang (Citations: 36, h-index: 1)

Chengfu Huo (Citations: 264, h-index: 6)

Wei Hu (Citations: 51, h-index: 4)

Jianzhi Shao (Citations: 35, h-index: 2)

Original Abstract

Thinking LLMs produce reasoning traces before answering. Prior activation steering work mainly targets shaping these traces. It remains less understood how answer tokens actually read and integrate the reasoning to produce reliable outcomes. Focusing on quantitative reasoning, we analyze the answer-to-reasoning attention and observe a benign self-reading pattern aligned with correctness, characterized by a forward drift of the reading focus along the reasoning trace and a persistent concentration on key semantic anchors, whereas incorrect solutions exhibit diffuse and irregular attention patterns. We interpret this as internal certainty during answer decoding, where the model commits to a viable solution branch and integrates key evidence. Building on this observation, we propose a training-free steering method driven by Self-Reading Quality (SRQ) scores, which combine geometric metrics for process control with semantic metrics for content monitoring. SRQ selects data to build steering vectors that guide inference toward benign self-reading and away from uncertain and disorganized reading. Experiments show that our method yields consistent accuracy gains.
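
The abstract does not define the geometric metrics or the SRQ formula, so the following is only a minimal sketch of how "forward drift" and "diffuse attention" could be quantified, plus a difference-of-means steering vector, which is a common construction in activation steering and not confirmed to be the authors' method. All function names, the attention matrix layout (answer tokens by reasoning tokens), and the choice of centroid slope and normalized entropy are assumptions made for illustration.

```python
import numpy as np

def reading_centroids(attn):
    """Attention-weighted mean position in the reasoning trace, per answer token.

    attn: array of shape (num_answer_tokens, num_reasoning_tokens) holding
    non-negative answer-to-reasoning attention weights (assumed layout).
    """
    attn = np.asarray(attn, dtype=float)
    probs = attn / attn.sum(axis=1, keepdims=True)    # renormalize over the trace
    positions = np.linspace(0.0, 1.0, attn.shape[1])  # relative trace position
    return probs @ positions

def forward_drift(attn):
    """Slope of the reading centroid over answer steps (hypothetical metric).

    A positive slope means the reading focus moves forward along the trace.
    """
    centroids = reading_centroids(attn)
    steps = np.arange(len(centroids))
    return float(np.polyfit(steps, centroids, 1)[0])

def mean_entropy(attn):
    """Average normalized attention entropy (hypothetical metric).

    Values near 0 indicate concentrated reading; values near 1 indicate
    diffuse reading spread evenly over the trace.
    """
    attn = np.asarray(attn, dtype=float)
    probs = attn / attn.sum(axis=1, keepdims=True)
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return float(ent.mean() / np.log(attn.shape[1]))

def steering_vector(good_acts, bad_acts):
    """Difference of mean activations between two selected sample groups.

    good_acts, bad_acts: arrays of shape (num_samples, hidden_dim). How the
    samples are selected (e.g. by an SRQ-like score) is left unspecified here.
    """
    return np.mean(good_acts, axis=0) - np.mean(bad_acts, axis=0)
```

Under these assumptions, an attention matrix whose mass steps steadily forward along the trace gives a positive `forward_drift` and a low `mean_entropy`, while uniform attention gives zero drift and entropy close to 1. A score in the spirit of SRQ could rank samples by such quantities before `steering_vector` is computed, though the actual scoring and the semantic metrics are described only in the paper itself.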
