2602.00449v1 Jan 31, 2026 cs.AI

Do Latent-CoT Models Think Step-by-Step? A Mechanistic Study on Sequential Reasoning Tasks

Abstract

Latent Chain-of-Thought (Latent-CoT) aims to enable step-by-step computation without emitting long rationales, yet its mechanisms remain unclear. We study CODI, a continuous-thought teacher-student distillation model, on strictly sequential polynomial-iteration tasks. Using logit-lens decoding, linear probes, attention analysis, and activation patching, we localize intermediate-state representations and trace their routing to the final readout. On two- and three-hop tasks, CODI forms the full set of bridge states that become decodable across latent-thought positions, while the final input follows a separate near-direct route; predictions arise via late fusion at the end-of-thought boundary. For longer hop lengths, CODI does not reliably execute a full latent rollout, instead exhibiting a partial latent reasoning path that concentrates on late intermediates and fuses them with the last input at the answer readout position. Ablations show that this partial pathway can collapse under regime shifts, including harder optimization. Overall, we delineate when CODI-style latent-CoT yields faithful iterative computation versus compressed or shortcut strategies, and highlight challenges in designing robust latent-CoT objectives for sequential reasoning.
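To make the notions of a "strictly sequential" task and "bridge states" concrete, the following is a minimal sketch of a k-hop polynomial iteration. The abstract does not give the actual polynomial, modulus, or input format used in the study, so the recurrence `x_{t+1} = (a * x_t^2 + b) mod m`, the constants, and the function names here are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def iterate_polynomial(x0: int, hops: int, a: int = 3, b: int = 1,
                       modulus: int = 97) -> List[int]:
    """Return [x_1, ..., x_hops] for the hypothetical recurrence
    x_{t+1} = (a * x_t**2 + b) % modulus.

    Each state depends only on the previous one, so the final value
    cannot be computed without (implicitly) passing through every
    earlier state. That is what makes the task strictly sequential.
    """
    states = []
    x = x0
    for _ in range(hops):
        x = (a * x * x + b) % modulus
        states.append(x)
    return states


def split_bridge_and_answer(states: List[int]) -> Tuple[List[int], int]:
    """Separate intermediate (bridge) states from the final answer."""
    return states[:-1], states[-1]


if __name__ == "__main__":
    states = iterate_polynomial(x0=2, hops=3)
    bridge, answer = split_bridge_and_answer(states)
    print("bridge states:", bridge)
    print("answer:", answer)
```

Under this toy setup, a faithful latent rollout would correspond to every bridge state being decodable from some latent-thought position, whereas a partial pathway would expose only the later ones.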
