2602.01992v3 Feb 02, 2026 cs.AI

Emergent Analogical Reasoning in Transformers

Gouki Minegishi

Citations: 99

h-index: 6

Jingyuan Feng

Citations: 9

h-index: 1

Hiroki Furuta

Citations: 2,472

h-index: 15

Takeshi Kojima

Citations: 7,106

h-index: 6

Yusuke Iwasawa

Citations: 10,187

h-index: 21

Yutaka Matsuo

Citations: 1,364

h-index: 11

Abstract

Analogy is a central faculty of human intelligence, enabling abstract patterns discovered in one domain to be applied to another. Despite its central role in cognition, the mechanisms by which Transformers acquire and implement analogical reasoning remain poorly understood. In this work, inspired by the notion of functors in category theory, we formalize analogical reasoning as the inference of correspondences between entities across categories. Based on this formulation, we introduce synthetic tasks that evaluate the emergence of analogical reasoning under controlled settings. We find that the emergence of analogical reasoning is highly sensitive to data characteristics, optimization choices, and model scale. Through mechanistic analysis, we show that analogical reasoning in Transformers decomposes into two key components: (1) geometric alignment of relational structure in the embedding space, and (2) the application of a functor within the Transformer. These mechanisms enable models to transfer relational structure from one category to another, realizing analogy. Finally, we quantify these effects and find that the same trends are observed in pretrained LLMs. In doing so, we move analogy from an abstract cognitive notion to a concrete, mechanistically grounded phenomenon in modern neural networks.
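
As a rough illustration of what "inference of correspondences between entities across categories" can mean, the following minimal Python sketch searches for an entity mapping that preserves every labeled relation between two tiny relational structures. It is a toy under stated assumptions, not the paper's synthetic task or model: the entities, relation names, and helper functions (`preserves_structure`, `infer_correspondence`) are hypothetical, and the check covers only relation preservation on objects, a simplification of a full functor (it does not verify composition or identities).

```python
from itertools import permutations

# Toy illustration only: these categories, entities, and relation names are
# hypothetical and are not taken from the paper's synthetic tasks.
# Each "category" is a set of labeled edges: (source, relation, target).
SOURCE = {
    ("sun", "attracts", "planet"),
    ("planet", "orbits", "sun"),
}
TARGET = {
    ("nucleus", "attracts", "electron"),
    ("electron", "orbits", "nucleus"),
}


def preserves_structure(mapping, source, target):
    """Return True if every source edge, after mapping, is an edge of target."""
    return all(
        (mapping[a], rel, mapping[b]) in target
        for a, rel, b in source
    )


def infer_correspondence(source, target):
    """Brute-force search for an entity mapping that preserves all relations."""
    src_entities = sorted({x for a, _, b in source for x in (a, b)})
    tgt_entities = sorted({x for a, _, b in target for x in (a, b)})
    for perm in permutations(tgt_entities, len(src_entities)):
        mapping = dict(zip(src_entities, perm))
        if preserves_structure(mapping, source, target):
            return mapping
    return None  # no structure-preserving mapping exists


mapping = infer_correspondence(SOURCE, TARGET)
print(mapping)
```

Brute-force search is used here only because the example is tiny. The abstract's claim is that a Transformer realizes this kind of transfer through geometric alignment in the embedding space and the application of a functor, which this sketch does not model.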
