2604.17837v1 Apr 20, 2026 cs.AI

Polysemantic Experts, Monosemantic Paths: Routing as Control in MoEs

Charles Ye

Citations: 0

h-index: 0

Bo Yuan

Citations: 14

h-index: 2

Lee Sharkey

Citations: 1,073

h-index: 2

Original Abstract

An LLM's residual stream is both state and instruction: it encodes the current context and determines the next transformation. We introduce a parameter-free decomposition for Mixture-of-Experts models that splits each layer's hidden state into a control signal that causally drives routing and an orthogonal content channel invisible to the router. Across six MoE architectures, we find that models preserve surface-level features (language, token identity, position) in the content channel, while the control signal encodes an abstract function that rotates from layer to layer. Because each routing decision is low-bandwidth, this hand-off forces compositional specialization across layers. While individual experts remain polysemantic, expert paths become monosemantic, clustering tokens by semantic function across languages and surface forms. The same token (e.g., ":") follows distinct trajectories depending on whether it serves as a type annotation, an introductory colon, or a time separator. Our decomposition identifies the source of this structure: clusters in the control subspace are substantially more monosemantic than those in the full representation. As a result, the natural unit of interpretability in MoEs is not the expert but the trajectory.
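The abstract does not say how the decomposition is computed. As a minimal sketch, assuming a purely linear router whose logits are `W @ h`, one parameter-free way to get such a split is to project the hidden state onto the row space of the router weights (the control part) and keep the remainder (the content part), which by construction produces zero router logits. The function names, the toy weights, and the choice of Gram-Schmidt below are illustrative assumptions, not the authors' method.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(rows, tol=1e-10):
    """Gram-Schmidt over the router weight rows (hypothetical helper).

    Returns an orthonormal basis for the span of the rows, skipping
    rows that are numerically dependent on earlier ones.
    """
    basis = []
    for row in rows:
        v = list(row)
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > tol:
            basis.append([x / norm for x in v])
    return basis

def split_hidden_state(h, router_weights):
    """Split h into (control, content) with h = control + content.

    control lies in the row space of the router weights; content is
    orthogonal to every router row, so a linear router cannot see it.
    """
    control = [0.0] * len(h)
    for b in orthonormal_basis(router_weights):
        c = dot(h, b)
        control = [x + c * y for x, y in zip(control, b)]
    content = [x - y for x, y in zip(h, control)]
    return control, content

def router_logits(h, router_weights):
    """Logits of an assumed bias-free linear router."""
    return [dot(w, h) for w in router_weights]

# Toy example (made-up numbers): 2 experts, hidden size 4.
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 2.0, 0.0, 0.0]]
h = [3.0, -1.0, 1.0, 5.0]

control, content = split_hidden_state(h, W)
print("control:", control)
print("content:", content)
print("logits from h:      ", router_logits(h, W))
print("logits from control:", router_logits(control, W))
print("logits from content:", router_logits(content, W))
```

In this toy setup the control part alone reproduces the router logits of the full hidden state, while the content part contributes nothing to them. Real routers may add biases, normalization, or top-k gating, which this sketch ignores.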
