2603.14799v1 Mar 16, 2026 cs.LG

Universe Routing: Why Self-Evolving Agents Need Epistemic Control

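The critical failure mode of current lifelong agents is not a lack of knowledge but an inability to decide how to reason. Faced with the question "Is this coin fair?", an agent must choose between frequentist hypothesis testing and Bayesian posterior inference, two epistemologically incompatible frameworks. Mixing them produces not minor errors but structural failures that propagate through the entire decision chain. We formalize this as the "universe routing" problem: classifying a question into one of several mutually exclusive belief spaces before invoking a specialized solver. Our main results challenge conventional assumptions. (1) Hard routing to heterogeneous solvers matches the accuracy of soft MoE while running 7x faster, because epistemically incompatible frameworks cannot be meaningfully averaged. (2) A 465M-parameter router achieves a 2.3x smaller generalization gap than keyword-matching baselines, indicating semantic rather than surface-level reasoning. (3) When expanding to new belief spaces, rehearsal-based continual learning achieves zero forgetting and outperforms EWC by 75 percentage points, suggesting that modular epistemic architectures are fundamentally better suited to lifelong learning than regularization-based approaches. These results point to a broader architectural principle: reliable self-evolving agents may require an explicit epistemic control layer that governs the choice of reasoning framework.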

Original Abstract

A critical failure mode of current lifelong agents is not lack of knowledge, but the inability to decide how to reason. When an agent encounters "Is this coin fair?" it must recognize whether to invoke frequentist hypothesis testing or Bayesian posterior inference - frameworks that are epistemologically incompatible. Mixing them produces not minor errors, but structural failures that propagate across decision chains. We formalize this as the universe routing problem: classifying questions into mutually exclusive belief spaces before invoking specialized solvers. Our key findings challenge conventional assumptions: (1) hard routing to heterogeneous solvers matches soft MoE accuracy while being 7x faster because epistemically incompatible frameworks cannot be meaningfully averaged; (2) a 465M-parameter router achieves a 2.3x smaller generalization gap than keyword-matching baselines, indicating semantic rather than surface-level reasoning; (3) when expanding to new belief spaces, rehearsal-based continual learning achieves zero forgetting, outperforming EWC by 75 percentage points, suggesting that modular epistemic architectures are fundamentally more amenable to lifelong learning than regularization-based approaches. These results point toward a broader architectural principle: reliable self-evolving agents may require an explicit epistemic control layer that governs reasoning framework selection.
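The contrast between hard routing and soft mixing in finding (1) can be illustrated with a toy sketch. This is not the paper's implementation: the keyword scorer below is only a stand-in for the learned 465M-parameter router, and every name here (`route_scores`, `hard_route`, `soft_mixture`, the two solver functions) is hypothetical. The sketch assumes two belief spaces for the coin question, one solver that returns an exact two-sided binomial p-value and one that returns a posterior mean under a uniform Beta(1, 1) prior.

```python
import math
from typing import Callable, Dict, Tuple


def frequentist_solver(heads: int, flips: int) -> float:
    """Exact two-sided binomial p-value for H0: P(heads) = 0.5."""
    observed = math.comb(flips, heads)
    # Sum the probability of every outcome at most as likely as the observed one.
    tail = sum(
        math.comb(flips, k)
        for k in range(flips + 1)
        if math.comb(flips, k) <= observed
    )
    return tail / 2**flips


def bayesian_solver(heads: int, flips: int) -> float:
    """Posterior mean of P(heads) under an assumed uniform Beta(1, 1) prior."""
    return (heads + 1) / (flips + 2)


# Hypothetical registry: one solver per mutually exclusive belief space.
SOLVERS: Dict[str, Callable[[int, int], float]] = {
    "frequentist": frequentist_solver,
    "bayesian": bayesian_solver,
}

# Toy keyword cues; a stand-in for a learned router, not the paper's method.
CUES = {
    "frequentist": ("p-value", "significance", "reject", "null hypothesis"),
    "bayesian": ("prior", "posterior", "belief", "credible"),
}


def route_scores(question: str) -> Dict[str, float]:
    """Return normalized routing weights over the belief spaces."""
    q = question.lower()
    raw = {name: float(sum(cue in q for cue in cues)) for name, cues in CUES.items()}
    total = sum(raw.values())
    if total == 0:
        return {name: 1.0 / len(raw) for name in raw}  # no cue matched: uniform
    return {name: score / total for name, score in raw.items()}


def hard_route(question: str, heads: int, flips: int) -> Tuple[str, float]:
    """Pick exactly one belief space and call only its solver."""
    scores = route_scores(question)
    universe = max(scores, key=scores.get)  # ties resolve to the first key
    return universe, SOLVERS[universe](heads, flips)


def soft_mixture(question: str, heads: int, flips: int) -> float:
    """Weighted average over all solvers, shown only as a contrast.

    The result blends a p-value with a posterior mean, so it has no
    interpretation in either framework, and every solver must be run.
    """
    scores = route_scores(question)
    return sum(w * SOLVERS[name](heads, flips) for name, w in scores.items())


if __name__ == "__main__":
    print(hard_route("Can we reject the null hypothesis that this coin is fair?", 8, 10))
    print(hard_route("What is the posterior probability of heads given a uniform prior?", 8, 10))
    print(soft_mixture("Is this coin fair?", 8, 10))
```

In this sketch the hard router runs one solver per question, whereas the soft mixture runs all of them and returns a number that is neither a p-value nor a posterior. That matches the abstract's argument in spirit only; the 7x speedup figure comes from the paper's own experiments, not from this toy.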
