2601.09465v1 Jan 14, 2026 cs.AI

EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines

Shuo Zhang (Citations: 79, h-index: 5)

Chaofa Yuan (Citations: 2, h-index: 1)

Ryan Guo (Citations: 1, h-index: 1)

Xiaomin Yu (Citations: 15, h-index: 3)

Rui Xu (Citations: 12, h-index: 2)

Zhi Yang (Citations: 16, h-index: 2)

Shuhao Guan (Citations: 26, h-index: 2)

Zhenheng Tang (Citations: 28, h-index: 3)

Sen Hu (Citations: 135, h-index: 6)

Ronghao Chen (Citations: 130, h-index: 6)

Huacan Wang (Citations: 82, h-index: 5)

Liwen Zhang (Citations: 11, h-index: 2)

Zhangquan Chen (Citations: 213, h-index: 8)

Zinuo Li (Citations: 343, h-index: 10)

Original Abstract

While LLM-based agents have shown promise for deep research, most existing approaches rely on fixed workflows that struggle to adapt to real-world, open-ended queries. Recent work therefore explores self-evolution by allowing agents to rewrite their own code or prompts to improve problem-solving ability, but unconstrained optimization often triggers instability, hallucinations, and instruction drift. We propose EvoFSM, a structured self-evolving framework that achieves both adaptability and control by evolving an explicit Finite State Machine (FSM) instead of relying on free-form rewriting. EvoFSM decouples the optimization space into macroscopic Flow (state-transition logic) and microscopic Skill (state-specific behaviors), enabling targeted improvements under clear behavioral boundaries. Guided by a critic mechanism, EvoFSM refines the FSM through a small set of constrained operations, and further incorporates a self-evolving memory that distills successful trajectories as reusable priors and failure patterns as constraints for future queries. Extensive evaluations on five multi-hop QA benchmarks demonstrate the effectiveness of EvoFSM. In particular, EvoFSM reaches 58.0% accuracy on the DeepSearch benchmark. Additional results on interactive decision-making tasks further validate its generalization.
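
The Flow/Skill split and the "small set of constrained operations" can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the abstract does not name the operations, so `add_state` and `revise_skill`, the `FSM` and `Memory` classes, and the example states `search`, `verify`, and `answer` are all hypothetical. The point is only that edits are restricted to a whitelist, so Flow changes and Skill changes stay separate and bounded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FSM:
    # Flow: state-transition logic, stored as state -> set of successor states.
    flow: Dict[str, Set[str]] = field(default_factory=dict)
    # Skill: state-specific behavior, reduced here to one instruction string.
    skill: Dict[str, str] = field(default_factory=dict)

    def add_state(self, name: str, instruction: str, after: str, before: str) -> None:
        """Hypothetical Flow edit: splice a new state into the edge after -> before."""
        if name in self.skill:
            raise ValueError(f"state {name!r} already exists")
        if before not in self.flow.get(after, set()):
            raise ValueError(f"no transition {after!r} -> {before!r}")
        self.flow[after].remove(before)
        self.flow[after].add(name)
        self.flow[name] = {before}
        self.skill[name] = instruction

    def revise_skill(self, state: str, instruction: str) -> None:
        """Hypothetical Skill edit: change one state's behavior, leaving Flow untouched."""
        if state not in self.skill:
            raise KeyError(state)
        self.skill[state] = instruction


# Assumed whitelist: any edit outside it is rejected instead of applied.
ALLOWED_OPS = {"add_state", "revise_skill"}


def apply_edit(fsm: FSM, op: str, **kwargs) -> None:
    """Apply a critic-proposed edit only if it is one of the allowed operations."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"operation {op!r} is not allowed")
    getattr(fsm, op)(**kwargs)


@dataclass
class Memory:
    # Summaries of successful trajectories, reused as priors for later queries.
    priors: List[str] = field(default_factory=list)
    # Summaries of failure patterns, kept as constraints for later queries.
    constraints: List[str] = field(default_factory=list)

    def record(self, summary: str, success: bool) -> None:
        (self.priors if success else self.constraints).append(summary)


def build_initial_fsm() -> FSM:
    """A toy two-state starting workflow (assumed, not taken from the paper)."""
    return FSM(
        flow={"search": {"answer"}, "answer": set()},
        skill={
            "search": "Retrieve documents for the query.",
            "answer": "Answer using the retrieved documents.",
        },
    )
```

In this sketch, a critic that finds an unsupported answer could request `apply_edit(fsm, "add_state", name="verify", instruction="Check each claim against a source.", after="search", before="answer")`, while a request such as `apply_edit(fsm, "rewrite_everything")` raises `ValueError`. That rejection is the sense in which evolution stays inside clear behavioral boundaries.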

Citations: 1

Influential citations: 0

Altmetric: 5

Score: 26.0
