2604.14712v1 Apr 16, 2026 cs.AI

SGA-MCTS: Decoupling Planning from Execution via Training-Free Atomic Experience Retrieval

Wuguannan Yao (citations: 73, h-index: 4)

Xiang Qi (citations: 46, h-index: 4)

Xinghong Xie (citations: 33, h-index: 2)

Peng Zhang (citations: 39, h-index: 3)

Dongyun Xue (citations: 7, h-index: 1)

Mingxiao Feng (citations: 13, h-index: 2)

Weng Zhou (citations: 169, h-index: 7)

Houqiang Li (citations: 1,297, h-index: 20)

Abstract

LLM-powered systems require complex multi-step decision-making abilities to solve real-world tasks, yet current planning approaches face a trade-off between the high latency of inference-time search and the limited generalization of supervised fine-tuning. To address this limitation, we introduce **SGA-MCTS**, a framework that casts LLM planning as non-parametric retrieval. Offline, we leverage Monte Carlo Tree Search (MCTS) to explore the solution space and distill high-fidelity trajectories into State-Goal-Action (SGA) atoms. These atoms are de-lexicalized primitives that abstract concrete entities into symbolic slots, preserving reusable causal logic while discarding domain-specific noise. Online, a retrieval-augmented agent employs a hybrid symbolic-semantic mechanism to fetch relevant SGAs and re-ground them into the current context as soft reasoning hints. Empirical results on complex benchmarks demonstrate that this paradigm enables frozen, open-weights models to match the performance of SOTA systems (e.g., GPT-5) without task-specific fine-tuning. By effectively amortizing the heavy computational cost of search, SGA-MCTS achieves System 2 reasoning depth at System 1 inference speeds, rendering autonomous planning both scalable and real-time feasible.
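To make the offline/online split in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy representation of an SGA atom, a dictionary-based de-lexicalization step, and a hybrid score that mixes slot overlap (symbolic) with word overlap (a crude stand-in for embedding similarity). The names `SGAAtom`, `delexicalize`, `hybrid_score`, `retrieve`, `reground`, and the weight `alpha` are all hypothetical, and the MCTS stage that would produce the atoms is omitted.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class SGAAtom:
    """Hypothetical container for one de-lexicalized State-Goal-Action triple."""
    state: str
    goal: str
    action: str


def delexicalize(text: str, entities: dict[str, str]) -> str:
    """Replace concrete entities with symbolic slots, e.g. 'Berlin' -> '<CITY>'."""
    # Longest entity first, so a short name never clobbers part of a longer one.
    for value in sorted(entities, key=len, reverse=True):
        text = text.replace(value, f"<{entities[value]}>")
    return text


def slots(text: str) -> set[str]:
    """Slot types mentioned in a de-lexicalized string."""
    return set(re.findall(r"<([A-Z_]+)>", text))


def words(text: str) -> set[str]:
    """Lower-cased words with slots stripped out."""
    return set(re.findall(r"[a-z]+", re.sub(r"<[A-Z_]+>", " ", text).lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_score(state: str, goal: str, atom: SGAAtom, alpha: float = 0.5) -> float:
    """Assumed scoring rule: weighted mix of slot overlap and word overlap."""
    query = f"{state} {goal}"
    key = f"{atom.state} {atom.goal}"
    symbolic = jaccard(slots(query), slots(key))
    semantic = jaccard(words(query), words(key))  # stand-in for embeddings
    return alpha * symbolic + (1 - alpha) * semantic


def retrieve(state: str, goal: str, memory: list[SGAAtom], k: int = 1) -> list[SGAAtom]:
    """Return the k highest-scoring atoms for the current state and goal."""
    ranked = sorted(memory, key=lambda a: hybrid_score(state, goal, a), reverse=True)
    return ranked[:k]


def reground(atom: SGAAtom, bindings: dict[str, str]) -> str:
    """Fill slots with current entities and phrase the action as a soft hint."""
    action = atom.action
    for slot, value in bindings.items():
        action = action.replace(f"<{slot}>", value)
    return f"Hint: in a similar situation, a useful step was: {action}"
```

In this sketch the retrieved action is returned as a text hint for the frozen model's prompt rather than executed directly, which mirrors the abstract's description of SGAs as "soft reasoning hints". A real system would presumably replace the word-overlap term with learned embeddings and populate the memory from MCTS trajectories.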

