2603.12740v1 Mar 13, 2026 cs.AI

ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning

Shuhe Wang

Citations: 17

h-index: 2

Soyeon Caren Han

Citations: 30

h-index: 3

Eduard H. Hovy

Citations: 36

h-index: 2

Shuo Yang

Citations: 50

h-index: 4

Yihao Ding

Citations: 272

h-index: 8

Original Abstract

Large Language Model (LLM) agents are increasingly applied to complex, multi-step tasks that require interaction with diverse external tools across various domains. However, current LLM agent tool planning methods typically rely on greedy, reactive tool selection strategies that lack foresight and fail to account for inter-tool dependencies. In this paper, we present ToolTree, a novel Monte Carlo tree search-inspired planning paradigm for tool planning. ToolTree explores possible tool usage trajectories using a dual-stage LLM evaluation and bidirectional pruning mechanism that enables the agent to make informed, adaptive decisions over extended tool-use sequences while pruning less promising branches before and after the tool execution. Empirical evaluations across both open-set and closed-set tool planning tasks on 4 benchmarks demonstrate that ToolTree consistently improves performance while keeping the highest efficiency, achieving an average gain of around 10% compared to the state-of-the-art planning paradigm.
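The abstract gives no implementation detail, so the following is only a rough sketch of how a tree search with scoring before and after tool execution, and pruning at both points, might be arranged. It is not the authors' method. Every name here (`Node`, `plan_tools`, `pre_score`, `post_score`, `run_tool`, the `pre_min` and `post_min` thresholds, and the toy tool names) is a hypothetical stand-in; in particular, the two scoring functions are simple deterministic stubs where the paper would use LLM evaluation.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    plan: tuple                        # tool names chosen so far
    reward: float = 0.0                # post-execution score of this plan
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    untried: list = field(default_factory=list)
    visits: int = 0
    total: float = 0.0                 # sum of rewards backed up through this node

def ucb(child, parent_visits, c=1.4):
    # Standard UCB1: average reward plus an exploration bonus.
    return child.total / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def plan_tools(tools, pre_score, run_tool, post_score,
               max_depth=2, iterations=30, pre_min=0.3, post_min=0.3):
    root = Node(plan=(), untried=list(tools))
    best = root
    for _ in range(iterations):
        # Selection: descend through fully expanded nodes by UCB.
        node = root
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb(ch, parent.visits))
        reward = node.reward
        if node.untried:
            plan = node.plan + (node.untried.pop(),)
            reward = 0.0
            # Pruning before execution: skip the tool call if the plan looks unpromising.
            if pre_score(plan) >= pre_min:
                reward = post_score(plan, run_tool(plan))
                # Pruning after execution: keep the branch only if the result scored well.
                if reward >= post_min:
                    more = list(tools) if len(plan) < max_depth else []
                    child = Node(plan, reward, node, untried=more)
                    node.children.append(child)
                    node = child
                    if reward > best.reward:
                        best = child
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best.plan, best.reward

# Toy setup (assumed for illustration): the ideal plan is search, then calc.
TARGET = ("search", "calc")
calls = []                             # records which plans actually executed a tool

def pre_score(plan):
    # Stub for a pre-execution judgment: "translate" is never relevant here.
    return 0.0 if plan[-1] == "translate" else 1.0

def run_tool(plan):
    calls.append(plan)
    return " -> ".join(plan)           # placeholder tool output

def post_score(plan, output):
    # Stub for a post-execution judgment: fraction of steps matching TARGET.
    return sum(a == b for a, b in zip(plan, TARGET)) / len(TARGET)

best_plan, best_reward = plan_tools(["search", "calc", "translate"],
                                    pre_score, run_tool, post_score)
print(best_plan, best_reward)          # ('search', 'calc') 1.0
```

In this toy run, plans ending in `translate` are discarded before any tool call, and the plan starting with `calc` is discarded after its result scores poorly, so only four tool executions are recorded in `calls`.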
