2603.13853v2 Mar 14, 2026 cs.CL

APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution

Kun Chen (citations: 3, h-index: 1)
Qingchao Kong (citations: 227, h-index: 8)
Feifei Zhao (citations: 14, h-index: 3)
Wenji Mao (citations: 122, h-index: 6)

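Retrieval-augmented generation (RAG) built on large language models (LLMs) is an important approach for retrieving and using external knowledge across many domains. For complex multi-hop questions, a single round of retrieval is often insufficient for accurate reasoning and problem solving. To improve search on such tasks, most existing work integrates multi-round iterative retrieval with the reasoning process through end-to-end training. These approaches substantially improve problem-solving performance, but difficulties remain in task reasoning and model training. In particular, ambiguous retrieval execution paths and sparse rewards during end-to-end reinforcement learning (RL) can lead to inaccurate retrieval results and degraded performance. To address these problems, this paper proposes APEX-Searcher, a new agentic planning and execution framework that strengthens the search capabilities of LLMs. Specifically, we introduce a two-stage agentic framework that separates the retrieval process into planning and execution. First, RL with decomposition-specific rewards optimizes strategic planning. Then, supervised fine-tuning on high-quality multi-hop trajectories equips the model with robust iterative sub-task execution. Extensive experiments show that the proposed framework achieves significant improvements in both multi-hop RAG and task planning performance across multiple benchmarks.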

Original Abstract

Retrieval-augmented generation (RAG), based on large language models (LLMs), serves as a vital approach to retrieving and leveraging external knowledge in various domain applications. When confronted with complex multi-hop questions, single-round retrieval is often insufficient for accurate reasoning and problem solving. To enhance search capabilities for complex tasks, most existing works integrate multi-round iterative retrieval with reasoning processes via end-to-end training. While these approaches significantly improve problem-solving performance, they still face challenges in task reasoning and model training. In particular, ambiguous retrieval execution paths and sparse rewards in the end-to-end reinforcement learning (RL) process lead to inaccurate retrieval results and performance degradation. To address these issues, we propose APEX-Searcher, a novel Agentic Planning and Execution framework to augment LLM search capabilities. Specifically, we introduce a two-stage agentic framework that decouples the retrieval process into planning and execution. It first employs RL with decomposition-specific rewards to optimize strategic planning. Building on the sub-task decomposition, it then applies supervised fine-tuning on high-quality multi-hop trajectories to equip the model with robust iterative sub-task execution capabilities. Extensive experiments demonstrate that our proposed framework achieves significant improvements in both multi-hop RAG and task planning performance across multiple benchmarks.
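
The separation of planning from execution described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no code, so every name here (`CORPUS`, `retrieve`, `toy_planner`, `plan_and_execute`, `Trace`) is hypothetical. The planner is a hard-coded stand-in for the RL-trained decomposition model, and the retriever is a plain word-overlap ranker over a three-sentence toy corpus rather than a real search backend. The only point is the control flow: decompose once, then run one retrieval round per sub-question.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Toy corpus standing in for an external knowledge source (illustrative data).
CORPUS = [
    "Inception was directed by Christopher Nolan.",
    "Christopher Nolan was born in London.",
    "London is the capital of the United Kingdom.",
]


def retrieve(query: str, k: int = 1) -> List[str]:
    """Rank passages by word overlap with the query (stand-in for a real retriever)."""
    q_words = set(query.lower().replace("?", "").split())
    ranked = sorted(
        CORPUS,
        key=lambda p: len(q_words & set(p.lower().replace(".", "").split())),
        reverse=True,
    )
    return ranked[:k]


def toy_planner(question: str) -> List[str]:
    """Planning stage: hard-coded decomposition of one multi-hop question.

    In the framework described above, a planner trained with
    decomposition-specific rewards would produce this list instead.
    """
    return [
        "Who directed Inception?",
        "Where was Christopher Nolan born?",
    ]


@dataclass
class Trace:
    """Record of the plan and the evidence gathered for each sub-question."""
    sub_questions: List[str]
    evidence: List[str] = field(default_factory=list)


def plan_and_execute(
    question: str,
    planner: Callable[[str], List[str]],
    retriever: Callable[[str], List[str]],
) -> Trace:
    """Plan once, then execute one retrieval round per sub-question."""
    trace = Trace(sub_questions=planner(question))
    for sub_question in trace.sub_questions:
        trace.evidence.extend(retriever(sub_question))
    return trace


if __name__ == "__main__":
    result = plan_and_execute(
        "Where was the director of Inception born?", toy_planner, retrieve
    )
    for sub_question, passage in zip(result.sub_questions, result.evidence):
        print(f"{sub_question} -> {passage}")
```

Keeping the planner and retriever as injected callables mirrors the decoupling the abstract argues for: either stage can be swapped or trained separately without touching the loop.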

Citations: 0, Influential: 0, Altmetric: 4, Score: 20.0
