2602.02486v1 Feb 02, 2026 cs.CL

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Miaosen Zhang (Citations: 406, h-index: 6)
Xin Geng (Citations: 29, h-index: 3)
Baining Guo (Citations: 46, h-index: 3)
Song Wang (Citations: 46, h-index: 3)
Jialiang Zhu (Citations: 105, h-index: 4)
Gongrui Zhang (Citations: 256, h-index: 3)
Xiaolong Ma (Citations: 152, h-index: 7)
Lin Xu (Citations: 832, h-index: 8)
Ruiqi Yang (Citations: 9, h-index: 2)
Kai Qiu (Citations: 856, h-index: 8)
Zhirong Wu (Citations: 215, h-index: 3)
Qi Dai (Citations: 156, h-index: 7)
Rui Ma (Citations: 117, h-index: 3)
Bei Liu (Citations: 14, h-index: 3)
Yifan Yang (Citations: 107, h-index: 6)
Chong Luo (Citations: 882, h-index: 10)
Zhengyuan Yang (Citations: 9,496, h-index: 36)
Linjie Li (Citations: 45, h-index: 3)
Lijuan Wang (Citations: 6, h-index: 2)
Weizhu Chen (Citations: 580, h-index: 5)

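LLM-based deep research agents are built primarily on the ReAct framework. This linear design makes it hard to revisit earlier states, branch into alternative search directions, or maintain overall situational awareness over long contexts, which often leads to local optima, redundant exploration, and inefficient search. This paper proposes Re-TRAC, an agentic framework that generates a structured state representation after each search trajectory, summarizing evidence, uncertainties, failures, and future plans, and generates subsequent trajectories based on it. This enables iterative reflection and globally informed planning, and reframes research as a progressive process. In experiments with frontier LLMs, Re-TRAC outperformed ReAct by 15-20% on the BrowseComp benchmark. For smaller models, the authors introduce Re-TRAC-aware supervised fine-tuning and achieve state-of-the-art performance at comparable scales. Notably, Re-TRAC's tool calls and token usage decrease steadily as the number of rounds increases, indicating that exploration becomes progressively more targeted through cross-trajectory reflection rather than repeating redundant search.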

Original Abstract

LLM-based deep research agents are largely built on the ReAct framework. This linear design makes it difficult to revisit earlier states, branch into alternative search directions, or maintain global awareness under long contexts, often leading to local optima, redundant exploration, and inefficient search. We propose Re-TRAC, an agentic framework that performs cross-trajectory exploration by generating a structured state representation after each trajectory to summarize evidence, uncertainties, failures, and future plans, and conditioning subsequent trajectories on this state representation. This enables iterative reflection and globally informed planning, reframing research as a progressive process. Empirical results show that Re-TRAC consistently outperforms ReAct by 15-20% on BrowseComp with frontier LLMs. For smaller models, we introduce Re-TRAC-aware supervised fine-tuning, achieving state-of-the-art performance at comparable scales. Notably, Re-TRAC shows a monotonic reduction in tool calls and token usage across rounds, indicating progressively targeted exploration driven by cross-trajectory reflection rather than redundant search.
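The loop the abstract describes (run a trajectory, compress it into a structured state, condition the next trajectory on that state) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `SearchState`, `re_trac_style_search`, `run_trajectory`, and `compress` are hypothetical, the stopping rule is an assumption, and the two `toy_*` functions are stand-ins for a real ReAct rollout and an LLM summarizer.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class SearchState:
    """Hypothetical structured state carried from one trajectory to the next."""
    evidence: List[str] = field(default_factory=list)
    uncertainties: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)
    plans: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the state as text that could be prepended to the next rollout."""
        sections = [
            ("Evidence", self.evidence),
            ("Uncertainties", self.uncertainties),
            ("Failures", self.failures),
            ("Plans", self.plans),
        ]
        return "\n".join(
            f"{name}: " + ("; ".join(items) if items else "(none)")
            for name, items in sections
        )


# A trajectory is modeled here as a list of step strings plus an optional answer.
Trajectory = Tuple[List[str], Optional[str]]


def re_trac_style_search(
    question: str,
    run_trajectory: Callable[[str, SearchState], Trajectory],
    compress: Callable[[SearchState, List[str]], SearchState],
    max_rounds: int = 3,
) -> Tuple[Optional[str], SearchState]:
    """Run up to max_rounds trajectories, each conditioned on the compressed state."""
    state = SearchState()
    answer: Optional[str] = None
    for _ in range(max_rounds):
        steps, answer = run_trajectory(question, state)
        state = compress(state, steps)
        # Assumed stopping rule: an answer exists and nothing is left unresolved.
        if answer is not None and not state.uncertainties:
            break
    return answer, state


def toy_run(question: str, state: SearchState) -> Trajectory:
    """Stand-in rollout: the first round hits a dead end, later rounds follow the plan."""
    if not state.plans:
        return ["search: broad query", "dead end: source A"], None
    return ["search: " + state.plans[0], "found: answer in source B"], "source B"


def toy_compress(state: SearchState, steps: List[str]) -> SearchState:
    """Stand-in summarizer: sort raw steps into the structured state fields."""
    new = SearchState(list(state.evidence), [], list(state.failures), [])
    for step in steps:
        if step.startswith("dead end: "):
            new.failures.append(step[len("dead end: "):])
            new.uncertainties.append("answer not located yet")
            new.plans.append("narrow query, skip source A")
        elif step.startswith("found: "):
            new.evidence.append(step[len("found: "):])
    return new


if __name__ == "__main__":
    final_answer, final_state = re_trac_style_search("toy question", toy_run, toy_compress)
    print(final_answer)
    print(final_state.to_prompt())
```

In this toy run the second trajectory skips the source that failed in the first, because the failure and the revised plan were recorded in the state. That is the kind of cross-trajectory reuse the abstract credits for the falling tool-call and token counts across rounds.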

3 Citations

0 Influential

18 Altmetric

93.0 Score
