2603.15594v1 Mar 16, 2026 cs.AI

OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Yuzhu Cai (Citations: 140, h-index: 6)

Rui Ye (Citations: 67, h-index: 5)

Shuo Tang (Citations: 277, h-index: 9)

Xinyu Zhu (Citations: 48, h-index: 3)

Yijun Lu (Citations: 23, h-index: 2)

Siheng Chen (Citations: 244, h-index: 9)

Yuwen Du (Citations: 70, h-index: 4)

Original Abstract

Deep search capabilities have become an indispensable competency for frontier Large Language Model (LLM) agents, yet the development of high-performance search agents remains dominated by industrial giants due to a lack of transparent, high-quality training data. This persistent data scarcity has fundamentally hindered the broader research community's ability to develop and innovate in this domain. To bridge this gap, we introduce OpenSeeker, the first fully open-source search agent (i.e., both model and data) that achieves frontier-level performance through two core technical innovations: (1) Fact-grounded, scalable, and controllable QA synthesis, which reverse-engineers the web graph via topological expansion and entity obfuscation to generate complex, multi-hop reasoning tasks with controllable coverage and complexity. (2) Denoised trajectory synthesis, which employs a retrospective summarization mechanism to denoise the trajectory, thereby prompting the teacher LLMs to generate high-quality actions. Experimental results demonstrate that OpenSeeker, trained in a single training run on only 11.7k synthesized samples, achieves state-of-the-art performance across multiple benchmarks, including BrowseComp, BrowseComp-ZH, xbench-DeepSearch, and WideSearch. Notably, trained with simple SFT, OpenSeeker significantly outperforms the second-best fully open-source agent, DeepDive (e.g., 29.5% vs. 15.3% on BrowseComp), and even surpasses industrial competitors such as Tongyi DeepResearch (trained via extensive continual pre-training, SFT, and RL) on BrowseComp-ZH (48.4% vs. 46.7%). We fully open-source the complete training dataset and the model weights to democratize frontier search agent research and foster a more transparent, collaborative ecosystem.
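
To make the first innovation more concrete, the following is a minimal toy sketch of what "topological expansion" followed by "entity obfuscation" could look like. It is not the authors' pipeline: the abstract does not specify the expansion strategy or the obfuscation procedure, so the breadth-first traversal, the plain string replacement, the function names `expand_subgraph` and `obfuscate`, and all of the toy data are assumptions made purely for illustration.

```python
from collections import deque


def expand_subgraph(graph, seed, max_nodes):
    """Collect up to max_nodes pages reachable from seed, breadth-first.

    Assumption: BFS stands in for "topological expansion"; the paper's
    actual strategy is not described in the abstract.
    """
    visited = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(visited) < max_nodes:
        page = queue.popleft()
        for neighbor in graph.get(page, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append(neighbor)
            if len(visited) >= max_nodes:
                break
    return visited


def obfuscate(question, descriptions):
    """Replace explicit entity names with vaguer descriptions.

    Assumption: simple string replacement stands in for whatever
    obfuscation the authors apply.
    """
    for entity, vague in descriptions.items():
        question = question.replace(entity, vague)
    return question


# Hypothetical toy link graph: page name -> linked page names.
toy_graph = {
    "Institute A": ["Scientist B", "Prize C"],
    "Scientist B": ["City D"],
    "Prize C": [],
    "City D": [],
}

# max_nodes acts as a crude "coverage" knob for the generated task.
pages = expand_subgraph(toy_graph, "Institute A", max_nodes=3)

# A question written against the collected pages, then obfuscated so that
# answering it takes several lookups instead of a single keyword search.
question = "In which city was Scientist B, a founder of Institute A, born?"
hidden = obfuscate(
    question,
    {"Scientist B": "a certain researcher", "Institute A": "a research institute"},
)
print(pages)
print(hidden)
```

In this sketch, raising `max_nodes` widens the set of pages a question can draw on, and adding more entries to the replacement mapping hides more of the direct search keys, which is one simple way to picture the "controllable coverage and complexity" the abstract mentions.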

3 Citations

1 Influential

4.5 Altmetric

27.5 Score
