2604.27221v1 Apr 29, 2026 cs.AI

Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

Yihang Chen

Citations: 215

h-index: 3

Huichi Zhou

Citations: 16

h-index: 3

Zhiyuan He

Citations: 47

h-index: 3

Yuxuan Huang

Citations: 174

h-index: 3

Weilin Luo

Citations: 24

h-index: 3

Meng Fang

Citations: 228

h-index: 3

Jun Wang

Citations: 154

h-index: 5

Yuxiang Chen

Citations: 26

h-index: 1

Kaylen Lee

Citations: 0

h-index: 0

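Agentic web search increasingly faces two distinct demands: deep reasoning about a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle with both. Breadth-oriented tasks require schema-aligned outputs with wide coverage and cross-entity consistency, whereas depth-oriented tasks require coherent reasoning over long, branching search trajectories. This paper introduces **Web2BigTable**, a multi-agent framework for web-to-table search that supports both regimes. Web2BigTable adopts a bi-level architecture: an upper-level orchestrator decomposes the task into sub-problems, and lower-level worker agents solve those sub-problems in parallel. Through a closed-loop run-verify-reflect process, the framework uses persistent, human-readable external memory to improve both decomposition and execution over time, and each individual agent receives self-evolving updates. During execution, the worker agents share partial findings through a shared workspace, which lets them reduce redundant exploration, reconcile conflicting evidence, and adapt to newly emerging coverage gaps. Web2BigTable sets a new state of the art on WideSearch, with an Avg@4 Success Rate of **38.50** (7.5 times the second-best result of 5.10), a Row F1 of **63.53** (+25.03 points over the second best), and an Item F1 of **80.12** (+14.42 points over the second best). It also generalises to depth-oriented search, reaching 73.0 accuracy on XBench-DeepSearch. Code and related materials are available at https://github.com/web2bigtable/web2bigtable.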
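The WideSearch figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the "implied second-best" values are derived here by subtraction and are not reported directly in the source.

```python
# Figures quoted in the abstract (WideSearch, Avg@4).
success_rate = 38.50
second_best_success = 5.10
row_f1, row_f1_gain = 63.53, 25.03
item_f1, item_f1_gain = 80.12, 14.42

# Ratio behind the "7.5 times the second best" claim.
ratio = success_rate / second_best_success

# Second-best F1 scores implied by the stated gains (derived, not quoted).
implied_second_row_f1 = round(row_f1 - row_f1_gain, 2)
implied_second_item_f1 = round(item_f1 - item_f1_gain, 2)

print(f"success-rate ratio: {ratio:.2f}x")
print(f"implied second-best Row F1: {implied_second_row_f1:.2f}")
print(f"implied second-best Item F1: {implied_second_item_f1:.2f}")
```

The ratio comes out slightly above 7.5, which is consistent with the rounded figure in the abstract.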

Original Abstract

Agentic web search increasingly faces two distinct demands: deep reasoning over a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle on both fronts. Breadth-oriented tasks demand schema-aligned outputs with wide coverage and cross-entity consistency, while depth-oriented tasks require coherent reasoning over long, branching search trajectories. We introduce **Web2BigTable**, a multi-agent framework for web-to-table search that supports both regimes. Web2BigTable adopts a bi-level architecture in which an upper-level orchestrator decomposes the task into sub-problems and lower-level worker agents solve them in parallel. Through a closed-loop run-verify-reflect process, the framework jointly improves decomposition and execution over time via persistent, human-readable external memory, with self-evolving updates to each individual agent. During execution, workers coordinate through a shared workspace that makes partial findings visible, allowing them to reduce redundant exploration, reconcile conflicting evidence, and adapt to emerging coverage gaps. Web2BigTable sets a new state of the art on WideSearch, reaching an Avg@4 Success Rate of **38.50** (7.5× the second best at 5.10), Row F1 of **63.53** (+25.03 over the second best), and Item F1 of **80.12** (+14.42 over the second best). It also generalises to depth-oriented search on XBench-DeepSearch, achieving 73.0 accuracy. Code is available at https://github.com/web2bigtable/web2bigtable.
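The bi-level design and the run-verify-reflect loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `FAKE_WEB`, `SharedWorkspace`, `decompose`, `worker`, and `run` are hypothetical names invented for illustration, the "web" is an in-memory dictionary, and the external memory is a plain list of notes.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Stand-in for the web: a hypothetical lookup table, not a real search API.
FAKE_WEB = {
    "Alpha": {"founded": 1999},
    "Beta": {"founded": 2005},
    "Gamma": {"founded": 2012},
    "Delta": {"founded": 2020},
}


class SharedWorkspace:
    """Partial findings visible to every worker (hypothetical design)."""

    def __init__(self):
        self._rows = {}
        self._lock = Lock()

    def has(self, entity):
        with self._lock:
            return entity in self._rows

    def publish(self, entity, row):
        with self._lock:
            self._rows[entity] = row

    def table(self):
        with self._lock:
            return dict(self._rows)


def decompose(entities, n_workers):
    """Upper level: split the task into sub-problems (round-robin here)."""
    return [entities[i::n_workers] for i in range(n_workers)]


def worker(sub_problem, workspace, flaky):
    """Lower level: fill rows, skipping entities already in the workspace."""
    for entity in sub_problem:
        if workspace.has(entity):
            continue  # avoid redundant exploration
        if entity in flaky:
            continue  # simulate a failed lookup
        workspace.publish(entity, FAKE_WEB[entity])


def run(entities, n_workers=2, max_rounds=3):
    workspace = SharedWorkspace()
    memory = []        # human-readable notes kept across rounds
    flaky = {"Gamma"}  # fails in round 1 only, to exercise the loop
    pending = list(entities)
    for round_no in range(1, max_rounds + 1):
        # Run: workers solve their sub-problems in parallel.
        subs = decompose(pending, n_workers)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            list(pool.map(lambda s: worker(s, workspace, flaky), subs))
        # Verify: check coverage against the requested entities.
        found = workspace.table()
        missing = [e for e in entities if e not in found]
        if not missing:
            break
        # Reflect: record the gap and retry only what is missing.
        memory.append(f"round {round_no}: missing {missing}; retry them")
        flaky = set()
        pending = missing
    return workspace.table(), memory


table, memory = run(list(FAKE_WEB))
print(table)
print(memory)
```

In this toy run the first round leaves one entity unfilled, the verify step detects the gap, and the second round targets only that entity. A real system would replace the dictionary lookup with LLM-driven search and would use the memory notes to change how later tasks are decomposed.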
