2603.29085v1 Mar 30, 2026 cs.AI

PAR$^2$-RAG: Planned Active Retrieval and Reasoning for Multi-Hop Question Answering

Xingyu Li (Citations: 7, h-index: 1)

Rong Wang (Citations: 1, h-index: 1)

Yuying Wang (Citations: 2, h-index: 1)

Meng-Hao Guo (Citations: 151, h-index: 3)

Chenyang Li (Citations: 7, h-index: 1)

Tao Sheng (Citations: 12, h-index: 2)

Sujith Ravi (Citations: 16, h-index: 2)

Dan Roth (Citations: 13, h-index: 2)

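Large language models (LLMs) remain brittle on multi-hop question answering (MHQA), which requires retrieving evidence and reasoning over it across several steps. Iterative retrieval systems can get stuck on a low-recall trajectory early on and amplify errors, while planning-only approaches can produce a static query set that cannot adapt when intermediate evidence changes. This paper proposes Planned Active Retrieval and Reasoning RAG (PAR$^2$-RAG), a two-stage framework that separates *coverage* from *commitment*. PAR$^2$-RAG first builds a high-recall evidence base through broad exploration, then performs depth-first exploration in an iterative loop while controlling for evidence sufficiency. Across four MHQA benchmarks, PAR$^2$-RAG consistently outperforms existing state-of-the-art models, achieving up to 23.5% higher accuracy than IRCoT and up to 10.5% better retrieval performance in NDCG.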

Original Abstract

Large language models (LLMs) remain brittle on multi-hop question answering (MHQA), where answering requires combining evidence across documents through retrieval and reasoning. Iterative retrieval systems can fail by locking onto an early low-recall trajectory and amplifying downstream errors, while planning-only approaches may produce static query sets that cannot adapt when intermediate evidence changes. We propose Planned Active Retrieval and Reasoning RAG (PAR$^2$-RAG), a two-stage framework that separates *coverage* from *commitment*. PAR$^2$-RAG first performs breadth-first anchoring to build a high-recall evidence frontier, then applies depth-first refinement with evidence sufficiency control in an iterative loop. Across four MHQA benchmarks, PAR$^2$-RAG consistently outperforms existing state-of-the-art baselines. Compared with IRCoT, PAR$^2$-RAG achieves up to 23.5% higher accuracy, with retrieval gains of up to 10.5% in NDCG.
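The two-stage control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how planning, retrieval, or the sufficiency check work, so every function name, the toy corpus, the word-overlap retriever, and the cue-word sufficiency rule below are assumptions made purely for illustration. In a real system the planner and the sufficiency check would presumably be LLM calls.

```python
from typing import Callable

# Toy corpus: all documents and entities are invented for illustration.
CORPUS = {
    "d1": "alice example directed blue harbor",
    "d2": "blue harbor was filmed in northport",
    "d3": "northport is in the region westmark",
    "d4": "alice example was born in 1970",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Hypothetical retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    scored = [(len(q & set(text.split())), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:k]]

def plan(question: str) -> list[str]:
    """Hypothetical planner: a fixed set of sub-queries written up front."""
    return ["film directed by alice example", "the region of the filming town"]

# Toy stand-in for an evidence sufficiency judgment (an assumption, not the paper's rule).
REQUIRED_CUES = ("directed", "filmed", "region")

def missing_cues(evidence: list[str]) -> list[str]:
    tokens = {tok for doc_id in evidence for tok in CORPUS[doc_id].split()}
    return [cue for cue in REQUIRED_CUES if cue not in tokens]

def is_sufficient(question: str, evidence: list[str]) -> bool:
    return not missing_cues(evidence)

def next_query(question: str, evidence: list[str]) -> str:
    """Hypothetical follow-up query conditioned on what the evidence still lacks."""
    return " ".join(missing_cues(evidence))

def par2_rag_sketch(
    question: str,
    plan: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
    is_sufficient: Callable[[str, list[str]], bool],
    next_query: Callable[[str, list[str]], str],
    max_steps: int = 3,
) -> tuple[list[str], int]:
    # Stage 1 (coverage): run every planned sub-query to build a broad evidence set.
    evidence: list[str] = []
    for sub_query in plan(question):
        for doc_id in retrieve(sub_query):
            if doc_id not in evidence:
                evidence.append(doc_id)

    # Stage 2 (commitment): refine iteratively until the evidence is judged
    # sufficient, no new documents arrive, or the step budget runs out.
    steps = 0
    while steps < max_steps and not is_sufficient(question, evidence):
        new_docs = [d for d in retrieve(next_query(question, evidence)) if d not in evidence]
        if not new_docs:
            break
        evidence.extend(new_docs)
        steps += 1
    return evidence, steps

question = "which region contains the town where the film directed by alice example was shot"
evidence, steps = par2_rag_sketch(question, plan, retrieve, is_sufficient, next_query)
print(evidence, steps)
```

In this toy run the planned sub-queries alone miss the bridging document `d2`, because the static plan does not know which town the film was shot in. The refinement loop fetches it in one step once the sufficiency check reports the gap, which is the kind of adaptation the abstract attributes to the second stage.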
