2601.04764v1 Jan 08, 2026 cs.AI

Orion-RAG: Path-Aligned Hybrid Retrieval for Graphless Data

Zhen Chen

Weihao Xie

Peilin Chen

Shiqi Wang

Jianping Wang

Original Abstract

Retrieval-Augmented Generation (RAG) has proven effective for knowledge synthesis, yet it encounters significant challenges in practical scenarios where data is inherently discrete and fragmented. In most environments, information is distributed across isolated files like reports and logs that lack explicit links. Standard search engines process files independently, ignoring the connections between them. Furthermore, manually building Knowledge Graphs is impractical for such vast data. To bridge this gap, we present Orion-RAG. Our core insight is simple yet effective: we do not need heavy algorithms to organize this data. Instead, we use a low-complexity strategy to extract lightweight paths that naturally link related concepts. We demonstrate that this streamlined approach suffices to transform fragmented documents into semi-structured data, enabling the system to link information across different files effectively. Extensive experiments demonstrate that Orion-RAG consistently outperforms mainstream frameworks across diverse domains, supporting real-time updates and explicit Human-in-the-Loop verification with high cost-efficiency. Experiments on FinanceBench demonstrate superior precision with a 25.2% relative improvement over strong baselines.
