2603.09341v1 Mar 10, 2026 cs.CL

TaSR-RAG: Taxonomy-guided Structured Reasoning for Retrieval-Augmented Generation

Jiawei Han

Citations: 56

h-index: 4

Jiashuo Sun

DAMO Academy, Alibaba Group

Citations: 997

h-index: 10

Yixuan Xie

Citations: 5

h-index: 1

Jimeng Shi

Citations: 247

h-index: 8

Shaowen Wang

Citations: 43

h-index: 3

Abstract

Retrieval-Augmented Generation (RAG) helps large language models (LLMs) answer knowledge-intensive and time-sensitive questions by conditioning generation on external evidence. However, most RAG systems still retrieve unstructured chunks and rely on one-shot generation, which often yields redundant context, low information density, and brittle multi-hop reasoning. While structured RAG pipelines can improve grounding, they typically require costly and error-prone graph construction or impose rigid entity-centric structures that do not align with the query's reasoning chain. We propose TaSR-RAG, a taxonomy-guided structured reasoning framework for evidence selection. We represent both queries and documents as relational triples, and constrain entity semantics with a lightweight two-level taxonomy to balance generalization and precision. Given a complex question, TaSR-RAG decomposes it into an ordered sequence of triple sub-queries with explicit latent variables, then performs step-wise evidence selection via hybrid triple matching that combines semantic similarity over raw triples with structural consistency over typed triples. By maintaining an explicit entity binding table across steps, TaSR-RAG resolves intermediate variables and reduces entity conflation without explicit graph construction or exhaustive search. Experiments on multiple multi-hop question answering benchmarks show that TaSR-RAG consistently outperforms strong RAG and structured-RAG baselines by up to 14%, while producing clearer evidence attribution and more faithful reasoning traces.
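
The step-wise selection described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the `Triple` class, the `Coarse.Fine` type labels, the token-overlap similarity (a stand-in for whatever semantic similarity the paper uses), the half credit for a coarse-only type match, the weight `alpha`, and the toy data are all assumptions made for illustration. It only shows how a hybrid score over raw and typed triples might be combined with an entity binding table that carries variables such as `?x` from one sub-query to the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A relational triple; types use a hypothetical two-level 'Coarse.Fine' label."""
    head: str
    relation: str
    tail: str
    head_type: str
    tail_type: str


def tokens(text):
    return set(text.lower().split())


def semantic_sim(q, d):
    """Jaccard token overlap over raw triples (stand-in for embedding similarity).
    Unbound variables (strings starting with '?') are ignored on the query side."""
    q_tok = set()
    for part in (q.head, q.relation, q.tail):
        if not part.startswith("?"):
            q_tok |= tokens(part)
    d_tok = tokens(d.head) | tokens(d.relation) | tokens(d.tail)
    union = q_tok | d_tok
    return len(q_tok & d_tok) / len(union) if union else 0.0


def slot_match(q_type, d_type):
    """Assumed scoring: 1.0 for an exact type match, 0.5 if only the coarse level matches."""
    if q_type == d_type:
        return 1.0
    if q_type.split(".")[0] == d_type.split(".")[0]:
        return 0.5
    return 0.0


def type_consistency(q, d):
    return (slot_match(q.head_type, d.head_type) + slot_match(q.tail_type, d.tail_type)) / 2


def hybrid_score(q, d, alpha=0.5):
    """Weighted mix of raw-triple similarity and typed-triple consistency."""
    return alpha * semantic_sim(q, d) + (1 - alpha) * type_consistency(q, d)


def select_evidence(sub_queries, doc_triples, alpha=0.5):
    """Process sub-queries in order, filling a binding table as variables resolve."""
    bindings, evidence = {}, []
    for q in sub_queries:
        # Substitute variables resolved in earlier steps.
        grounded = Triple(bindings.get(q.head, q.head), q.relation,
                          bindings.get(q.tail, q.tail), q.head_type, q.tail_type)
        best = max(doc_triples, key=lambda d: hybrid_score(grounded, d, alpha))
        evidence.append(best)
        # Bind any variable that is still open to the matched entity.
        if grounded.head.startswith("?"):
            bindings[grounded.head] = best.head
        if grounded.tail.startswith("?"):
            bindings[grounded.tail] = best.tail
    return bindings, evidence


# Toy example: "Where was the director of Inception born?"
doc_triples = [
    Triple("Inception", "directed by", "Christopher Nolan", "Work.Film", "Person.Director"),
    Triple("Christopher Nolan", "born in", "London", "Person.Director", "Location.City"),
    Triple("Inception", "released in", "2010", "Work.Film", "Time.Year"),
    Triple("Hans Zimmer", "born in", "Frankfurt", "Person.Composer", "Location.City"),
]
sub_queries = [
    Triple("Inception", "directed by", "?x", "Work.Film", "Person.Director"),
    Triple("?x", "born in", "?y", "Person.Director", "Location.City"),
]
bindings, evidence = select_evidence(sub_queries, doc_triples)
print(bindings)
```

In this toy run the first sub-query binds `?x`, and the second sub-query is matched only after `?x` has been substituted, which is why the "born in" triple about a different person scores lower: it shares the relation but not the bound entity or the fine-grained head type. A real system would replace the token overlap with learned embeddings and would likely keep several candidates per step rather than a single best match.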
