2604.12487v1 Apr 14, 2026 cs.CL

KG-Reasoner: A Reinforced Model for End-to-End Multi-Hop Knowledge Graph Reasoning

Abstract

Large Language Models (LLMs) exhibit strong abilities in natural language understanding and generation, yet they struggle with knowledge-intensive reasoning. Structured Knowledge Graphs (KGs) provide an effective form of external knowledge representation and have been widely used to improve performance on classical Knowledge Base Question Answering (KBQA) tasks. However, performing precise multi-hop reasoning over KGs for complex queries remains highly challenging. Most existing approaches decompose the reasoning process into a sequence of isolated steps executed through a fixed pipeline. While effective to some extent, such designs constrain reasoning flexibility and fragment the overall decision process, often leading to incoherence and the loss of critical intermediate information from earlier steps. In this paper, we introduce KG-Reasoner, an end-to-end framework that integrates multi-step reasoning into a unified "thinking" phase of a reasoning LLM. Through Reinforcement Learning (RL), the LLM is trained to internalize the KG traversal process, enabling it to dynamically explore reasoning paths and backtrack when necessary. Experiments on eight multi-hop and knowledge-intensive reasoning benchmarks show that KG-Reasoner achieves competitive or superior performance compared with state-of-the-art methods. Code is available at https://github.com/Wangshuaiia/KG-Reasoner.
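
To make "explore reasoning paths and backtrack when necessary" concrete, here is a minimal sketch of multi-hop traversal with backtracking over a toy graph. It is a hand-written depth-first search, not the authors' method: in KG-Reasoner the traversal is learned by the LLM through RL, whereas here it is hard-coded. The graph contents, the `TOY_KG` name, and the `find_path` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy knowledge graph: head entity -> list of (relation, tail entity).
Triple = Tuple[str, str, str]
ToyKG = Dict[str, List[Tuple[str, str]]]

TOY_KG: ToyKG = {
    # Two candidate edges for the same relation; the first one is a dead end.
    "Alice": [("lived_in", "Atlantis"), ("lived_in", "Oslo")],
    "Oslo": [("capital_of", "Norway")],
}


def find_path(kg: ToyKG, start: str, relations: List[str]) -> Optional[List[Triple]]:
    """Follow `relations` in order from `start`; return the triples used, or None.

    This is plain depth-first search: when a chosen edge cannot be extended
    to satisfy the remaining relations, the loop moves on to the next
    candidate edge, which is the backtracking step.
    """
    if not relations:
        return []  # every hop has been satisfied
    wanted, rest = relations[0], relations[1:]
    for relation, tail in kg.get(start, []):
        if relation != wanted:
            continue
        tail_path = find_path(kg, tail, rest)
        if tail_path is not None:
            return [(start, relation, tail)] + tail_path
        # Dead end below `tail`: fall through and try the next edge.
    return None


if __name__ == "__main__":
    # Two-hop question: "Which country is the place Alice lived in the capital of?"
    print(find_path(TOY_KG, "Alice", ["lived_in", "capital_of"]))
```

In this toy case the search first follows the edge to "Atlantis", finds no `capital_of` edge there, and returns to "Alice" to try "Oslo" instead. A fixed pipeline that committed to the first retrieved edge would have stopped at the dead end.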
