2603.21440v1 Mar 22, 2026 cs.CL

KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning

Abstract

Large Language Models (LLMs) demonstrate impressive natural language capabilities but often struggle with knowledge-intensive reasoning tasks. Knowledge Base Question Answering (KBQA), which leverages structured Knowledge Graphs (KGs), exemplifies this challenge because it requires accurate multi-hop reasoning. Existing approaches typically perform sequential reasoning steps guided by predefined pipelines, which restricts flexibility and causes error cascades because each step reasons in isolation. To address these limitations, we propose KG-Hopper, a novel Reinforcement Learning (RL) framework that empowers compact open LLMs to perform integrated multi-hop KG reasoning within a single inference round. Rather than reasoning step by step, we train a Reasoning LLM that embeds the entire KG traversal and decision process into a unified "thinking" stage, enabling global reasoning over cross-step dependencies and dynamic path exploration with backtracking. Experimental results on eight KG reasoning benchmarks show that KG-Hopper, based on a 7B-parameter LLM, consistently outperforms larger multi-step systems (up to 70B) and achieves competitive performance with proprietary models such as GPT-3.5-Turbo and GPT-4o-mini, while remaining compact, open, and data-efficient. The code is publicly available at: https://github.com/Wangshuaiia/KG-Hopper.
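
To make the phrase "multi-hop path exploration with backtracking" concrete, here is a minimal sketch on a toy graph. It is not KG-Hopper's method: in the paper, a trained LLM decides the traversal inside its thinking stage, whereas this sketch uses a plain depth-first search over a fixed relation sequence. The graph contents, entity names, and the `find_path` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy knowledge graph: head entity -> list of (relation, tail entity).
ToyKG = Dict[str, List[Tuple[str, str]]]
Triple = Tuple[str, str, str]

TOY_KG: ToyKG = {
    "film_x": [("starring", "actor_a"), ("starring", "actor_b")],
    "actor_a": [("born_in", "city_p")],   # city_p has no outgoing edges: a dead end
    "actor_b": [("born_in", "city_q")],
    "city_q": [("located_in", "country_r")],
}


def find_path(kg: ToyKG, start: str, relations: List[str]) -> Optional[List[Triple]]:
    """Depth-first search for a path from `start` that follows `relations` in order.

    Returns the list of (head, relation, tail) triples on the first path found,
    or None if no such path exists.
    """
    if not relations:
        return []  # every required hop has been taken
    wanted, rest = relations[0], relations[1:]
    for relation, tail in kg.get(start, []):
        if relation != wanted:
            continue
        sub_path = find_path(kg, tail, rest)
        if sub_path is not None:
            return [(start, relation, tail)] + sub_path
        # Dead end below `tail`: backtrack and try the next matching edge.
    return None


if __name__ == "__main__":
    # Three-hop question: "In which country was a star of film_x born?"
    for triple in find_path(TOY_KG, "film_x", ["starring", "born_in", "located_in"]) or []:
        print(triple)
```

On this toy graph the search first follows `actor_a`, reaches `city_p`, finds no `located_in` edge, and backtracks to `actor_b` before completing the three-hop path. A pipeline that committed to `actor_a` after the first hop would fail here, which is the kind of error cascade the abstract attributes to isolated step-by-step reasoning.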
