2601.09028v2 Jan 13, 2026 cs.CL

OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG

Fengran Mo, University of Montreal (Citations: 985, h-index: 18)

Jian-Yun Nie (Citations: 453, h-index: 11)

Zhan Su (Citations: 42, h-index: 4)

Yuchen Hui (Citations: 1,087, h-index: 4)

Jinghan Zhang (Citations: 10, h-index: 1)

J. Sun (Citations: 21, h-index: 2)

Zheyuan Liu (Citations: 33, h-index: 3)

Chao Zhang (Citations: 487, h-index: 9)

Tetsuya Sakai (Citations: 61, h-index: 3)

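Large language models (LLMs) have achieved strong performance on a range of downstream tasks, including LLM-based retrieval-augmented generation (RAG). The quality of the generated content depends heavily on the usefulness of the retrieved information and on how well the LLM's internal information processing mechanism incorporates it into answer generation. Retrieved information is usually assumed to be relevant to the question, but its relevance and usefulness can vary with the question and the document collection, so relevance should be taken into account when generating an answer. This paper proposes OpenDecoder, a new approach that uses explicit evaluations of the retrieved information as quality indicator features during generation. The goal is a RAG model that is more robust to contexts with varying levels of noise. Three types of explicit evaluation information are considered: relevance scores, ranking scores, and QPP (query performance prediction) scores. Experiments on five benchmark datasets show that OpenDecoder is more effective and more robust than a range of baseline methods. The paradigm can also be applied to the post-training of LLMs and combined with various types of external indicators.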

Original Abstract

The development of large language models (LLMs) has achieved superior performance in a range of downstream tasks, including LLM-based retrieval-augmented generation (RAG). The quality of generated content heavily relies on the usefulness of the retrieved information and the capacity of LLMs' internal information processing mechanism to incorporate it in answer generation. It is generally assumed that the retrieved information is relevant to the question. However, the retrieved information may have a variable degree of relevance and usefulness, depending on the question and the document collection. It is important to take into account the relevance of the retrieved information in answer generation. In this paper, we propose OpenDecoder, a new approach that leverages explicit evaluation of the retrieved information as quality indicator features for generation. We aim to build a RAG model that is more robust to varying levels of noisy context. Three types of explicit evaluation information are considered: relevance score, ranking score, and QPP (query performance prediction) score. The experimental results on five benchmark datasets demonstrate the effectiveness and better robustness of OpenDecoder by outperforming various baseline methods. Importantly, this paradigm is flexible to be integrated with the post-training of LLMs for any purposes and incorporated with any type of external indicators.
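The abstract does not say how the quality indicators enter the decoding process, so the following is only a toy sketch of one way such a signal could be used, not the authors' method. It assumes each retrieved document has a single scalar score (it could come from any of the three indicator types), normalizes the scores, and adds the log of each document's score to the attention logits of that document's tokens. The function names `normalize_scores` and `quality_biased_attention`, the min-max normalization, and the log-bias form are all assumptions made for illustration.

```python
import math


def normalize_scores(scores):
    """Min-max normalize raw per-document scores into [0, 1] (assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All documents look equally good; treat them all as fully trusted.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def quality_biased_attention(logits, token_doc_ids, quality, eps=1e-6):
    """Softmax over context-token logits after adding log(quality) per token.

    logits:        raw attention logits, one per context token
    token_doc_ids: index of the source document of each context token
    quality:       normalized quality score per document, in [0, 1]
    eps:           keeps log() finite when a document's quality is 0
    """
    biased = [
        logit + math.log(quality[doc] + eps)
        for logit, doc in zip(logits, token_doc_ids)
    ]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: three documents, one context token from each.
quality = normalize_scores([0.9, 0.1, 0.5])
attention = quality_biased_attention([0.0, 0.0, 0.0], [0, 1, 2], quality)
```

With equal raw logits, the token from the highest-scored document receives the most attention mass and the token from the lowest-scored document receives almost none, which is the kind of down-weighting of noisy context the abstract aims for.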

10 Citations

0 Influential

9 Altmetric

55.0 Score
