2601.05588v4 Jan 09, 2026 cs.IR

Autoregressive Ranking: Bridging the Gap Between Dual and Cross Encoders

Benjamin Rozonoyer (Citations: 22; h-index: 2)

Chong You (Citations: 3,263; h-index: 4)

Michael Boratko (Citations: 1,027; h-index: 14)

Himanshu Jain (Citations: 12; h-index: 1)

Nilesh Gupta (Citations: 55; h-index: 4)

Srinadh Bhojanapalli (Citations: 11,914; h-index: 32)

Andrew McCallum (Citations: 10; h-index: 1)

Felix X. Yu (Citations: 181; h-index: 4)

Original Abstract

The success of Large Language Models (LLMs) has motivated a shift toward generative approaches to retrieval and ranking, aiming to supersede classical Dual Encoders (DEs) and Cross Encoders (CEs). A prominent paradigm is pointwise Autoregressive Ranking (ARR), where an LLM generates document identifiers (docIDs) token-by-token to enable ranking via beam search. ARR offers the promise of superior expressivity compared to DEs while avoiding the prohibitive computational cost of CEs. However, a formal theoretical foundation for this expressive power has been missing. Moreover, the standard next-token prediction loss is rank-agnostic and inappropriate for finetuning an LLM for ranking tasks. In this paper, we first prove that the expressive capacity of ARR is strictly superior to DEs. While a DE requires an embedding dimension that grows linearly with corpus size to achieve arbitrary rankings, ARR can solve it with a constant hidden dimension. We then propose SToICaL (Simple Token-Item Calibrated Loss), a generalized rank-aware training loss for LLM finetuning. By using item-level reweighting and prefix-tree marginalization, we distribute probability mass over valid docID tokens based on their ground-truth relevance. Experiments on WordNet and ESCI datasets verify that our loss suppresses invalid docID generations and significantly improves ranking metrics beyond top-1 retrieval.
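
The abstract does not give the exact form of SToICaL, so the following is only a minimal sketch of the general idea of prefix-tree marginalization with item-level weights, not the authors' loss. It assumes each docID is a tuple of tokens and each item carries a non-negative relevance weight; the function names `prefix_token_targets` and `soft_cross_entropy` are hypothetical. For every prefix in the docID trie, the target mass of a next token is the total relevance of the items below it, so tokens that lead to no valid docID receive no mass.

```python
import math
from collections import defaultdict


def prefix_token_targets(docids, relevance):
    """Build relevance-weighted next-token targets for every docID prefix.

    docids:    list of token tuples, one per item (hypothetical format).
    relevance: list of non-negative weights, aligned with docids.
    Returns {prefix tuple: {next token: probability}}.
    """
    # mass[prefix][token] = summed relevance of all items that extend
    # `prefix` with `token` (marginalization over the prefix tree).
    mass = defaultdict(lambda: defaultdict(float))
    for tokens, rel in zip(docids, relevance):
        for i, tok in enumerate(tokens):
            mass[tuple(tokens[:i])][tok] += rel

    targets = {}
    for prefix, tok_mass in mass.items():
        total = sum(tok_mass.values())
        if total > 0:  # skip prefixes that lead only to irrelevant items
            targets[prefix] = {t: m / total for t, m in tok_mass.items()}
    return targets


def soft_cross_entropy(target, predicted):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(p * math.log(predicted[t]) for t, p in target.items() if p > 0)


# Toy corpus: two relevant items share the prefix ("a",); one is irrelevant.
docids = [("a", "b"), ("a", "c"), ("d", "e")]
relevance = [3.0, 1.0, 0.0]
targets = prefix_token_targets(docids, relevance)
print(targets[()])      # mass at the root prefix
print(targets[("a",)])  # mass split between the two relevant items
```

In this toy setup, a model's predicted next-token distribution at each prefix could be scored against these targets with `soft_cross_entropy`, which spreads probability across all relevant items rather than rewarding only a single top-1 docID.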
