2603.22779v1 Mar 24, 2026 cs.IR

KARMA: Knowledge-Action Regularized Multimodal Alignment for Personalized Search at Taobao

Zhicong Sun

Dan Ou

Haihong Tang

Wenming Zhang

Yinwei Wei

Liren Yu

Zhixuan Zhang

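Large language models (LLMs) carry rich semantic knowledge, which makes them a natural choice for giving personalized search systems semantic generalization. In practice, however, we find that directly fine-tuning an LLM on industrial personalization tasks (e.g., next-item prediction) often yields suboptimal results. We attribute this to a critical Knowledge-Action Gap: an inherent conflict between preserving pre-trained semantic knowledge and aligning with specific personalized actions under discriminative objectives. Empirically, training objectives that focus only on actions induce semantic collapse, for example attention "sinks". This degradation severely weakens the LLM's ability to generalize, so the fine-tuned model does not improve the personalized search system. We propose KARMA (Knowledge-Action Regularized Multimodal Alignment), a unified framework that treats semantic reconstruction as a regularizer used only during training. KARMA optimizes a next-interest embedding for retrieval (Action) while preserving semantic decodability (Knowledge) through two complementary objectives. First, history-conditioned semantic generation anchors optimization to the LLM's native next-token distribution. Second, embedding-conditioned semantic reconstruction constrains the interest embedding to remain semantically recoverable. On the Taobao search system, KARMA mitigates semantic collapse (as shown by attention-sink analysis) and improves both action metrics and semantic fidelity. In ablations, semantic decodability yields up to +22.5 HR@200. With KARMA, we achieve +0.25 CTR AUC in ranking, +1.86 HR in pre-ranking, and +2.51 HR in recall. Deployed online with low inference overhead at the ranking stage, KARMA increases item clicks by +0.5%.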

Original Abstract

Large Language Models (LLMs) are equipped with profound semantic knowledge, making them a natural choice for injecting semantic generalization into personalized search systems. However, in practice we find that directly fine-tuning LLMs on industrial personalized tasks (e.g., next-item prediction) often yields suboptimal results. We attribute this bottleneck to a critical Knowledge-Action Gap: the inherent conflict between preserving pre-trained semantic knowledge and aligning with specific personalized actions through discriminative objectives. Empirically, action-only training objectives induce Semantic Collapse, which shows up for example as attention "sinks". This degradation severely cripples the LLM's generalization, so fine-tuning fails to bring improvements to personalized search systems. We propose KARMA (Knowledge-Action Regularized Multimodal Alignment), a unified framework that treats semantic reconstruction as a train-only regularizer. KARMA optimizes a next-interest embedding for retrieval (Action) while enforcing semantic decodability (Knowledge) through two complementary objectives: (i) history-conditioned semantic generation, which anchors optimization to the LLM's native next-token distribution, and (ii) embedding-conditioned semantic reconstruction, which constrains the interest embedding to remain semantically recoverable. On the Taobao search system, KARMA mitigates semantic collapse (as shown by attention-sink analysis) and improves both action metrics and semantic fidelity. In ablations, semantic decodability yields up to +22.5 HR@200. With KARMA, we achieve +0.25 CTR AUC in ranking, +1.86 HR in pre-ranking, and +2.51 HR in recall. Deployed online with low inference overhead at the ranking stage, KARMA drives a +0.5% increase in Item Click.
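
The abstract describes an action objective regularized by two semantic objectives but does not give the loss formulas. The following is a minimal sketch of how such a combination could look, not the authors' implementation. It assumes a softmax cross-entropy over dot-product scores for the retrieval (Action) term, mean next-token negative log-likelihood for both semantic (Knowledge) terms, and a weighted sum with hypothetical weights `lambda_gen` and `lambda_rec`. All function names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def action_loss(interest_emb, item_embs, target_idx):
    """Assumed Action term: softmax cross-entropy over dot-product scores
    between the next-interest embedding and candidate item embeddings."""
    scores = [sum(a * b for a, b in zip(interest_emb, item)) for item in item_embs]
    return -log_softmax(scores)[target_idx]


def token_nll(token_logits, target_tokens):
    """Assumed Knowledge term: mean next-token negative log-likelihood.
    Used for both the history-conditioned generation objective and the
    embedding-conditioned reconstruction objective in this sketch."""
    total = sum(-log_softmax(l)[t] for l, t in zip(token_logits, target_tokens))
    return total / len(target_tokens)


def combined_loss(l_action, l_gen, l_rec, lambda_gen=0.5, lambda_rec=0.5):
    """Hypothetical weighted sum; the weights are illustrative assumptions."""
    return l_action + lambda_gen * l_gen + lambda_rec * l_rec


# Toy values standing in for model outputs (all made up for illustration).
interest = [0.9, 0.1]
items = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
tokens = [0, 1]
gen_logits = [[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]]  # decoder conditioned on history
rec_logits = [[1.5, 0.2, 0.2], [0.2, 1.5, 0.2]]  # decoder conditioned on the embedding

l_action = action_loss(interest, items, target_idx=0)
l_gen = token_nll(gen_logits, tokens)
l_rec = token_nll(rec_logits, tokens)
total = combined_loss(l_action, l_gen, l_rec)
print(f"action={l_action:.4f} gen={l_gen:.4f} rec={l_rec:.4f} total={total:.4f}")
```

Because the semantic terms act only as training-time regularizers in this sketch, inference would use the interest embedding alone, which is consistent with the low inference overhead reported in the abstract.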
