2602.07333v1 Feb 07, 2026 cs.IR

High Fidelity Textual User Representation over Heterogeneous Sources via Reinforcement Learning

Jingwei Wu

Benjamin Le

Jianqiang Shen

Wenjing Zhang

Rajat Arora

Ye Tao

Ping Liu

Muchen Wu

Qianqi Shen

Fedor Borisyuk

Original Abstract

Effective personalization on large-scale job platforms requires modeling members based on heterogeneous textual sources, including profiles, professional data, and search activity logs. As recommender systems increasingly adopt Large Language Models (LLMs), creating unified, interpretable, and concise representations from heterogeneous sources becomes critical, especially for latency-sensitive online environments. In this work, we propose a novel Reinforcement Learning (RL) framework to synthesize a unified textual representation for each member. Our approach leverages implicit user engagement signals (e.g., clicks, applies) as the primary reward to distill salient information. Additionally, the framework is complemented by rule-based rewards that enforce formatting and length constraints. Extensive offline experiments across multiple products at LinkedIn, one of the world's largest job platforms, demonstrate significant improvements in key downstream business metrics. This work provides a practical, labeling-free, and scalable solution for constructing interpretable user representations that are directly compatible with LLM-based systems.
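
The abstract describes a reward with two parts: a primary engagement-based term and rule-based terms for formatting and length. The sketch below shows one way such a composite reward could be assembled. It is not the authors' implementation: the abstract does not state how the engagement term is computed or how the terms are weighted, so `engagement_score`, `RewardConfig`, the prefix rule, the word budget, and both weights are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # Every value here is an illustrative assumption, not taken from the paper.
    max_words: int = 120               # hypothetical length budget
    required_prefix: str = "Summary:"  # hypothetical formatting rule
    format_weight: float = 0.2         # hypothetical weight
    length_weight: float = 0.2         # hypothetical weight


def format_reward(text: str, cfg: RewardConfig) -> float:
    """Return 1.0 if the text follows the assumed formatting rule, else 0.0."""
    return 1.0 if text.strip().startswith(cfg.required_prefix) else 0.0


def length_reward(text: str, cfg: RewardConfig) -> float:
    """Return 1.0 within the word budget, decaying linearly to 0.0 at twice the budget."""
    n_words = len(text.split())
    if n_words <= cfg.max_words:
        return 1.0
    overflow = (n_words - cfg.max_words) / cfg.max_words
    return max(0.0, 1.0 - overflow)


def total_reward(text: str, engagement_score: float, cfg: RewardConfig) -> float:
    """Add the rule-based terms to an engagement-derived score.

    engagement_score stands in for whatever scalar is derived from
    clicks and applies; its computation is an assumption left to the caller.
    """
    return (
        engagement_score
        + cfg.format_weight * format_reward(text, cfg)
        + cfg.length_weight * length_reward(text, cfg)
    )
```

In an RL loop, a scalar like `total_reward(generated_text, engagement_score, cfg)` would be the signal used to score each generated member representation. Keeping the rule-based terms as small additive bonuses is one possible design choice that leaves the engagement term dominant, matching the abstract's description of it as the primary reward.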
