2605.05833v1 May 07, 2026 cs.AI

On the Role of Language Representations in Auto-Bidding: Findings and Implications

Ronghao Chen (Citations: 130, h-index: 6)

Huacan Wang (Citations: 82, h-index: 5)

Hanwen Du (Citations: 103, h-index: 5)

Guanyu Zhu (Citations: 70, h-index: 5)

Hongji Li (Citations: 18, h-index: 3)

Xinyu Fang (Citations: 40, h-index: 4)

Yongxin Ni (Citations: 8, h-index: 2)

Youhua Li (Citations: 222, h-index: 5)

Jining Luan (Citations: 0, h-index: 0)

Sibo Xu (Citations: 0, h-index: 0)

Ersheng Ni (Citations: 23, h-index: 2)

Jincheng Fang (Citations: 68, h-index: 4)

Yiqi Sun (Citations: 38, h-index: 3)

Xuan Lan (Citations: 26, h-index: 2)

Abstract

Auto-bidding is a crucial task in real-time advertising markets, where policies must optimize long-horizon value under delivery constraints (e.g., budget and CPA). Existing auto-bidding methods rely on compact numerical state representations. These can implicitly capture delivery dynamics, but they offer limited support for explicitly representing and controlling high-level intent, evolving feedback, and operator-style strategic guidance in real campaigns. Meanwhile, although Large Language Models (LLMs) offer a powerful way to encode semantic information, it remains unclear when LLMs help and how to integrate them without sacrificing numerical precision. Through systematic preliminary studies, we find that (1) LLM embeddings contain bidding-relevant cues yet cannot replace numerical features, and (2) gains emerge only with careful semantic-numeric integration rather than naive concatenation. Motivated by these findings, we propose SemBid, a novel auto-bidding framework that injects LLM-encoded semantics into offline bidding trajectories at the token level. SemBid introduces three semantic inputs: Task, History, and Strategy. It injects these semantics as tokens alongside numerical trajectory tokens and uses self-attention to integrate them, improving controllability and generalization across objectives. Across diverse scenarios and budget regimes, SemBid outperforms competitive baselines from offline RL and generative sequence modeling, with more consistent gains in overall performance, constraint satisfaction, and robustness. Our code is available at [https://github.com/AlanYu04/SemBid-KDD2026](https://github.com/AlanYu04/SemBid-KDD2026).
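To make the idea of token-level injection concrete, the sketch below places three semantic tokens (stand-ins for the Task, History, and Strategy embeddings) in the same sequence as the numerical trajectory tokens and mixes them with a single self-attention pass. It is only an illustration under stated assumptions, not the SemBid architecture: the embeddings and projection weights are random, the three numerical features are hypothetical, and the abstract does not specify the number of heads, any masking, or the training objective. All function and variable names are invented for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or LLM embeddings (assumption)."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def build_sequence(semantic_embs, numeric_steps, w_sem, w_num):
    """Project both modalities to the model width and place them in one sequence."""
    sem_tokens = matmul(semantic_embs, w_sem)   # one token per semantic input
    num_tokens = matmul(numeric_steps, w_num)   # one token per trajectory step
    return sem_tokens + num_tokens              # semantic tokens first, then numeric


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention without masking."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
D_SEM, D_NUM, D_MODEL = 8, 3, 4

# Stand-ins for LLM embeddings of the Task, History, and Strategy texts.
semantic_embs = random_matrix(3, D_SEM, rng)
N_SEM = len(semantic_embs)

# Hypothetical per-step numeric features:
# remaining budget fraction, observed conversion rate, previous bid scale.
numeric_steps = [
    [0.90, 0.10, 1.20],
    [0.70, 0.12, 1.10],
    [0.45, 0.15, 0.95],
    [0.20, 0.14, 0.90],
]

w_sem = random_matrix(D_SEM, D_MODEL, rng)
w_num = random_matrix(D_NUM, D_MODEL, rng)
w_q = random_matrix(D_MODEL, D_MODEL, rng)
w_k = random_matrix(D_MODEL, D_MODEL, rng)
w_v = random_matrix(D_MODEL, D_MODEL, rng)

tokens = build_sequence(semantic_embs, numeric_steps, w_sem, w_num)
mixed, weights = self_attention(tokens, w_q, w_k, w_v)

# Representations of the numeric steps after attending to the semantic tokens.
fused_numeric = mixed[N_SEM:]

# Share of each numeric token's attention that lands on the semantic tokens.
semantic_share = [sum(row[:N_SEM]) for row in weights[N_SEM:]]
print(len(tokens), len(fused_numeric), [round(s, 3) for s in semantic_share])
```

The contrast with naive concatenation is that the semantic vectors are not appended to each numerical feature vector. They stay separate tokens, so every trajectory step weighs Task, History, and Strategy through its own attention row while the numerical features keep their own projection.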
