2603.19710v1 Mar 20, 2026 cs.IR

AIGQ: An End-to-End Hybrid Generative Architecture for E-commerce Query Recommendation

Jingcao Xu (Citations: 174, h-index: 5)
Jianyu Zou (Citations: 18, h-index: 2)
Renkai Yang (Citations: 13, h-index: 1)
Zili Geng (Citations: 120, h-index: 2)
Qiang Liu (Citations: 4, h-index: 1)
Haihong Tang (Citations: 52, h-index: 4)

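Pre-search query recommendation (for example, HintQ on the Taobao homepage) plays an important role in capturing user intent and discovering demand. Existing methods, however, rely on ID-based matching and co-click heuristics, so they lack semantic depth, perform poorly in cold-start settings, and offer little serendipity. To address these problems, this work proposes AIGQ (AI-Generated Query architecture), the first end-to-end generative framework for the HintQ scenario. AIGQ rests on three core innovations covering the training paradigm, policy optimization, and deployment architecture. First, it introduces Interest-Aware List Supervised Fine-Tuning (IL-SFT), a list-level supervised learning approach that builds training samples through session-aware behavior aggregation and an interest-guided re-ranking strategy, so that nuanced user intent is modeled faithfully. Second, it designs Interest-aware List Group Relative Policy Optimization (IL-GRPO), a new policy gradient algorithm with a dual-component reward mechanism that jointly optimizes the relevance of individual queries and the properties of the whole list, further enhanced by a model-based reward from the online click-through rate (CTR) ranking model. Third, to meet strict real-time and low-latency requirements, it develops a hybrid offline-online architecture comprising AIGQ-Direct, which performs nearline personalized user-to-query generation, and AIGQ-Think, a reasoning-enhanced variant that produces trigger-to-query mappings to enrich interest diversity. Extensive offline evaluations and large-scale online A/B tests on Taobao show that AIGQ consistently delivers substantial improvements in key business metrics, including platform effectiveness and user engagement.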

Original Abstract

Pre-search query recommendation, widely known as HintQ on Taobao's homepage, plays a vital role in intent capture and demand discovery, yet traditional methods suffer from shallow semantics, poor cold-start performance and low serendipity due to reliance on ID-based matching and co-click heuristics. To overcome these challenges, we propose AIGQ (AI-Generated Query architecture), the first end-to-end generative framework for the HintQ scenario. AIGQ is built upon three core innovations spanning the training paradigm, policy optimization and deployment architecture. First, we propose Interest-Aware List Supervised Fine-Tuning (IL-SFT), a list-level supervised learning approach that constructs training samples through session-aware behavior aggregation and an interest-guided re-ranking strategy to faithfully model nuanced user intent. Accordingly, we design Interest-aware List Group Relative Policy Optimization (IL-GRPO), a novel policy gradient algorithm with a dual-component reward mechanism that jointly optimizes individual query relevance and global list properties, enhanced by a model-based reward from the online click-through rate (CTR) ranking model. To deploy under strict real-time and low-latency requirements, we further develop a hybrid offline-online architecture comprising AIGQ-Direct for nearline personalized user-to-query generation and AIGQ-Think, a reasoning-enhanced variant that produces trigger-to-query mappings to enrich interest diversity. Extensive offline evaluations and large-scale online A/B experiments on Taobao demonstrate that AIGQ consistently delivers substantial improvements in key business metrics across platform effectiveness and user engagement.
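
The abstract does not give the reward formulas, so the following is only an illustrative sketch of the general idea behind a dual-component reward combined with group-relative advantages. It assumes the per-query term is an average relevance score and the list-level term is the share of distinct queries; the names `list_reward`, `group_relative_advantages`, and the weight `alpha` are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

def list_reward(queries, relevance, alpha=0.5):
    """Hypothetical dual-component reward for one generated query list."""
    item_term = mean(relevance)                    # individual query relevance
    list_term = len(set(queries)) / len(queries)   # list-level property: share of distinct queries
    return alpha * item_term + (1 - alpha) * list_term

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled lists (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Three made-up candidate lists sampled for the same user context.
candidates = [
    (["running shoes", "trail shoes", "rain jacket"], [0.9, 0.7, 0.4]),
    (["running shoes", "running shoes", "running shoes"], [0.9, 0.9, 0.9]),
    (["phone case", "desk lamp", "rain jacket"], [0.2, 0.1, 0.3]),
]
rewards = [list_reward(qs, rel) for qs, rel in candidates]
advantages = group_relative_advantages(rewards)
best_index = max(range(len(advantages)), key=advantages.__getitem__)
```

In this toy setup the first list, which is both relevant and varied, receives the highest advantage, while the repetitive list is penalized by the list-level term despite its high per-query relevance. A model-based reward such as the CTR signal mentioned in the abstract would be an additional term that this sketch leaves out.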

Citations: 0 | Influential: 0 | Altmetric: 2.5 | Score: 12.5
