2603.01973v1 Mar 02, 2026 cs.CL

CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production

L. Guan (Citations: 1,125; h-index: 11)
Lin Chen (Citations: 314; h-index: 5)
Yixin Nie (Citations: 29; h-index: 2)
Shigan Chu (Citations: 0; h-index: 0)
Ajay Thampi (Citations: 44; h-index: 4)
Wancen Mu (Citations: 64; h-index: 1)
Nathan Shuster (Citations: 0; h-index: 0)
Ke Wang (Citations: 43; h-index: 1)
Jason Brewer (Citations: 31; h-index: 1)
Alexander McCauley (Citations: 813; h-index: 1)
Jason Weston (Citations: 981; h-index: 11)
Kevin Tang (Citations: 8; h-index: 2)
Zhe Zhou (Citations: 26; h-index: 2)
Zhongyao Ma (Citations: 5; h-index: 1)
Anchit Gupta (Citations: 1,758; h-index: 11)
Yipin Zhou (Citations: 78; h-index: 2)
Xiao Li (Citations: 43; h-index: 4)
R. Zeng (Citations: 23; h-index: 3)
Ge Zhou (Citations: 48; h-index: 3)
D. Hu (Citations: 1,809; h-index: 16)
Sem Park (Citations: 97; h-index: 5)
Naisheng Zhang (Citations: 1; h-index: 1)

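This report introduces CharacterFlywheel, an iterative process for improving the large language models (LLMs) behind production social chat applications on Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, the authors refined the model over 15 generations using internal and external real-user data. Across continuous deployments from July 2024 to April 2025, controlled 7-day A/B tests showed consistent engagement gains: 7 of the 8 newly deployed models improved on the baseline, with the best performers gaining up to 8.8% in engagement breadth and up to 19.4% in engagement depth. Steerability also improved substantially, with instruction following rising from 59.2% to 84.8% and instruction violations falling from 26.6% to 5.8%. The CharacterFlywheel process combines data curation, reward modeling to estimate and interpolate engagement metrics, supervised fine-tuning (SFT), reinforcement learning (RL), and offline and online evaluation so that each optimization step makes reliable progress. The report also covers overfitting prevention and strategies for operating in large-scale production. Together, these results contribute to scientific rigor and understanding of LLMs in social applications serving millions of users.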

Original Abstract

This report presents CharacterFlywheel, an iterative flywheel process for improving large language models (LLMs) in production social chat applications across Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, we refined models across 15 generations using data from both internal and external real-user traffic. Through continuous deployments from July 2024 to April 2025, we conducted controlled 7-day A/B tests showing consistent engagement improvements: 7 of 8 newly deployed models demonstrated positive lift over the baseline, with the strongest performers achieving up to 8.8% improvement in engagement breadth and 19.4% in engagement depth. We also observed substantial gains in steerability, with instruction following increasing from 59.2% to 84.8% and instruction violations decreasing from 26.6% to 5.8%. We detail the CharacterFlywheel process, which integrates data curation, reward modeling to estimate and interpolate the landscape of engagement metrics, supervised fine-tuning (SFT), reinforcement learning (RL), and both offline and online evaluation to ensure reliable progress at each optimization step. We also discuss our methods for overfitting prevention and navigating production dynamics at scale. These contributions advance the scientific rigor and understanding of LLMs in social applications serving millions of users.
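
The steerability figures in the abstract are before-and-after rates, so they can be read either as a change in percentage points or as a change relative to the starting rate. The following is a minimal sketch of that arithmetic only; the helper names `point_change` and `relative_change` are hypothetical and do not come from the paper, and nothing here reflects how the authors computed or evaluated their metrics.

```python
def point_change(before: float, after: float) -> float:
    """Absolute change between two rates, in percentage points."""
    return round(after - before, 1)


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting rate."""
    return round(100.0 * (after - before) / before, 1)


# Rates quoted in the abstract, in percent (before, after).
instruction_following = (59.2, 84.8)
instruction_violations = (26.6, 5.8)

for name, (before, after) in [
    ("instruction following", instruction_following),
    ("instruction violations", instruction_violations),
]:
    print(name, point_change(before, after), relative_change(before, after))
```

Read this way, instruction following rose by 25.6 percentage points (about 43.2% relative to its starting rate), and instruction violations fell by 20.8 percentage points (about 78.2% relative to their starting rate).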
