2605.28258v1 May 27, 2026 cs.SE

GUI Agents for Continual Game Generation

Yuanzhe Shen
Ruihan Yang
Bo Li
Qingyi Si
Hongcheng Guo
Haonan Ge
Yixu Huang
Na Li
Zhe Wang
Kai Chen
Guangjin Wang

Generating a game is not the same as making one that can be played. Despite advances in code generation, existing approaches treat game generation as a one-shot translation from prompt to artifact, leaving interaction-level failures undetected. We argue that evaluating and improving game generation requires a player, and we study two roles for graphical user interface (GUI) agents in this process: (1) as an objective evaluator, for which we introduce PlaytestArena, a new evaluation environment that pairs 200 browser-based game generation tasks across eight genres with rubrics of expected in-play behaviors, adjudicated by a GUI agent that loads each build in a browser and plays it; and (2) as a subjective playtester, for which we propose Play2Code, where a game agent and a GUI agent operate in a sustained loop with shared memory, turning game generation into a dialogue between coding and playing. Our experiments show that even frontier models struggle to generate playable games directly, while Play2Code achieves a 66.8% rubric pass rate, improving over single-pass and agentic-coding baselines by 37.1 and 14.6 points, respectively. Further analysis shows that GUI playtester feedback is more traceable than a human report, yet idiosyncratic in ways reminiscent of human testers, establishing game playtesting as a critical testbed for interactive code generation. Our project website is available at https://continual-game-generation.vercel.app/.
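
To make the coding-and-playing loop concrete, the following is a minimal sketch of how such a loop might be wired up. It is not the authors' implementation: `SharedMemory`, `play_to_code_loop`, `rubric_pass_rate`, and the `write_game` / `play_game` callables are hypothetical names introduced only for illustration. The sketch assumes the playtester returns one pass/fail verdict per rubric item plus free-text feedback.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SharedMemory:
    """Notes that both agents can read and append to (hypothetical structure)."""
    notes: List[str] = field(default_factory=list)


def rubric_pass_rate(results: List[bool]) -> float:
    """Percentage of rubric items judged as passed; 0.0 for an empty rubric."""
    return 100.0 * sum(results) / len(results) if results else 0.0


def play_to_code_loop(
    prompt: str,
    write_game: Callable[[str, SharedMemory], str],
    play_game: Callable[[str, SharedMemory], Tuple[List[bool], List[str]]],
    max_rounds: int = 3,
) -> Tuple[str, float, SharedMemory]:
    """Alternate between a coding agent and a playtesting agent.

    write_game: assumed to return a new build given the prompt and memory.
    play_game:  assumed to return (per-item rubric verdicts, feedback notes).
    """
    memory = SharedMemory()
    build, results = "", []
    for _ in range(max_rounds):
        build = write_game(prompt, memory)            # coding turn
        results, feedback = play_game(build, memory)  # playing turn
        memory.notes.extend(feedback)                 # feedback persists across rounds
        if results and all(results):                  # stop once every item passes
            break
    return build, rubric_pass_rate(results), memory
```

In practice, `play_game` would drive a real browser session, and `write_game` would call a code-generation model; both are left as plain callables here so the control flow can be read and tested in isolation.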

