2605.26114v1 May 25, 2026 cs.AI

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research

Haiyang Wang

Shuzhe Wu

Han Xiao

Di Wu

Ruitao Hao

Zheng Li

Boji Zhou

Zheng Ju

Zichen Liu

Lu Fan

Zhaoxiang Zhang

We present MobileGym, a browser-hosted, lightweight, fully controllable environment for everyday mobile use, targeting interaction fidelity without replicating proprietary backends. It enables two capabilities previously out of reach for everyday apps: verifiable outcome signals through deterministic state-based judging over structured JSON state, and scalable online RL through low-cost parallel rollouts. The full environment state is captured, configured, forked, and compared as structured JSON, and a single server can host hundreds of parallel instances, with about 400 MB memory per instance and about 3 s cold start. A layered state model and a declarative task-definition framework keep state programmability and task creation practical at scale, and a single programmatic judging mechanism delivers both deterministic evaluation verdicts and dense RL rewards. The accompanying MobileGym-Bench provides 416 parameterized task templates, including 256 test and 160 train templates, over 28 apps, with deterministic judges and a structured AnswerSheet protocol that avoids free-text matching failures. In a Sim-to-Real case study, GRPO on Qwen3-VL-4B-Instruct gains +12.8 percentage points on the 256-task test set, and on a 59-task real-device signal subset, real-device execution retains 95.1% of the simulation-side training gain. Project page: https://mobilegym.github.io.
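To make the idea of state-based judging concrete, the following is a minimal sketch of how a single programmatic judge over structured JSON state could yield both a deterministic verdict and a dense reward. It is not MobileGym's actual API: the state layout, the dotted-path syntax, and the names `get_path`, `fork`, and `judge` are all hypothetical, and the reward (fraction of satisfied conditions) is an assumed shaping choice.

```python
import json


def get_path(state, path):
    """Walk a dotted path such as 'apps.clock.alarms.0.enabled' through dicts and lists."""
    node = state
    for key in path.split("."):
        node = node[int(key)] if isinstance(node, list) else node[key]
    return node


def fork(state):
    """Fork a state by round-tripping it through JSON (hypothetical helper)."""
    return json.loads(json.dumps(state))


def judge(state, conditions):
    """Return (verdict, reward) for a list of (path, expected_value) conditions.

    verdict is True only if every condition holds; reward is the fraction
    of conditions satisfied (an assumed form of dense reward).
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    hits = 0
    for path, expected in conditions:
        try:
            if get_path(state, path) == expected:
                hits += 1
        except (KeyError, IndexError, ValueError, TypeError):
            pass  # a missing or malformed path counts as unsatisfied
    return hits == len(conditions), hits / len(conditions)


# Hypothetical initial state for an alarm task.
initial = {"apps": {"clock": {"alarms": [{"time": "07:00", "enabled": False}]}}}

# Fork the state for one rollout and simulate a partially completed task:
# the agent enabled the alarm but did not change its time.
rollout = fork(initial)
rollout["apps"]["clock"]["alarms"][0]["enabled"] = True

conditions = [
    ("apps.clock.alarms.0.enabled", True),
    ("apps.clock.alarms.0.time", "07:30"),
]
verdict, reward = judge(rollout, conditions)
print(verdict, reward)
```

Because the judge only compares values in the captured state, the same rollout always receives the same verdict, and a partially completed task still earns a nonzero reward.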
