2605.29563v1 May 28, 2026 cs.AI

Planning with the Views via Scene Self-Exploration

Manling Li
Fei-Fei Li
Jiajun Wu
Zhengyuan Yang
Lijuan Wang
Kangrui Wang
Linjie Li (Microsoft)
Shiqi Chen
Zihan Wang
Leonidas J. Guibas

Can VLMs predict how each camera move changes the view, and plan many such moves ahead? We call this capability view planning. It requires (1) understanding how a single action transforms the view, and (2) composing many such transformations across multi-turn plans to identify a target view. We probe both abilities in our proposed ViewSuite, a 3D point-cloud environment built on real ScanNet scenes. Across 13 frontier VLMs, a critical planning gap emerges: the models possess basic view-action knowledge but fail to compose it across multi-turn plans, and the gap widens as viewpoint distance grows. To close this gap, we propose an iterative framework that alternates self-exploration with view graph distillation. The key insight is that all exploration trajectories, regardless of their outcome, collectively form a view graph that compactly captures how viewpoints connect across a scene. Distilling this graph into diverse supervised tasks reshapes the policy distribution and overcomes the sparse rewards that stall pure RL. This improves Qwen2.5-VL-7B from 2.5% to 47.8% on interactive view planning, surpassing GPT-5.4 Pro (18.5%) and Gemini 3.1 Pro (21.4%). Self-exploration emerges as a promising path toward VLMs that can actively reason and plan in 3D space.
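
The view-graph idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the trajectory format (lists of `(view, action, next_view)` steps), the view IDs, the action names, and the helper names `build_view_graph` and `shortest_plan` are all hypothetical, and breadth-first search stands in for whatever distillation procedure the paper actually uses. The sketch only shows how steps from several trajectories, successful or not, might be merged into one graph from which a multi-step plan to a target view can be read off.

```python
from collections import defaultdict, deque


def build_view_graph(trajectories):
    """Merge (view, action, next_view) steps from all trajectories into one graph."""
    graph = defaultdict(dict)
    for trajectory in trajectories:
        for view, action, next_view in trajectory:
            graph[view][action] = next_view
    return graph


def shortest_plan(graph, start, target):
    """Breadth-first search for a shortest action sequence from start to target.

    Returns a list of actions, or None if the target is unreachable.
    """
    if start == target:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        view, plan = queue.popleft()
        for action, next_view in graph.get(view, {}).items():
            if next_view in seen:
                continue
            if next_view == target:
                return plan + [action]
            seen.add(next_view)
            queue.append((next_view, plan + [action]))
    return None


# Hypothetical exploration data: neither trajectory alone goes from v0 to v4
# in a way that was planned in advance, but together they connect the views.
trajectories = [
    [("v0", "turn_left", "v1"), ("v1", "move_forward", "v2")],
    [("v0", "move_forward", "v3"), ("v3", "turn_left", "v2"),
     ("v2", "turn_right", "v4")],
]
graph = build_view_graph(trajectories)
plan = shortest_plan(graph, "v0", "v4")
print(plan)
```

Under these assumptions, each `(start, target, plan)` triple recovered from the graph could serve as one supervised example, which is one plausible reading of how a graph yields training signal even when individual trajectories miss their target.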
