2606.08952v1 Jun 08, 2026 cs.AI

AlloSpatial: Agentic Harness Framework for Spatial Reasoning in Foundation Models

Zhenyu Wu
Shouwei Ruan
Qihui Zhu
Yubin Wang
Bin Wang
Yuxiang Zhang
Xingxing Wei
Jingzhi Li

Multimodal Foundation Models (MFMs) have made substantial progress, yet remain fragile in spatial reasoning over the physical world. A key bottleneck is their inability to transform local egocentric observations into a global allocentric spatial representation. To address this, we propose AlloSpatial, an agentic framework for allocentric spatial cognition in foundation models. AlloSpatial introduces World2Mind, a plug-and-play cognitive mapping sandbox that converts egocentric observations into structured allocentric priors, including Allocentric-Spatial Trees (ASTs) and route maps that support querying object topology, geometric relations, passability, and trajectories. To use these priors reliably under noisy reconstruction and ambiguous visual evidence, AlloSpatial introduces a Spatial Reasoning Harness for tool-use judgment, modality-decoupled cue collection, and geometry-semantic arbitration. We further internalize this process in Qwen3-VL through cold-start reinforcement learning with a harness-gated trajectory-level reward. Experiments on VSI-Bench and MindCube show that AlloSpatial improves proprietary models by 5%-18% in a training-free setting, while ASTs alone support strong spatial reasoning even when visual inputs are removed. The trained AlloSpatial agents further outperform larger general-purpose models and competitive spatial baselines, suggesting that structured allocentric representations, active tool use, and verifiable reasoning offer a promising route toward spatially capable foundation models.
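To make the egocentric-to-allocentric step concrete, the following is a minimal sketch of the underlying geometry only, not the World2Mind implementation, which the abstract does not specify. It assumes a simplified 2D setting in which each view reports an object's offset in the camera frame (forward, left) together with a known camera position and heading. The names `ego_to_allo`, `observations`, `allo_map`, and `distance` are hypothetical and chosen for illustration.

```python
import math


def ego_to_allo(obs_xy, cam_xy, cam_yaw):
    """Map a point from the camera (egocentric) frame to the world (allocentric) frame.

    obs_xy:  (forward, left) offset of the object relative to the camera.
    cam_xy:  camera position in the world frame.
    cam_yaw: camera heading in radians, counter-clockwise from the world x-axis.
    """
    forward, left = obs_xy
    c, s = math.cos(cam_yaw), math.sin(cam_yaw)
    # 2D rigid transform: rotate by the heading, then translate by the camera position.
    return (cam_xy[0] + c * forward - s * left,
            cam_xy[1] + s * forward + c * left)


# Hypothetical observations: (object, egocentric offset, camera position, camera heading).
observations = [
    ("chair", (2.0, 0.0), (0.0, 0.0), 0.0),          # view 1 faces along +x
    ("table", (1.0, 0.0), (0.0, 0.0), math.pi / 2),  # view 2 faces along +y
]

# A shared allocentric map: every object is expressed in the same world frame.
allo_map = {name: ego_to_allo(obs, cam, yaw) for name, obs, cam, yaw in observations}


def distance(a, b):
    """Geometric relation query: Euclidean distance between two mapped objects."""
    return math.dist(allo_map[a], allo_map[b])
```

Once both objects live in one world frame, a relation such as `distance("chair", "table")` can be answered even though the two objects never appeared in the same view, which is the kind of query a structured allocentric prior is meant to support.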
