2602.11598v1 Feb 12, 2026 cs.RO

ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation

Zedong Chu
Shichao Xie
Xiaolong Wu
Yanfen Shen
Minghua Luo
Zhengbo Wang
X. Leng
Junjun Hu
Mingyang Yin
Jian Lu
Yingnan Guo
Kai Yang
Jiawei Han
Xu Chen
Yanqing Zhu
Yuxiang Zhao
Xin Liu
Yirong Yang
Ye He
Jiahan Wang
Yang Cai
Tianlin Zhang
Li Gao
Liu Liu
Min-peng Sun
Fan Jiang
Chiyu Wang
Zhichen Liu
Hong-Ming Pan
Honglin Han
Zhining Gu
Kuan Yang
Jianfang Zhang
D. Jing
Zi-An Guan
Wei Guo
Guo-qing Liu
Dianzhe Yang
Xiangpo Yang
Meng-Yao Yang
Hongguang Xing
Weiguo Li
Mu Xu
Fei Liu

Abstract

Embodied navigation has long been fragmented by task-specific architectures. We introduce ABot-N0, a unified Vision-Language-Action (VLA) foundation model that achieves a "Grand Unification" across five core tasks: Point-Goal, Object-Goal, Instruction-Following, POI-Goal, and Person-Following. ABot-N0 uses a hierarchical "Brain-Action" architecture that pairs an LLM-based Cognitive Brain for semantic reasoning with a Flow Matching-based Action Expert for precise, continuous trajectory generation. To support large-scale learning, we developed the ABot-N0 Data Engine, which curates 16.9M expert trajectories and 5.0M reasoning samples across 7,802 high-fidelity 3D scenes (10.7 $\text{km}^2$). ABot-N0 achieves new state-of-the-art (SOTA) performance across 7 benchmarks, significantly outperforming specialized models. Furthermore, our Agentic Navigation System integrates a planner with hierarchical topological memory, enabling robust, long-horizon missions in dynamic real-world environments.
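
The abstract does not spell out how the Action Expert works, so the following is only a minimal sketch of the general flow matching idea it names: a sample of Gaussian noise is carried to a trajectory by integrating a velocity field from t = 0 to t = 1. Everything here is an assumption made for illustration. `toy_velocity` is a hypothetical closed-form stand-in for a learned network (it is the exact field for a linear noise-to-data path when the data is a single known trajectory), and `sample_trajectory`, the waypoint layout, and the step count are not taken from the paper.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For the linear path x_t = (1 - t) * x_0 + t * x_1 with a single known
    endpoint x_1 = target, the velocity at (x, t) is (target - x) / (1 - t).
    A real model would predict this from observations instead.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_trajectory(target, num_steps=10, seed=0):
    """Euler-integrate dx/dt = v(x, t) from Gaussian noise at t = 0 to t = 1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # x_0 ~ N(0, I)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so no division by zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Assumed layout: three (x, y) waypoints flattened into one vector.
target = [0.5, 0.0, 1.0, 0.1, 1.5, 0.3]
waypoints = sample_trajectory(target)
```

In this toy setting the integration ends on the target waypoints regardless of the starting noise, which is what makes the sketch easy to check. A trained model would instead condition the velocity field on the robot's observations and the goal, and different noise samples could yield different plausible trajectories.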
