2602.11598v1 Feb 12, 2026 cs.RO

ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation

Zedong Chu (Citations: 91, h-index: 7)
Shichao Xie (Citations: 153, h-index: 7)
Xiaolong Wu (Citations: 134, h-index: 6)
Yanfen Shen (Citations: 22, h-index: 2)
Minghua Luo (Citations: 64, h-index: 5)
Zhengbo Wang (Citations: 14, h-index: 2)
X. Leng (Citations: 29, h-index: 2)
Junjun Hu (Citations: 89, h-index: 6)
Mingyang Yin (Citations: 206, h-index: 8)
Jian Lu (Citations: 29, h-index: 3)
Yingnan Guo (Citations: 19, h-index: 2)
Kai Yang (Citations: 40, h-index: 4)
Jiawei Han (Citations: 200, h-index: 8)
Xu Chen (Citations: 60, h-index: 2)
Yanqing Zhu (Citations: 10, h-index: 2)
Yuxiang Zhao (Citations: 19, h-index: 2)
Xin Liu (Citations: 21, h-index: 2)
Yirong Yang (Citations: 50, h-index: 3)
Ye He (Citations: 15, h-index: 3)
Jiahan Wang (Citations: 36, h-index: 4)
Yang Cai (Citations: 60, h-index: 5)
Tianlin Zhang (Citations: 16, h-index: 2)
Li Gao (Citations: 8, h-index: 1)
Liu Liu (Citations: 10, h-index: 2)
Min-peng Sun (Citations: 8, h-index: 1)
Fan Jiang (Citations: 30, h-index: 4)
Chiyu Wang (Citations: 50, h-index: 4)
Zhichen Liu (Citations: 8, h-index: 1)
Hong-Ming Pan (Citations: 28, h-index: 2)
Honglin Han (Citations: 29, h-index: 3)
Zhining Gu (Citations: 30, h-index: 3)
Kuan Yang (Citations: 19, h-index: 2)
Jianfang Zhang (Citations: 32, h-index: 3)
D. Jing (Citations: 14, h-index: 2)
Zi-An Guan (Citations: 11, h-index: 2)
Wei Guo (Citations: 34, h-index: 2)
Guo-qing Liu (Citations: 11, h-index: 2)
Dianzhe Yang (Citations: 9, h-index: 1)
Xiangpo Yang (Citations: 14, h-index: 2)
Meng-Yao Yang (Citations: 13, h-index: 2)
Hongguang Xing (Citations: 31, h-index: 2)
Weiguo Li (Citations: 13, h-index: 2)
Mu Xu (Citations: 82, h-index: 6)
Fei Liu (Citations: 39, h-index: 5)

Abstract

Embodied navigation has long been fragmented by task-specific architectures. We introduce ABot-N0, a unified Vision-Language-Action (VLA) foundation model that achieves a "Grand Unification" across 5 core tasks: Point-Goal, Object-Goal, Instruction-Following, POI-Goal, and Person-Following. ABot-N0 utilizes a hierarchical "Brain-Action" architecture, pairing an LLM-based Cognitive Brain for semantic reasoning with a Flow Matching-based Action Expert for precise, continuous trajectory generation. To support large-scale learning, we developed the ABot-N0 Data Engine, curating 16.9M expert trajectories and 5.0M reasoning samples across 7,802 high-fidelity 3D scenes (10.7 $\text{km}^2$). ABot-N0 achieves new SOTA performance across 7 benchmarks, significantly outperforming specialized models. Furthermore, our Agentic Navigation System integrates a planner with hierarchical topological memory, enabling robust, long-horizon missions in dynamic real-world environments.
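The abstract does not describe how the Action Expert is implemented, so the following is only a minimal sketch of how flow-matching sampling can produce a continuous trajectory in general. It assumes a straight-line probability path from Gaussian noise to the trajectory and replaces the learned network with a hand-written oracle velocity field. The names `oracle_velocity`, `sample_trajectory`, and `target_waypoints` are hypothetical and do not come from the paper.

```python
import random


def oracle_velocity(x, t, target):
    """Toy stand-in for a learned velocity network (an assumption).

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1, the velocity
    that carries the current point x to the target x_1 at time 1 is
    (x_1 - x) / (1 - t).
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_trajectory(velocity_fn, dim, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps."""
    rng = random.Random(seed)
    # Start from Gaussian noise, as is usual in flow matching.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k / num_steps  # t stays below 1, so 1 - t is never zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical target: three (x, y) waypoints, flattened into one vector.
target_waypoints = [0.5, 0.0, 1.0, 0.1, 1.5, 0.3]

flat = sample_trajectory(
    lambda x, t: oracle_velocity(x, t, target_waypoints),
    dim=len(target_waypoints),
)
waypoints = list(zip(flat[0::2], flat[1::2]))
```

In a real system the oracle would be replaced by a network conditioned on observations and on the output of the reasoning module. The Euler loop itself would stay the same, with `num_steps` trading inference latency against integration accuracy.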

9 Citations

0 Influential

4 Altmetric

29.0 Score
