2602.12978v1 Feb 13, 2026 cs.RO

Learning Native Continuation for Action Chunking Flow Policies

Yufeng Liu (Citations: 7; h-index: 1)
Hang Yu (Citations: 75; h-index: 6)
Juntu Zhao (Citations: 23; h-index: 3)
Bocheng Li (Citations: 118; h-index: 4)
Di Zhang (Citations: 22; h-index: 3)
Ming-Zhe Li (Citations: 35; h-index: 3)
Wenxuan Wu (Citations: 2,097; h-index: 8)
Yingdong Hu (Citations: 974; h-index: 13)
Junyuan Xie (Citations: 22; h-index: 3)
Junliang Guo (Citations: 246; h-index: 5)
Dequan Wang (Citations: 16; h-index: 2)
Yang Gao (Citations: 908; h-index: 16)

Abstract

Action chunking enables Vision-Language-Action (VLA) models to run in real time, but naive chunked execution often exhibits discontinuities at chunk boundaries. Real-Time Chunking (RTC) alleviates this issue but is external to the policy, leading to spurious multimodal switching and trajectories that are not intrinsically smooth. We propose Legato, a training-time continuation method for action-chunked flow-based VLA policies. Specifically, Legato initializes denoising from a schedule-shaped mixture of known actions and noise, exposing the model to partial action information. Moreover, Legato reshapes the learned flow dynamics so that the denoising process remains consistent between training and inference under per-step guidance. Legato further uses a randomized schedule condition during training to support varying inference delays and achieve controllable smoothness. Empirically, Legato produces smoother trajectories and reduces spurious multimodal switching during execution, leading to less hesitation and shorter task completion time. Extensive real-world experiments show that Legato consistently outperforms RTC across five manipulation tasks, achieving approximately 10% improvements in both trajectory smoothness and task completion time.
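To make the idea of initializing denoising from a "schedule-shaped mixture of known actions and noise" concrete, here is a minimal NumPy sketch. The abstract does not specify the shape of the schedule or the mixing rule, so the linear ramp, the convex combination, and the names `schedule_weights`, `init_chunk`, `delay`, and `ramp` are all assumptions made for illustration, not Legato's actual formulation.

```python
import numpy as np


def schedule_weights(horizon: int, delay: int, ramp: int) -> np.ndarray:
    """Per-step weight on the known actions (hypothetical schedule).

    Weight is 1 for the first `delay` steps (actions already committed
    while inference runs), decays linearly over the next `ramp` steps,
    and is 0 for the rest of the chunk.
    """
    steps = np.arange(horizon)
    weights = 1.0 - (steps - delay + 1) / (ramp + 1)
    return np.clip(weights, 0.0, 1.0)


def init_chunk(known_actions: np.ndarray, weights: np.ndarray,
               rng: np.random.Generator) -> np.ndarray:
    """Mix known actions with Gaussian noise, step by step.

    known_actions: (horizon, action_dim); rows whose weight is 0 are
    ignored, so they may hold padding.
    Assumed mixing rule: a convex combination per time step.
    """
    noise = rng.standard_normal(known_actions.shape)
    w = weights[:, None]  # broadcast over the action dimension
    return w * known_actions + (1.0 - w) * noise


rng = np.random.default_rng(0)
horizon, action_dim = 8, 4
known = np.ones((horizon, action_dim))  # stand-in for the previous chunk's overlap
weights = schedule_weights(horizon, delay=2, ramp=3)
x_init = init_chunk(known, weights, rng)
```

In this sketch the first `delay` steps start exactly at the known actions and the tail starts from pure noise. Sampling `delay` and `ramp` at random during training would be one way to mimic the randomized schedule condition the abstract mentions, again as an assumption.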

8 Citations

1 Influential

8 Altmetric

50.0 Score
