2602.13718v1 Feb 14, 2026 cs.RO

HybridFlow: A Two-Step Generative Policy for Robotic Manipulation

Jiamin Wu

Zhen Dong

Jinna Fu

Shengyuan Yu

Fulin Chen

Yide Liu

Abstract

Limited by inference latency, existing robot manipulation policies lack sufficient real-time interaction capability with the environment. Although faster generation methods such as flow matching are gradually replacing diffusion methods, researchers are still pursuing even faster generation suitable for interactive robot control. MeanFlow, a one-step variant of flow matching, has shown strong potential in image generation, but its precision in action generation does not meet the stringent requirements of robotic manipulation. We therefore propose HybridFlow, a 3-stage method with 2 NFE (two function evaluations): Global Jump in MeanFlow mode, ReNoise for distribution alignment, and Local Refine in ReFlow mode. This method balances inference speed and generation quality by leveraging the speed of MeanFlow one-step generation while ensuring action precision with minimal generation steps. In real-world experiments, HybridFlow outperforms the 16-step Diffusion Policy by 15-25% in success rate while reducing inference time from 152 ms to 19 ms (8× speedup, ~52 Hz); it also achieves 70.0% success on unseen-color out-of-distribution (OOD) grasping and 66.3% on deformable object folding. We envision HybridFlow as a practical low-latency method to enhance real-world interaction capabilities of robotic manipulation policies.
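The abstract names the three stages but does not give their update rules, so the following is only a minimal sketch of what a 3-stage, 2-NFE sampler could look like. It assumes the common linear path z_t = (1 - t)·a + t·ε (t = 1 is pure noise, t = 0 is the action), a MeanFlow-style average-velocity network for the jump, and a single Euler step with an instantaneous-velocity network for the refinement. The names `hybrid_sample`, `mean_velocity`, `inst_velocity`, and the re-noising level `t_mid` are hypothetical and not taken from the paper.

```python
import numpy as np

def hybrid_sample(mean_velocity, inst_velocity, obs, action_shape, t_mid=0.2, rng=None):
    """Hypothetical 3-stage sampler using two network evaluations (2 NFE).

    mean_velocity(z, r, t, obs): assumed average velocity over [r, t] (MeanFlow mode).
    inst_velocity(z, t, obs):    assumed instantaneous velocity at t (ReFlow mode).
    t_mid:                       assumed noise level for the ReNoise stage.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Stage 1, Global Jump (NFE 1): one step from pure noise (t=1) straight to t=0.
    noise = rng.standard_normal(action_shape)
    coarse = noise - (1.0 - 0.0) * mean_velocity(noise, 0.0, 1.0, obs)

    # Stage 2, ReNoise (no network call): put the coarse action back on the
    # interpolation path at level t_mid by mixing in fresh noise.
    fresh = rng.standard_normal(action_shape)
    z_mid = (1.0 - t_mid) * coarse + t_mid * fresh

    # Stage 3, Local Refine (NFE 2): one Euler step from t_mid down to t=0.
    return z_mid - t_mid * inst_velocity(z_mid, t_mid, obs)


# Toy usage: when every demonstration is the same action `target`, both velocity
# fields on the linear path equal (z - target) / t, so the sampler should land on
# `target` up to floating-point error. Here `obs` simply carries the target.
def toy_mean_velocity(z, r, t, obs):
    return (z - obs) / t

def toy_inst_velocity(z, t, obs):
    return (z - obs) / t

target = np.array([0.3, -0.1, 0.5])
action = hybrid_sample(toy_mean_velocity, toy_inst_velocity, target, target.shape,
                       rng=np.random.default_rng(0))
```

In this sketch only stages 1 and 3 call a network, which matches the 2-NFE budget stated in the abstract. The reported latency figures are also self-consistent: 152 ms / 19 ms = 8, and 1 / 0.019 s ≈ 52.6 Hz.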
