2603.23983v1 Mar 25, 2026 cs.RO


SafeFlow: Real-Time Text-Driven Humanoid Whole-Body Control via Physics-Guided Rectified Flow and Selective Safety Gating

Sanghwan Kim

Citations: 69

h-index: 2

Hanbyel Cho

KAIST

Citations: 92

h-index: 5

Jeonguk Kang

Citations: 33

h-index: 2

Donghan Koo

Citations: 59

h-index: 3

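Recent advances in real-time interactive text-driven motion generation have enabled humanoids to perform a wide range of motions. However, kinematics-based generators often produce physical errors, yielding motion trajectories that a downstream motion tracking controller struggles to follow or that are unsafe in real-world settings. These errors stem from the lack of explicit physics-based objectives for real-robot execution, and they become more severe under out-of-distribution (OOD) user inputs. We therefore propose SafeFlow, a text-driven humanoid whole-body control framework that combines physics-guided motion generation with a 3-stage safety gate driven by explicit risk indicators. SafeFlow adopts a two-level architecture. At the high level, it uses physics-guided rectified flow matching in a VAE latent space to improve real-robot executability, and it further accelerates sampling via Reflow, reducing the number of function evaluations (NFE) for real-time control. The 3-stage safety gate detects semantic OOD prompts using a Mahalanobis score in text-embedding space, filters unstable generations using a directional sensitivity discrepancy metric, and enforces final hard kinematic constraints such as joint and velocity limits before passing the generated trajectory to the low-level motion tracking controller. Extensive experiments on the Unitree G1 robot show that SafeFlow outperforms prior diffusion-based methods in success rate, physical compliance, and inference speed while maintaining diverse expressiveness.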

Original Abstract

Recent advances in real-time interactive text-driven motion generation have enabled humanoids to perform diverse behaviors. However, kinematics-only generators often exhibit physical hallucinations, producing motion trajectories that are physically infeasible to track with a downstream motion tracking controller or unsafe for real-world deployment. These failures often arise from the lack of explicit physics-aware objectives for real-robot execution and become more severe under out-of-distribution (OOD) user inputs. Hence, we propose SafeFlow, a text-driven humanoid whole-body control framework that combines physics-guided motion generation with a 3-Stage Safety Gate driven by explicit risk indicators. SafeFlow adopts a two-level architecture. At the high level, we generate motion trajectories using Physics-Guided Rectified Flow Matching in a VAE latent space to improve real-robot executability, and further accelerate sampling via Reflow to reduce the number of function evaluations (NFE) for real-time control. The 3-Stage Safety Gate enables selective execution by detecting semantic OOD prompts using a Mahalanobis score in text-embedding space, filtering unstable generations via a directional sensitivity discrepancy metric, and enforcing final hard kinematic constraints such as joint and velocity limits before passing the generated trajectory to a low-level motion tracking controller. Extensive experiments on the Unitree G1 demonstrate that SafeFlow outperforms prior diffusion-based methods in success rate, physical compliance, and inference speed, while maintaining diverse expressiveness.
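The abstract names three mechanisms that are easy to picture in code: few-step sampling along a flow, a Mahalanobis score for OOD detection, and hard joint and velocity limits. The following is a minimal illustrative sketch, not the authors' implementation. All function names are hypothetical, the diagonal covariance is a simplifying assumption (the abstract does not state the covariance structure), and the directional sensitivity discrepancy metric is omitted because the abstract does not define it.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 z0: Sequence[float], nfe: int) -> Vector:
    """Integrate dz/dt = velocity(z, t) from t=0 to t=1 using `nfe` Euler steps.

    Each step costs one function evaluation, so `nfe` is the NFE budget.
    """
    z = list(z0)
    dt = 1.0 / nfe
    for k in range(nfe):
        v = velocity(z, k * dt)
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z


def mahalanobis_diag(x: Sequence[float], mean: Sequence[float],
                     var: Sequence[float]) -> float:
    """Mahalanobis distance under an ASSUMED diagonal covariance `var`."""
    return math.sqrt(sum((xi - mi) ** 2 / vi
                         for xi, mi, vi in zip(x, mean, var)))


def within_limits(traj: Sequence[Sequence[float]],
                  q_min: Sequence[float], q_max: Sequence[float],
                  v_max: Sequence[float], dt: float) -> bool:
    """Check joint position limits and finite-difference velocity limits."""
    for frame in traj:
        if any(q < lo or q > hi for q, lo, hi in zip(frame, q_min, q_max)):
            return False
    for prev, cur in zip(traj, traj[1:]):
        if any(abs(c - p) / dt > vm for p, c, vm in zip(prev, cur, v_max)):
            return False
    return True


if __name__ == "__main__":
    # Toy constant velocity field: a perfectly straight flow, so a single
    # Euler step already lands on the endpoint.
    straight = lambda z, t: [1.0, -2.0]
    print(euler_sample(straight, [0.0, 0.0], nfe=1))

    # A prompt embedding far from the training mean gets a high score.
    print(mahalanobis_diag([3.0, 0.0], [0.0, 0.0], [1.0, 4.0]))

    # A trajectory with a sudden jump violates the velocity limit.
    print(within_limits([[0.0], [0.5]], [-1.0], [1.0], [2.0], dt=0.1))
```

The straight-line toy field illustrates why straighter flows help: when the velocity is constant along the path, one Euler step is exact, which is the motivation usually given for Reflow-style acceleration. In a gate like the one described, a threshold on the Mahalanobis score would decide whether a prompt is rejected; the threshold value is not given in the abstract and is left out here.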

1 Citations

0 Influential

2.5 Altmetric

13.5 Score

