2602.15060v2 Feb 13, 2026 cs.RO

CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation

Tengjie Zhu (citations: 6, h-index: 2)
G. Cai (citations: 5, h-index: 1)
Zhaohui Yang (citations: 136, h-index: 4)
Guanzhu Ren (citations: 4, h-index: 1)
Hao-Ran Xie (citations: 23, h-index: 4)
Zirui Wang (citations: 6,190, h-index: 21)
Jun Wu (citations: 15, h-index: 2)
Jingbo Wang (citations: 124, h-index: 4)
Xiaokang Yang (citations: 375, h-index: 6)
Yao Mu (citations: 112, h-index: 5)
Yi-ke Yan (citations: 5, h-index: 1)

Original Abstract

Long-horizon whole-body humanoid teleoperation remains challenging due to accumulated global pose drift, particularly on full-sized humanoids. Although recent learning-based tracking methods enable agile and coordinated motions, they typically operate in the robot's local frame and neglect global pose feedback, leading to drift and instability during extended execution. In this work, we present CLOT, a real-time whole-body humanoid teleoperation system that achieves closed-loop global motion tracking via high-frequency localization feedback. CLOT synchronizes operator and robot poses in a closed loop, enabling drift-free human-to-humanoid mimicry over long time horizons. However, directly imposing global tracking rewards in reinforcement learning often results in aggressive and brittle corrections. To address this, we propose a data-driven randomization strategy that decouples observation trajectories from reward evaluation, enabling smooth and stable global corrections. We further regularize the policy with an adversarial motion prior to suppress unnatural behaviors. To support CLOT, we collect 20 hours of carefully curated human motion data for training the humanoid teleoperation policy. We design a transformer-based policy and train it for over 1300 GPU hours. The policy is deployed on a full-sized humanoid with 31 DoF (excluding hands). Both simulation and real-world experiments verify high-dynamic motion, high-precision tracking, and strong robustness in sim-to-real humanoid teleoperation. Motion data, demos, and code can be found on our website.
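The drift argument in the abstract can be illustrated with a toy one-dimensional model. The sketch below is not the CLOT controller or its learned policy. It is a hypothetical proportional-feedback example, with made-up values for `bias`, `v_ref`, `dt`, and `gain`, that only shows why tracking without global pose feedback accumulates error while a closed loop keeps it bounded.

```python
def simulate(steps, gain, bias=0.01, v_ref=0.5, dt=0.02):
    """Return the final global position error |reference - robot| in metres.

    All parameters are illustrative assumptions, not values from the paper.
    gain = 0 mimics local-frame tracking with no global pose feedback;
    gain > 0 feeds the measured global position error back into the command.
    """
    ref_pos = 0.0    # operator (reference) position in the global frame
    robot_pos = 0.0  # robot position in the global frame
    for _ in range(steps):
        ref_pos += v_ref * dt
        # Closed-loop term: proportional correction from localization feedback.
        command = v_ref + gain * (ref_pos - robot_pos)
        # Hypothetical execution error: the robot moves slightly slower
        # than commanded (for example, foot slip).
        robot_pos += (command - bias) * dt
    return abs(ref_pos - robot_pos)


open_loop_error = simulate(steps=3000, gain=0.0)
closed_loop_error = simulate(steps=3000, gain=2.0)
print(f"open loop:   {open_loop_error:.3f} m")
print(f"closed loop: {closed_loop_error:.3f} m")
```

With these assumed numbers, the open-loop error grows linearly by `bias * dt` per step and reaches about 0.6 m after 3000 steps, whereas the closed-loop error settles below 1 cm. A real system would replace the proportional term with the learned policy and its localization input.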

4 Citations

0 Influential

10.5 Altmetric

56.5 Score
