2601.07718v1 Jan 12, 2026 cs.RO

Hiking in the Wild: A Scalable Perceptive Parkour Framework for Humanoids

Shaoting Zhu (Citations: 144, h-index: 7)

Ziwen Zhuang (Citations: 463, h-index: 5)

Hang Zhao (Citations: 452, h-index: 4)

Mengjie Zhao (Citations: 10, h-index: 2)

Kunhee Lee (Citations: 7, h-index: 1)

Original Abstract

Achieving robust humanoid hiking in complex, unstructured environments requires transitioning from reactive proprioception to proactive perception. However, integrating exteroception remains a significant challenge: mapping-based methods suffer from state estimation drift; for instance, LiDAR-based methods do not handle torso jitter well. Existing end-to-end approaches often struggle with scalability and training complexity; specifically, some previous works using virtual obstacles are implemented case-by-case. In this work, we present "Hiking in the Wild", a scalable, end-to-end parkour perceptive framework designed for robust humanoid hiking. To ensure safety and training stability, we introduce two key mechanisms: a foothold safety mechanism combining scalable "Terrain Edge Detection" with "Foot Volume Points" to prevent catastrophic slippage on edges, and a "Flat Patch Sampling" strategy that mitigates reward hacking by generating feasible navigation targets. Our approach utilizes a single-stage reinforcement learning scheme, mapping raw depth inputs and proprioception directly to joint actions, without relying on external state estimation. Extensive field experiments on a full-size humanoid demonstrate that our policy enables robust traversal of complex terrains at speeds up to 2.5 m/s. The training and deployment code is open-sourced to facilitate reproducible research and deployment on real robots with minimal hardware modifications.
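The abstract only names the Flat Patch Sampling strategy and does not describe how it is implemented. As a rough illustration of one plausible reading (navigation targets are drawn only from locally flat terrain cells, so the policy is never rewarded for reaching an infeasible goal), a minimal sketch might look like the following. The heightmap grid, the `radius` and `max_height_diff` parameters, and both function names are assumptions made for this example and are not taken from the paper or its released code.

```python
import random


def find_flat_cells(heightmap, radius=1, max_height_diff=0.05):
    """Return (row, col) cells whose square neighbourhood is nearly level.

    Hypothetical helper: a cell counts as flat when the height range inside
    its (2 * radius + 1)-wide window is at most max_height_diff (metres).
    """
    rows, cols = len(heightmap), len(heightmap[0])
    flat = []
    for i in range(radius, rows - radius):
        for j in range(radius, cols - radius):
            window = [
                heightmap[a][b]
                for a in range(i - radius, i + radius + 1)
                for b in range(j - radius, j + radius + 1)
            ]
            if max(window) - min(window) <= max_height_diff:
                flat.append((i, j))
    return flat


def sample_flat_targets(heightmap, num_targets, rng, radius=1, max_height_diff=0.05):
    """Draw navigation targets uniformly (with replacement) from flat cells."""
    candidates = find_flat_cells(heightmap, radius, max_height_diff)
    if not candidates:
        raise ValueError("no flat patch found in this heightmap")
    return [rng.choice(candidates) for _ in range(num_targets)]


# Toy terrain: an 8 x 8 grid with a 0.3 m step between columns 3 and 4.
heightmap = [[0.0] * 4 + [0.3] * 4 for _ in range(8)]
rng = random.Random(0)
targets = sample_flat_targets(heightmap, 5, rng)
```

In this toy terrain, cells in columns 3 and 4 are rejected because their windows straddle the step, so every sampled target lies on one of the two level regions. A real training pipeline would presumably run a similar filter on simulator terrain meshes rather than on a small Python list.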

8 Citations | 0 Influential | 3.5 Altmetric | 25.5 Score
