2604.03131v1 Apr 03, 2026 cs.CR

A Systematic Security Evaluation of OpenClaw and Its Variants

Haichang Gao (Citations: 27, h-index: 2)
Zhenxing Niu (Citations: 46, h-index: 3)
Yuhang Wang (Citations: 27, h-index: 2)
Shiguo Lian (Citations: 309, h-index: 6)
Zhaoxiang Liu (Citations: 29, h-index: 3)
Wenjing Zhang (Citations: 208, h-index: 5)
Xiang Wang (Citations: 403, h-index: 10)

Abstract

Tool-augmented AI agents substantially extend the practical capabilities of large language models, but they also introduce security risks that cannot be identified through model-only evaluation. In this paper, we present a systematic security assessment of six representative OpenClaw-series agent frameworks, namely OpenClaw, AutoClaw, QClaw, KimiClaw, MaxClaw, and ArkClaw, under multiple backbone models. To support this study, we construct a benchmark of 205 test cases covering representative attack behaviors across the full agent execution lifecycle, enabling unified evaluation of risk exposure at both the framework and model levels. Our results show that all evaluated agents exhibit substantial security vulnerabilities, and that agentized systems are significantly riskier than their underlying models used in isolation. In particular, reconnaissance and discovery behaviors emerge as the most common weaknesses, while different frameworks expose distinct high-risk profiles, including credential leakage, lateral movement, privilege escalation, and resource development. These findings indicate that the security of modern agent systems is shaped not only by the safety properties of the backbone model, but also by the coupling among model capability, tool use, multi-step planning, and runtime orchestration. We further show that once an agent is granted execution capability and persistent runtime context, weaknesses arising in early stages can be amplified into concrete system-level failures. Overall, our study highlights the need to move beyond prompt-level safeguards toward lifecycle-wide security governance for intelligent agent frameworks.
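
The two-level evaluation described in the abstract (risk exposure per framework and per backbone model) can be pictured as a simple tally over test-case outcomes. The sketch below is only an illustration under stated assumptions: the `Outcome` record, the `unsafe_rate` helper, the `"model-only"` label for a backbone model used in isolation, the model label, and the sample records are all hypothetical and are not taken from the paper's benchmark or results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    framework: str  # e.g. "OpenClaw"; "model-only" marks the bare backbone model (assumed label)
    model: str      # backbone model label (hypothetical)
    tactic: str     # attack category, e.g. "discovery"
    unsafe: bool    # True if the run carried out the unsafe behavior


def unsafe_rate(outcomes, key):
    """Return {group: fraction of unsafe runs}, grouping records by `key`."""
    totals = defaultdict(int)
    unsafe = defaultdict(int)
    for o in outcomes:
        group = key(o)
        totals[group] += 1
        unsafe[group] += int(o.unsafe)
    return {g: unsafe[g] / totals[g] for g in totals}


# Placeholder records for illustration only; not data from the paper.
outcomes = [
    Outcome("OpenClaw", "model-a", "discovery", True),
    Outcome("OpenClaw", "model-a", "credential-leakage", False),
    Outcome("AutoClaw", "model-a", "discovery", True),
    Outcome("AutoClaw", "model-a", "privilege-escalation", True),
    Outcome("model-only", "model-a", "discovery", False),
    Outcome("model-only", "model-a", "privilege-escalation", False),
]

# Framework-level view: how often each framework (or the bare model) acted unsafely.
by_framework = unsafe_rate(outcomes, key=lambda o: o.framework)
# Tactic-level view: which attack categories succeed most often overall.
by_tactic = unsafe_rate(outcomes, key=lambda o: o.tactic)
```

Grouping by `o.model` instead would give the model-level view; comparing the `"model-only"` entry against the framework entries is one way to express the agent-versus-isolated-model contrast the abstract draws.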

Citations: 6 (2 influential)
Altmetric: 5
Score: 35.0
