2604.18463v1 Apr 20, 2026 cs.AI

Systematic safety risks arising in robot planning with large language models

Using large language models for embodied planning introduces systematic safety risks

Fan Shi

Citations: 63

h-index: 3

Manling Li

Citations: 651

h-index: 9

Kaixian Qu

Citations: 66

h-index: 5

Zhibin Li

Citations: 127

h-index: 6

Jiajun Wu

Citations: 107

h-index: 3

Marco Hutter

Citations: 138

h-index: 6

Tao Zhang

Citations: 27

h-index: 3

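Large language models are increasingly used as planning tools for robotic systems, but how safely they plan remains an open research question. To evaluate safety systematically, we introduce DESPITE, a benchmark of 12,279 tasks that covers both physical and normative dangers and provides fully deterministic validation. An analysis of 23 models shows that even near-perfect planning ability does not guarantee safety: the model with the strongest planning ability failed to produce a valid plan on only 0.4% of tasks, yet produced dangerous plans on 28.3% of tasks. Across 18 open-source models ranging from 3 billion to 671 billion parameters, planning ability improved substantially with model size (from 0.4% to 99.3%), while safety awareness stayed relatively flat (from 38% to 57%). We find a multiplicative relationship between these two capacities: larger models complete more tasks safely mainly because their planning improves, not because they become better at avoiding danger. Three proprietary reasoning models showed notably higher safety awareness (71% to 81%), whereas non-reasoning proprietary models and open-source reasoning models all remained below 57%. As the planning ability of frontier models approaches saturation, improving safety awareness becomes the central challenge for deploying language-model planners in robotic systems.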

Original Abstract

Large language models are increasingly used as planners for robotic systems, yet how safely they plan remains an open question. To evaluate safe planning systematically, we introduce DESPITE, a benchmark of 12,279 tasks spanning physical and normative dangers with fully deterministic validation. Across 23 models, even near-perfect planning ability does not ensure safety: the best-planning model fails to produce a valid plan on only 0.4% of tasks but produces dangerous plans on 28.3%. Among 18 open-source models from 3B to 671B parameters, planning ability improves substantially with scale (0.4-99.3%) while safety awareness remains relatively flat (38-57%). We identify a multiplicative relationship between these two capacities, showing that larger models complete more tasks safely primarily through improved planning, not through better danger avoidance. Three proprietary reasoning models reach notably higher safety awareness (71-81%), while non-reasoning proprietary models and open-source reasoning models remain below 57%. As planning ability approaches saturation for frontier models, improving safety awareness becomes a central challenge for deploying language-model planners in robotic systems.
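
The multiplicative relationship mentioned in the abstract can be illustrated with a short numerical sketch. The abstract does not state the exact formula, so the product form below (safe completion rate ≈ planning ability × safety awareness) is an assumption, as are the function name `safe_completion_rate` and the pairing of range endpoints; the figures themselves are the ranges the abstract reports for open-source models.

```python
def safe_completion_rate(planning_ability: float, safety_awareness: float) -> float:
    """Fraction of tasks completed both validly and safely.

    Assumed product form, not the paper's stated definition.
    Both inputs are rates in [0, 1].
    """
    if not (0.0 <= planning_ability <= 1.0 and 0.0 <= safety_awareness <= 1.0):
        raise ValueError("rates must lie in [0, 1]")
    return planning_ability * safety_awareness


# Range endpoints from the abstract (open-source models, 3B to 671B).
# Pairing low with low and high with high is an illustrative assumption;
# the abstract does not say which model attains which value.
small = safe_completion_rate(0.004, 0.38)
large = safe_completion_rate(0.993, 0.57)

# Counterfactual: planning improves to 99.3% while safety awareness stays at 38%.
large_planning_only = safe_completion_rate(0.993, 0.38)

print(f"small model:          {small:.4f}")
print(f"large model:          {large:.4f}")
print(f"planning gains alone: {large_planning_only:.4f}")
```

Under these assumptions, most of the increase in safe completion already appears in the counterfactual where only planning ability improves. That matches the abstract's claim that scale helps mainly through better planning rather than better danger avoidance.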
