2603.21523v1 Mar 23, 2026 cs.RO

SafePilot: A Framework for Assuring LLM-enabled Cyber-Physical Systems

Weizhe Xu (Citations: 13, h-index: 1)

Fanxin Kong (Citations: 68, h-index: 5)

Mengyu Liu (Citations: 98, h-index: 7)

Original Abstract

Large Language Models (LLMs), deep learning architectures with typically over 10 billion parameters, have recently begun to be integrated into various cyber-physical systems (CPS) such as robotics, industrial automation, and autopilot systems. The abstract knowledge and reasoning capabilities of LLMs are employed for tasks like planning and navigation. However, a significant challenge arises from the tendency of LLMs to produce "hallucinations" - outputs that are coherent yet factually incorrect or contextually unsuitable. This characteristic can lead to undesirable or unsafe actions in the CPS. Therefore, our research focuses on assuring the LLM-enabled CPS by enhancing their critical properties. We propose SafePilot, a novel hierarchical neuro-symbolic framework that provides end-to-end assurance for LLM-enabled CPS according to attribute-based and temporal specifications. Given a task and its specification, SafePilot first invokes a hierarchical planner with a discriminator that assesses task complexity. If the task is deemed manageable, it is passed directly to an LLM-based task planner with built-in verification. Otherwise, the hierarchical planner applies a divide-and-conquer strategy, decomposing the task into sub-tasks, each of which is individually planned and later merged into a final solution. The LLM-based task planner translates natural language constraints into formal specifications and verifies the LLM's output against them. If violations are detected, it identifies the flaw, adjusts the prompt accordingly, and re-invokes the LLM. This iterative process continues until a valid plan is produced or a predefined limit is reached. Our framework supports LLM-enabled CPS with both attribute-based and temporal constraints. Its effectiveness and adaptability are demonstrated through two illustrative case studies.
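The control flow described in the abstract (a complexity check, optional decomposition, and a verify-and-re-prompt loop with an attempt limit) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `plan_with_verification`, `hierarchical_plan`, `pick_before_place`, and `stub_llm` are hypothetical, the LLM is replaced by a plain callable, formal specifications are reduced to Python predicates, and merging is assumed to be simple concatenation of sub-plans. The abstract does not specify any of these details.

```python
from typing import Callable, List, Optional

Plan = List[str]
# A checker returns None if the plan satisfies its constraint,
# otherwise a short description of the violation.
Checker = Callable[[Plan], Optional[str]]
# Hypothetical stand-in for an LLM call: prompt text in, candidate plan out.
Planner = Callable[[str], Plan]


def plan_with_verification(task: str, llm: Planner, checkers: List[Checker],
                           max_attempts: int = 3) -> Optional[Plan]:
    """Re-invoke the planner with violation feedback until a plan passes."""
    prompt = task
    for _ in range(max_attempts):
        plan = llm(prompt)
        violations = [m for m in (check(plan) for check in checkers) if m is not None]
        if not violations:
            return plan
        # Assumed feedback format: append the detected flaws to the task text.
        prompt = task + "\nFix these violations: " + "; ".join(violations)
    return None  # predefined limit reached without a valid plan


def hierarchical_plan(task: str, llm: Planner, checkers: List[Checker],
                      is_manageable: Callable[[str], bool],
                      decompose: Callable[[str], List[str]],
                      max_attempts: int = 3) -> Optional[Plan]:
    """Plan directly if the discriminator accepts the task, else divide and conquer."""
    if is_manageable(task):
        return plan_with_verification(task, llm, checkers, max_attempts)
    merged: Plan = []
    for sub_task in decompose(task):
        sub_plan = plan_with_verification(sub_task, llm, checkers, max_attempts)
        if sub_plan is None:
            return None
        merged.extend(sub_plan)  # assumption: merging is plain concatenation
    return merged


def pick_before_place(plan: Plan) -> Optional[str]:
    """Toy temporal constraint: 'pick' must precede 'place'."""
    if "place" in plan and ("pick" not in plan or plan.index("pick") > plan.index("place")):
        return "'pick' must occur before 'place'"
    return None


def stub_llm(prompt: str) -> Plan:
    """Toy planner that only produces a valid order after receiving feedback."""
    if "Fix these violations" in prompt:
        return ["pick", "place"]
    return ["place", "pick"]
```

With these stubs, `plan_with_verification("move block", stub_llm, [pick_before_place])` fails verification on the first attempt and succeeds on the second, while `max_attempts=1` returns `None`. A real system would presumably check plans against a formal specification language and might re-verify the merged plan as a whole; the sketch does neither.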
