2603.30016v1 Mar 31, 2026 cs.CR

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Chaowei Xiao (Citations: 17, h-index: 2)
Hanshen Xiao (Citations: 3, h-index: 1)
G. E. Suh (Citations: 3, h-index: 1)
Sanjay Kariyappa (Citations: 45, h-index: 4)
Chong Xiang (Citations: 7, h-index: 1)
Drew Zagieboylo (Citations: 67, h-index: 4)
Shaona Ghosh (Citations: 551, h-index: 10)
Kai Greshake (Citations: 1,260, h-index: 3)

Original Abstract

AI agents, predominantly powered by large language models (LLMs), are vulnerable to indirect prompt injection, in which malicious instructions embedded in untrusted data can trigger dangerous agent actions. This position paper discusses our vision for system-level defenses against indirect prompt injection attacks. We articulate three positions: (1) dynamic replanning and security policy updates are often necessary for dynamic tasks and realistic environments; (2) certain context-dependent security decisions would still require LLMs (or other learned models), but should only be made within system designs that strictly constrain what the model can observe and decide; (3) in inherently ambiguous cases, personalization and human interaction should be treated as core design considerations. In addition to our main positions, we discuss limitations of existing benchmarks that can create a false sense of utility and security. We also highlight the value of system-level defenses, which serve as the skeleton of agentic systems by structuring and controlling agent behaviors, integrating rule-based and model-based security checks, and enabling more targeted research on model robustness and human interaction.

Citations: 1 | Influential citations: 0 | Altmetric: 5 | Score: 26.0
