2602.20720v1 Feb 24, 2026 cs.CR

AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs

Che Wang

Citations: 14

h-index: 2

Jiaming Zhang

Citations: 16

h-index: 3

Ziqi Zhang

Citations: 56

h-index: 3

Zijie J. Wang

Georgia Tech

Citations: 4,367

h-index: 22

Tao Wei

Citations: 33

h-index: 3

Wei Yang Bryan Lim

Citations: 71

h-index: 4

Yinghui Wang

Citations: 129

h-index: 3

Jianbo Gao

Citations: 442

h-index: 11

Zhong Chen

Citations: 146

h-index: 6

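Integrating external data services (e.g., the Model Context Protocol, MCP) has improved the performance of large language model-based agents and their ability to carry out complex tasks. This progress, however, introduces serious security vulnerabilities and in particular raises the risk of indirect prompt injection (IPI) attacks. Existing attack methods rely on static patterns and are evaluated only on simple language models, so they fail to reflect the rapidly changing nature of modern AI agents. This paper introduces AdapTools, a new adaptive IPI attack framework. AdapTools selects stealthier attack tools and generates adaptive attack prompts in order to build a rigorous security evaluation environment. The approach has two main components: (1) adaptive attack strategy construction, which develops transferable adversarial strategies for prompt optimization; and (2) attack enhancement, which identifies stealthy tools that can bypass task-relevance defenses. Comprehensive experiments show that AdapTools raises the attack success rate by a factor of 2.13 while degrading system utility by a factor of 1.78. Notably, the framework remains effective even against state-of-the-art defense mechanisms. This work improves the understanding of IPI attacks and provides a useful reference for future research.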

Original Abstract

The integration of external data services (e.g., Model Context Protocol, MCP) has made large language model-based agents increasingly powerful for complex task execution. However, this advancement introduces critical security vulnerabilities, particularly indirect prompt injection (IPI) attacks. Existing attack methods are limited by their reliance on static patterns and evaluation on simple language models, failing to address the fast-evolving nature of modern AI agents. We introduce AdapTools, a novel adaptive IPI attack framework that selects stealthier attack tools and generates adaptive attack prompts to create a rigorous security evaluation environment. Our approach comprises two key components: (1) Adaptive Attack Strategy Construction, which develops transferable adversarial strategies for prompt optimization, and (2) Attack Enhancement, which identifies stealthy tools capable of circumventing task-relevance defenses. Comprehensive experimental evaluation shows that AdapTools achieves a 2.13 times improvement in attack success rate while degrading system utility by a factor of 1.78. Notably, the framework maintains its effectiveness even against state-of-the-art defense mechanisms. Our method advances the understanding of IPI attacks and provides a useful reference for future research.
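To make the headline factors concrete, here is a minimal sketch of how an attack success rate (ASR) and a fold change between two conditions are commonly computed. The function names and all trial counts are hypothetical illustrations, not the paper's evaluation code or data, and the abstract does not say exactly how the 2.13 and 1.78 factors were derived. Reading "degrading utility by a factor of 1.78" as clean utility divided by utility under attack is likewise an assumption.

```python
def rate(successes: int, trials: int) -> float:
    """Fraction of trials that succeeded (e.g. ASR or task utility)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return successes / trials


def fold_change(numerator: float, denominator: float) -> float:
    """Ratio of two rates, e.g. new ASR divided by baseline ASR."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Hypothetical counts, chosen only to illustrate the arithmetic.
baseline_asr = rate(20, 100)   # a static-pattern attack succeeds in 20 of 100 runs
adaptive_asr = rate(50, 100)   # an adaptive attack succeeds in 50 of 100 runs
asr_gain = fold_change(adaptive_asr, baseline_asr)

clean_utility = rate(80, 100)     # benign tasks completed without any attack
attacked_utility = rate(40, 100)  # benign tasks completed while under attack
utility_drop = fold_change(clean_utility, attacked_utility)

print(f"ASR improvement: {asr_gain:.2f}x")
print(f"Utility degradation: {utility_drop:.2f}x")
```

Reporting the ratio rather than the raw difference makes results comparable across agents whose baseline success rates differ, at the cost of hiding the absolute rates, so both are worth checking in the paper's tables.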

6 Citations

0 Influential

11 Altmetric

61.0 Score
