2603.24203v1 Mar 25, 2026 cs.CR

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Geng Hong

Xu Pan

Min Yang

Yulin Shen

Abstract

Recent advances in the Model Context Protocol (MCP) have enabled large language models (LLMs) to invoke external tools with unprecedented ease, creating a new class of powerful, tool-augmented agents. Unfortunately, this capability also introduces an underexplored attack surface: the malicious manipulation of tool responses. Existing indirect prompt injection techniques that target MCP suffer from high deployment costs, weak semantic coherence, or heavy white-box requirements, and they are often easily detected by recently proposed defenses. In this paper, we propose Tree structured Injection for Payloads (TIP), a novel black-box attack that generates natural payloads to reliably seize control of MCP-enabled agents even under defense. Technically, we cast payload generation as a tree-structured search problem and guide the search with an attacker LLM operating under our proposed coarse-to-fine optimization framework. To stabilize learning and avoid local optima, we introduce a path-aware feedback mechanism that surfaces only high-quality historical trajectories to the attacker model. The framework is further hardened against defensive transformations by explicitly conditioning the search on observable defense signals and dynamically reallocating the exploration budget. Extensive experiments on four mainstream LLMs show that TIP attains over 95% attack success in undefended settings while requiring an order of magnitude fewer queries than prior adaptive attacks. Against four representative defense approaches, TIP preserves more than 50% effectiveness and significantly outperforms state-of-the-art attacks. By implementing the attack on real-world MCP systems, our results expose an invisible but practical threat vector in MCP deployments. We also discuss potential mitigation approaches to address this critical security gap.
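
The abstract names three search ingredients: a tree-structured search, a coarse-to-fine schedule, and path-aware feedback that exposes only high-quality past trajectories to the proposer. The sketch below illustrates those three ideas on a deliberately toy numeric objective. It is not the paper's implementation: there is no attacker LLM, agent, or payload involved, and every name in it (`Node`, `propose`, `tree_search`, the step-halving schedule, the beam width) is a hypothetical choice made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate in the search tree (here just a float; hypothetical)."""
    candidate: float
    score: float
    parent: Optional["Node"] = None

    def path(self) -> List["Node"]:
        """Return the root-to-self trajectory."""
        node, out = self, []
        while node is not None:
            out.append(node)
            node = node.parent
        return out[::-1]


def propose(x: float, step: float, history: List[List[Node]],
            branch: int, rng: random.Random) -> List[float]:
    """Toy proposer: perturb x, nudged by the last move of each good path."""
    moves = [p[-1].candidate - p[-2].candidate for p in history if len(p) >= 2]
    drift = sum(moves) / len(moves) if moves else 0.0
    return [x + 0.5 * drift + rng.uniform(-step, step) for _ in range(branch)]


def tree_search(score: Callable[[float], float], root: float, depth: int = 6,
                branch: int = 4, beam: int = 3, top_paths: int = 2,
                step0: float = 4.0, seed: int = 0) -> Node:
    rng = random.Random(seed)
    frontier = [Node(root, score(root))]
    for level in range(depth):
        # Coarse-to-fine (assumed schedule): halve the step at each level.
        step = step0 / (2 ** level)
        # Path-aware feedback: only the best trajectories reach the proposer.
        ranked = sorted(frontier, key=lambda n: n.score, reverse=True)
        history = [n.path() for n in ranked[:top_paths]]
        children = [
            Node(c, score(c), parent)
            for parent in frontier
            for c in propose(parent.candidate, step, history, branch, rng)
        ]
        # Keep parents in the pool so the best score never regresses.
        pool = sorted(children + frontier, key=lambda n: n.score, reverse=True)
        frontier = pool[:beam]
    return frontier[0]


def objective(x: float) -> float:
    """Toy objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2


best = tree_search(objective, root=0.0)
print(best.candidate, best.score, len(best.path()))
```

Keeping the parents in the candidate pool means the best score can only stay the same or improve from one level to the next, which is one simple way to get the stability the abstract asks of its feedback mechanism. The actual method may achieve this differently.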
