2601.07263v1 Jan 12, 2026 cs.CR

When Bots Take the Bait: Exposing and Mitigating the Emerging Social Engineering Attack in Web Automation Agent

Xinyi Wu

Citations: 10

h-index: 2

Geng Hong

Citations: 364

h-index: 8

Xu Pan

Citations: 127

h-index: 6

Jiarun Dai

Citations: 265

h-index: 8

Yueyue Chen

Citations: 6

h-index: 2

Mingxuan Liu

Citations: 32

h-index: 3

Feier Jin

Citations: 3

h-index: 1

Baojun Liu

Tsinghua University

Citations: 1,113

h-index: 21

Original Abstract

Web agents, powered by large language models (LLMs), are increasingly deployed to automate complex web interactions. The rise of open-source frameworks (e.g., Browser Use, Skyvern-AI) has accelerated adoption, but also broadened the attack surface. While prior research has focused on model threats such as prompt injection and backdoors, the risks of social engineering remain largely unexplored. We present the first systematic study of social engineering attacks against web automation agents and design a pluggable runtime mitigation solution. On the attack side, we introduce the AgentBait paradigm, which exploits intrinsic weaknesses in agent execution: inducement contexts can distort the agent's reasoning and steer it toward malicious objectives misaligned with the intended task. On the defense side, we propose SUPERVISOR, a lightweight runtime module that enforces environment and intention consistency alignment between webpage context and intended goals to mitigate unsafe operations before execution. Empirical results show that mainstream frameworks are highly vulnerable to AgentBait, with an average attack success rate of 67.5% and peaks above 80% under specific strategies (e.g., trusted identity forgery). Compared with existing lightweight defenses, our module can be seamlessly integrated across different web automation frameworks and reduces attack success rates by up to 78.1% on average while incurring only a 7.7% runtime overhead and preserving usability. This work reveals AgentBait as a critical new threat surface for web agents and establishes a practical, generalizable defense, advancing the security of this rapidly emerging ecosystem. We reported the details of this attack to the framework developers and received acknowledgment before submission.
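The abstract describes SUPERVISOR only at a high level: a runtime check that the webpage context and the proposed operation stay consistent with the intended goal before the operation executes. The following is a minimal sketch of that general idea, not the authors' implementation. Every name here (`Task`, `Action`, `review`, the domain and action allowlists) is a hypothetical stand-in, and a simple allowlist comparison is assumed in place of whatever consistency analysis the paper actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Task:
    """What the user asked the agent to do (hypothetical structure)."""
    goal: str
    allowed_domains: frozenset[str]   # assumption: sites the task may touch
    allowed_actions: frozenset[str]   # assumption: action kinds the task needs


@dataclass(frozen=True)
class Action:
    """One step the agent proposes to take on the current page."""
    kind: str       # e.g. "click", "type", "download"
    page_url: str   # URL of the page the action would run on


def environment_consistent(task: Task, action: Action) -> bool:
    """Is the page's host one of the task's domains or a subdomain of one?"""
    host = urlparse(action.page_url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in task.allowed_domains)


def intention_consistent(task: Task, action: Action) -> bool:
    """Is the proposed action kind one the task is expected to need?"""
    return action.kind in task.allowed_actions


def review(task: Task, action: Action) -> tuple[bool, str]:
    """Decide before execution whether the proposed action may proceed."""
    if not environment_consistent(task, action):
        return False, "environment mismatch"
    if not intention_consistent(task, action):
        return False, "intention mismatch"
    return True, "ok"


if __name__ == "__main__":
    task = Task(
        goal="buy a notebook",
        allowed_domains=frozenset({"example.com"}),
        allowed_actions=frozenset({"click", "type"}),
    )
    # A look-alike host that merely starts with the trusted name is rejected.
    print(review(task, Action("click", "https://example.com.evil.test/login")))
    # An action kind the task never called for is rejected as well.
    print(review(task, Action("download", "https://example.com/file")))
```

In a real agent framework, a guard of this kind would sit between the planner and the browser driver, so that a rejected action never reaches the page. The allowlists above are only the simplest possible stand-in for the consistency checks.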

3 Citations

0 Influential

10.5 Altmetric

55.5 Score
