2602.21127v1 Feb 24, 2026 cs.HC

"Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

Gelei Deng (citations: 80, h-index: 6)
Shenyu Dai (citations: 38, h-index: 2)
Kelong Zheng (citations: 13, h-index: 2)
Yue Xiao (citations: 208, h-index: 7)
Wei Dong (citations: 31, h-index: 3)
Xiaofeng Wang (citations: 54, h-index: 4)
Xinfeng Li (citations: 313, h-index: 10)

Abstract

Large language model (LLM) agents are rapidly becoming trusted copilots in high-stakes domains like software development and healthcare. However, this deepening trust introduces a novel attack surface: Agent-Mediated Deception (AMD), where compromised agents are weaponized against their human users. While extensive research focuses on agent-centric threats, human susceptibility to deception by a compromised agent remains unexplored. We present the first large-scale empirical study with 303 participants to measure human susceptibility to AMD. This is based on HAT-Lab (Human-Agent Trust Laboratory), a high-fidelity research platform we develop, featuring nine carefully crafted scenarios spanning everyday and professional domains (e.g., healthcare, software development, human resources). Our 10 key findings reveal significant vulnerabilities and provide future defense perspectives. Specifically, only 8.6% of participants perceive AMD attacks, while domain experts show increased susceptibility in certain scenarios. We identify six cognitive failure modes in users and find that their risk awareness often fails to translate to protective behavior. The defense analysis reveals that effective warnings should interrupt workflows with low verification costs. With experiential learning based on HAT-Lab, over 90% of users who perceive risks report increased caution against AMD. This work provides empirical evidence and a platform for human-centric agent security research.
