2603.02983v1 Mar 03, 2026 cs.CR

Contextualized Privacy Defense for LLM Agents

Yanzhe Zhang

Georgia Institute of Technology

Jianxun Lian

Xiaoyuan Yi

Xing Xie

Diyi Yang

Yule Wen

Abstract

LLM agents increasingly act on users' personal information, yet existing privacy defenses remain limited in both design and adaptability. Most prior approaches rely on static or passive defenses, such as prompting and guarding. These paradigms are insufficient for supporting contextual, proactive privacy decisions in multi-step agent execution. We propose Contextualized Defense Instructing (CDI), a new privacy defense paradigm in which an instructor model generates step-specific, context-aware privacy guidance during execution, proactively shaping actions rather than merely constraining or vetoing them. Crucially, CDI is paired with an experience-driven optimization framework that trains the instructor via reinforcement learning (RL), where we convert failure trajectories with privacy violations into learning environments. We formalize baseline defenses and CDI as distinct intervention points in a canonical agent loop, and compare their privacy-helpfulness trade-offs within a unified simulation framework. Results show that our CDI consistently achieves a better balance between privacy preservation (94.2%) and helpfulness (80.6%) than baselines, with superior robustness to adversarial conditions and generalization.
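
The three intervention points named in the abstract (prompting, guarding, and step-specific instructing) can be pictured with a toy agent loop. The sketch below is only an illustration under loud assumptions: `run_agent`, `toy_agent`, `toy_instructor`, and `toy_guard` are hypothetical names, the "agent" and "instructor" are hard-coded stubs rather than LLMs, and nothing here reproduces the authors' implementation or their RL training procedure.

```python
from typing import Callable, List, Optional

# Hypothetical type aliases, introduced only for this illustration.
Agent = Callable[[str, List[str]], str]       # (prompt, history) -> proposed action
Instructor = Callable[[str, List[str]], str]  # (task, history) -> step-specific guidance
Guard = Callable[[str], bool]                 # action -> True if the action may run


def run_agent(
    task: str,
    agent: Agent,
    steps: int,
    static_prompt: str = "",
    instructor: Optional[Instructor] = None,
    guard: Optional[Guard] = None,
) -> List[str]:
    """Run a toy agent loop with three optional defense hooks."""
    history: List[str] = []
    for _ in range(steps):
        prompt = task
        if static_prompt:
            # Prompting: the same fixed text is prepended at every step.
            prompt = static_prompt + "\n" + prompt
        if instructor is not None:
            # Instructing (CDI-style): guidance is generated per step from
            # the current context, before the agent proposes an action.
            prompt = prompt + "\nGuidance: " + instructor(task, history)
        action = agent(prompt, history)
        if guard is not None and not guard(action):
            # Guarding: the action is vetoed after it has been proposed.
            action = "BLOCKED"
        history.append(action)
    return history


def toy_agent(prompt: str, history: List[str]) -> str:
    # Stub standing in for an LLM: leaks the SSN unless told to withhold it.
    if "withhold the SSN" in prompt:
        return "send(name)"
    return "send(name, ssn)"


def toy_instructor(task: str, history: List[str]) -> str:
    # Stub standing in for a trained instructor model.
    return "withhold the SSN; the recipient only needs the name"


def toy_guard(action: str) -> bool:
    # Stub guard: reject any action that mentions the SSN.
    return "ssn" not in action


undefended = run_agent("Email Bob", toy_agent, steps=1)
guarded = run_agent("Email Bob", toy_agent, steps=1, guard=toy_guard)
instructed = run_agent("Email Bob", toy_agent, steps=1, instructor=toy_instructor)
```

In this toy setting the guard protects privacy only by blocking the step outright, which costs helpfulness, while the instructor steers the agent toward an action that completes the task without the sensitive field. That mirrors the privacy-helpfulness trade-off the abstract describes, but it says nothing about how the real systems behave.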
