2604.01350v1 Apr 01, 2026 cs.CL

No Attacker Needed: Unintentional Cross-User Contamination in Shared-State LLM Agents

Ruiyao Xu

Citations: 73

h-index: 3

Yi Nian

Citations: 113

h-index: 6

Ryan A. Rossi

Citations: 140

h-index: 6

Tiankai Yang

University of Southern California

Citations: 150

h-index: 7

Jiate Li

Citations: 6

h-index: 1

Shen Dong

Citations: 115

h-index: 3

Kaize Ding

Citations: 137

h-index: 6

Yue Zhao

Citations: 24

h-index: 3

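LLM-based agents increasingly operate across repeated sessions, maintaining task state to ensure continuity. In many deployments, a single agent serves multiple users within a team or organization and reuses a shared knowledge layer across user identities. This shared persistence widens the failure surface: information that is valid for one user can degrade another user's results when the agent reuses it without considering its scope. We call this failure mode unintentional cross-user contamination (UCC). Unlike adversarial memory poisoning, UCC requires no attacker; it arises from benign interactions in which scope-bound information persists and is later misapplied. We formalize UCC through a controlled evaluation protocol, present a taxonomy of three contamination types, and evaluate the problem under two shared-state mechanisms. With raw shared state, benign interactions alone produced contamination rates of 57 to 71%. Write-time sanitization is effective when the shared state is conversational, but substantial residual risk remains when the shared state includes executable artifacts, and contamination often surfaces as wrong answers that go unnoticed. These results suggest that shared-state agents need artifact-level defenses in addition to text-level sanitization to prevent cross-user failures that users do not notice.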

Original Abstract

LLM-based agents increasingly operate across repeated sessions, maintaining task states to ensure continuity. In many deployments, a single agent serves multiple users within a team or organization, reusing a shared knowledge layer across user identities. This shared persistence expands the failure surface: information that is locally valid for one user can silently degrade another user's outcome when the agent reapplies it without regard for scope. We refer to this failure mode as unintentional cross-user contamination (UCC). Unlike adversarial memory poisoning, UCC requires no attacker; it arises from benign interactions whose scope-bound artifacts persist and are later misapplied. We formalize UCC through a controlled evaluation protocol, introduce a taxonomy of three contamination types, and evaluate the problem in two shared-state mechanisms. Under raw shared state, benign interactions alone produce contamination rates of 57–71%. A write-time sanitization is effective when shared state is conversational, but leaves substantial residual risk when shared state includes executable artifacts, with contamination often manifesting as silent wrong answers. These results indicate that shared-state agents need artifact-level defenses beyond text-level sanitization to prevent silent cross-user failures.
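The failure mode described in the abstract can be illustrated with a toy shared memory. The sketch below is not the paper's protocol or its write-time sanitizer; the `Note` and `SharedMemory` classes, the `scope` field, and the example users are all hypothetical names introduced here. It only shows how a benign, user-specific note can reach another user when the store is read without regard for scope, and how a simple scope filter (a stand-in technique, applied at read time) would keep it out.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    author: str          # user whose session produced the note
    text: str            # content persisted in the shared layer
    scope: str = "shared"  # hypothetical tag: "shared" or "user"


@dataclass
class SharedMemory:
    notes: list = field(default_factory=list)

    def write(self, author: str, text: str, scope: str = "shared") -> None:
        # Persist a note together with its scope tag.
        self.notes.append(Note(author, text, scope))

    def read_raw(self, user: str) -> list:
        # Raw shared state: every note is visible to every user.
        return [n.text for n in self.notes]

    def read_scoped(self, user: str) -> list:
        # Scope-aware read: user-bound notes are only returned to their author.
        return [
            n.text
            for n in self.notes
            if n.scope == "shared" or n.author == user
        ]


memory = SharedMemory()
# A benign preference that is only valid for alice.
memory.write("alice", "Report revenue in EUR", scope="user")
# A fact that is valid for the whole team.
memory.write("bob", "Fiscal year starts in April", scope="shared")

raw_for_bob = memory.read_raw("bob")        # includes alice's preference
scoped_for_bob = memory.read_scoped("bob")  # excludes alice's preference
```

In this toy setup no user acts maliciously: alice's note is correct for alice, and the contamination comes only from reusing it for bob. Assigning the `scope` tag correctly is the hard part in practice and is assumed away here.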

