2602.13516v1 Feb 13, 2026 cs.AI

SPILLage: Agentic Oversharing on the Web

Eugene Bagdasarian

Citations: 54

h-index: 4

Jaechul Roh

Citations: 152

h-index: 6

Hamed Haddadi

Citations: 52

h-index: 4

A. Shamsabadi

Citations: 1,674

h-index: 21

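Agents built on large language models (LLMs) automate users' tasks, often with access to user resources such as email and calendars. Unlike a conventional LLM answering questions in a confined chatbot setting, web agents interact with many services in real environments and leave an activity trace behind. This study asks how web agents handle user resources when they carry out tasks on websites on the user's behalf. The paper defines Natural Agentic Oversharing: the unintended exposure of task-irrelevant user information through the trace of an agent's web activity. It introduces SPILLage, a framework that classifies oversharing along two dimensions, channel (content vs. behavior) and directness (explicit vs. implicit). This classification exposes an important overlooked aspect: prior work has focused mainly on text leakage, but web agents also overshare behaviorally, through clicks, scrolls, and navigation patterns that can be monitored. The authors benchmark 180 tasks on live e-commerce sites and build ground-truth data that separates task-relevant from task-irrelevant attributes. Across 1,080 runs with two agent frameworks and three LLMs, they find that oversharing is widespread and that behavioral oversharing occurs five times as often as content oversharing. The effect persists, and can worsen, despite prompt-level mitigation. However, removing task-irrelevant information before execution raises task success by up to 17.9%, showing that reducing oversharing also improves task success. The results underline that privacy protection in web agents is a fundamental challenge, one that calls for a comprehensive approach covering everything an agent does on the web, not only its text output. The dataset and code are available at https://github.com/jrohsc/SPILLage.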

Original Abstract

LLM-powered agents are beginning to automate users' tasks across the open web, often with access to user resources such as emails and calendars. Unlike standard LLMs answering questions in a controlled chatbot setting, web agents act "in the wild", interacting with third parties and leaving behind an action trace. We therefore ask: how do web agents handle user resources when accomplishing tasks on users' behalf across live websites? In this paper, we formalize Natural Agentic Oversharing: the unintentional disclosure of task-irrelevant user information through an agent's trace of actions on the web. We introduce SPILLage, a framework that characterizes oversharing along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit). This taxonomy reveals a critical blind spot: while prior work focuses on text leakage, web agents also overshare behaviorally through clicks, scrolls, and navigation patterns that can be monitored. We benchmark 180 tasks on live e-commerce sites with ground-truth annotations separating task-relevant from task-irrelevant attributes. Across 1,080 runs spanning two agentic frameworks and three backbone LLMs, we demonstrate that oversharing is pervasive, with behavioral oversharing dominating content oversharing by 5x. This effect persists, and can even worsen, under prompt-level mitigation. However, removing task-irrelevant information before execution improves task success by up to 17.9%, demonstrating that reducing oversharing also benefits task performance. Our findings underscore that protecting privacy in web agents is a fundamental challenge, requiring a broader view of "output" that accounts for what agents do on the web, not just what they type. Our datasets and code are available at https://github.com/jrohsc/SPILLage.
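
The two-dimensional taxonomy (channel and directness) can be pictured as a 2x2 grid over the events in an agent's trace. The sketch below is a minimal illustration and is not taken from the SPILLage codebase. The names `TraceEvent` and `count_oversharing`, the attribute labels, and the reading of "explicit" and "implicit" given in the comments are all assumptions made for the example. It assumes each event has already been annotated with the user attribute it reveals.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    CONTENT = "content"    # what the agent types (e.g. search queries, form fields)
    BEHAVIOR = "behavior"  # what the agent does (e.g. clicks, scrolls, navigation)


class Directness(Enum):
    # Assumed reading: explicit = the attribute is stated outright,
    # implicit = the attribute can only be inferred from the action.
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"


@dataclass(frozen=True)
class TraceEvent:
    """One annotated step in a hypothetical agent trace."""
    channel: Channel
    directness: Directness
    attribute: str  # the user attribute this step reveals


def count_oversharing(trace, task_relevant):
    """Count events that reveal task-irrelevant attributes, per taxonomy cell."""
    counts = Counter()
    for event in trace:
        if event.attribute not in task_relevant:
            counts[(event.channel, event.directness)] += 1
    return counts


# Hypothetical shopping task in which only the shoe size is needed.
task_relevant = {"shoe_size"}
trace = [
    TraceEvent(Channel.CONTENT, Directness.EXPLICIT, "shoe_size"),
    TraceEvent(Channel.CONTENT, Directness.EXPLICIT, "health_condition"),
    TraceEvent(Channel.BEHAVIOR, Directness.EXPLICIT, "health_condition"),
    TraceEvent(Channel.BEHAVIOR, Directness.IMPLICIT, "income"),
]
counts = count_oversharing(trace, task_relevant)

# The reported run count matches one run per (task, framework, LLM) combination.
# This is an inference from the numbers, not something the abstract states.
total_runs = 180 * 2 * 3
```

In this toy trace the task-relevant shoe size is not counted, while the three remaining events fall into three different cells of the grid. Keeping the counts per cell is what makes it possible to compare behavioral and content oversharing, as the abstract does.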

2 Citations

1 Influential

30.5 Altmetric

156.5 Score
