2601.18842v2 Jan 26, 2026 cs.CR

GUIGuard: Toward a General Framework for Privacy-Preserving GUI Agents

Wenbo Zhou (Citations: 3,628; h-index: 24)

Yanxi Wang (Citations: 16; h-index: 2)

Zhiling Zhang (Citations: 10; h-index: 2)

Weiming Zhang (Citations: 59; h-index: 5)

Jie Zhang (Citations: 7; h-index: 2)

Qiannan Zhu (Citations: 16; h-index: 2)

Yu Shi (Citations: 9; h-index: 2)

Shuxin Zheng (Citations: 68; h-index: 4)

Jiyan He, University of Science and Technology of China (Citations: 492; h-index: 8)

Original Abstract

GUI agents enable end-to-end automation through direct perception of and interaction with on-screen interfaces. However, these agents frequently access interfaces containing sensitive personal information, and screenshots are often transmitted to remote models, creating substantial privacy risks. These risks are particularly severe in GUI workflows: GUIs expose richer, more accessible private information, and privacy risks depend on interaction trajectories across sequential scenes. We propose GUIGuard, a three-stage framework for privacy-preserving GUI agents: (1) privacy recognition, (2) privacy protection, and (3) task execution under protection. We further construct GUIGuard-Bench, a cross-platform benchmark with 630 trajectories and 13,830 screenshots, annotated with region-level privacy grounding and fine-grained labels of risk level, privacy category, and task necessity. Evaluations reveal that existing agents exhibit limited privacy recognition, with state-of-the-art models achieving only 13.3% accuracy on Android and 1.4% on PC. Under privacy protection, task-planning semantics can still be maintained, with closed-source models showing stronger semantic consistency than open-source ones. Case studies on MobileWorld show that carefully designed protection strategies achieve higher task accuracy while preserving privacy. Our results highlight privacy recognition as a critical bottleneck for practical GUI agents. Project: https://futuresis.github.io/GUIGuard-page/
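
The three stages named in the abstract (privacy recognition, privacy protection, task execution under protection) can be pictured with a toy sketch. Everything below is an illustrative assumption rather than the authors' implementation: the `PrivacyRegion` fields only mirror the label types the abstract lists (region, risk level, privacy category, task necessity), the numeric risk scale and the masking rule are invented for the example, and the screenshot is a small grid of integers instead of a real image.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # toy grayscale screenshot: rows of pixel values


@dataclass(frozen=True)
class PrivacyRegion:
    """Hypothetical region-level annotation (field names are assumptions)."""
    left: int
    top: int
    right: int            # exclusive
    bottom: int           # exclusive
    category: str         # illustrative value, e.g. "phone_number"
    risk_level: int       # assumed scale: 0 = low, 2 = high
    task_necessary: bool  # does the agent need this content for the task?


def recognize(image: Image, candidates: List[PrivacyRegion]) -> List[PrivacyRegion]:
    """Stage 1 stand-in: a real system would run a model on the screenshot.
    Here we only keep candidate regions that fit inside the image."""
    height = len(image)
    width = len(image[0]) if image else 0
    return [
        r for r in candidates
        if 0 <= r.left < r.right <= width and 0 <= r.top < r.bottom <= height
    ]


def protect(image: Image, regions: List[PrivacyRegion],
            min_risk: int = 1, fill: int = 0) -> Image:
    """Stage 2: return a copy with risky, task-unnecessary regions masked."""
    out = [row[:] for row in image]  # copy so the original stays untouched
    for r in regions:
        if r.risk_level >= min_risk and not r.task_necessary:
            for y in range(r.top, r.bottom):
                for x in range(r.left, r.right):
                    out[y][x] = fill
    return out


def execute_task(protected: Image) -> int:
    """Stage 3 stand-in: the remote agent sees only the protected image.
    We return the number of unmasked (non-zero) pixels it could observe."""
    return sum(1 for row in protected for px in row if px != 0)


screenshot = [[9] * 4 for _ in range(3)]
regions = [
    PrivacyRegion(0, 0, 2, 1, "phone_number", 2, False),   # masked
    PrivacyRegion(2, 1, 4, 2, "recipient_name", 2, True),  # kept: task needs it
]
protected = protect(screenshot, recognize(screenshot, regions))
visible = execute_task(protected)
```

In this sketch the high-risk phone number is masked because the task does not need it, while the equally high-risk recipient name stays visible because it is marked as necessary for the task. That is one plausible reading of why the benchmark labels task necessity alongside risk level.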

4 Citations

0 Influential

12 Altmetric

64.0 Score
