2601.03294v1 Jan 05, 2026 cs.CR

AgentMark: Utility-Preserving Behavioral Watermarking for Agents

Zhongliang Yang

Linna Zhou

Kaibo Huang

Jin Tan

Yukun Wei

Wanling Li

Hui Tian

Zipei Zhang

Abstract

LLM-based agents are increasingly deployed to autonomously solve complex tasks, raising urgent needs for IP protection and regulatory provenance. While content watermarking effectively attributes LLM-generated outputs, it fails to directly identify the high-level planning behaviors (e.g., tool and subgoal choices) that govern multi-step execution. Critically, watermarking at the planning-behavior layer faces unique challenges: minor distributional deviations in decision-making can compound during long-term agent operation, degrading utility, and many agents operate as black boxes that are difficult to intervene in directly. To bridge this gap, we propose AgentMark, a behavioral watermarking framework that embeds multi-bit identifiers into planning decisions while preserving utility. It operates by eliciting an explicit behavior distribution from the agent and applying distribution-preserving conditional sampling, enabling deployment under black-box APIs while remaining compatible with action-layer content watermarking. Experiments across embodied, tool-use, and social environments demonstrate practical multi-bit capacity, robust recovery from partial logs, and utility preservation. The code is available at https://github.com/Tooooa/AgentMark.
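The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one generic way to embed bits into discrete choices without changing their distribution. It is not taken from the AgentMark repository. It assumes the per-step behavior distributions have already been elicited from the agent, and that embedder and verifier share a secret key and see the same distributions. All names (`keyed_uniform`, `pick`, `embed`, `extract`) are hypothetical. The idea: a keyed pseudorandom number `u` drives ordinary inverse-CDF sampling, and a payload bit of 1 shifts `u` by one half. Because `u` is uniform, the shifted value is uniform too, so each action is still drawn from the agent's own distribution.

```python
import hashlib
import hmac
from bisect import bisect_right
from itertools import accumulate


def keyed_uniform(key: bytes, step: int) -> float:
    """Pseudorandom number in [0, 1) derived from the shared key and step index."""
    digest = hmac.new(key, step.to_bytes(8, "big"), hashlib.sha256).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def pick(probs, u: float) -> int:
    """Inverse-CDF sampling: index of the action whose interval contains u."""
    cdf = list(accumulate(probs))
    return min(bisect_right(cdf, u), len(probs) - 1)


def candidates(key: bytes, step: int, probs):
    """Actions selected by the unshifted and the half-shifted random number."""
    u = keyed_uniform(key, step)
    return pick(probs, u), pick(probs, (u + 0.5) % 1.0)


def embed(key: bytes, dists, bits):
    """Choose one action per step; return the actions and the number of bits embedded."""
    actions, used = [], 0
    for step, probs in enumerate(dists):
        a0, a1 = candidates(key, step, probs)
        if a0 != a1 and used < len(bits):
            # The two candidates differ, so this step can carry one bit.
            actions.append(a1 if bits[used] else a0)
            used += 1
        else:
            # Candidates coincide (or payload is exhausted): nothing is embedded.
            actions.append(a0)
    return actions, used


def extract(key: bytes, dists, actions):
    """Recover bits from logged actions, given the same key and distributions."""
    bits = []
    for step, (probs, action) in enumerate(zip(dists, actions)):
        a0, a1 = candidates(key, step, probs)
        if a0 != a1:
            bits.append(1 if action == a1 else 0)
    return bits


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical shared secret
    # Hypothetical elicited distributions over three tools, one per planning step.
    dists = [[0.5, 0.3, 0.2], [0.98, 0.01, 0.01], [0.4, 0.4, 0.2]] * 10
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    actions, used = embed(key, dists, payload)
    assert extract(key, dists, actions)[:used] == payload[:used]
```

The verifier must know the payload length, since steps after the payload ends still decode as bits. Steps with a near-deterministic distribution often carry no bit, so capacity depends on how much uncertainty the agent's choices have. This sketch does not handle missing or reordered log entries, which the abstract's partial-log recovery claim would require.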

