2603.05310v1 Mar 05, 2026 cs.SD

Latent-Mark: An Audio Watermark Robust to Neural Resynthesis

Yi-Cheng Lin

National Taiwan University

Yen-Shan Chen

Shih-Yu Lai

Ying-Jung Tsou

Yun-Nung Chen

Hung-yi Lee

Shang-Tse Chen

Bingwei Chen

Original Abstract

While existing audio watermarking techniques have achieved strong robustness against traditional digital signal processing (DSP) attacks, they remain vulnerable to neural resynthesis. This occurs because modern neural audio codecs act as semantic filters and discard the imperceptible waveform variations used in prior watermarking methods. To address this limitation, we propose Latent-Mark, the first zero-bit audio watermarking framework designed to survive semantic compression. Our key insight is that robustness to the encode-decode process requires embedding the watermark within the codec's invariant latent space. We achieve this by optimizing the audio waveform to induce a detectable directional shift in its encoded latent representation, while constraining perturbations to align with the natural audio manifold to ensure imperceptibility. To prevent overfitting to a single codec's quantization rules, we introduce Cross-Codec Optimization, jointly optimizing the waveform across multiple surrogate codecs to target shared latent invariants. Extensive evaluations demonstrate robust zero-shot transferability to unseen neural codecs, achieving state-of-the-art resilience against traditional DSP attacks while preserving perceptual imperceptibility. Our work inspires future research into universal watermarking frameworks capable of maintaining integrity across increasingly complex and diverse generative distortions.
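The embedding idea in the abstract (optimize the waveform so that its encoded latent shifts along a detectable direction, jointly across several surrogate codecs) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "codecs" are small random `tanh` encoders rather than real neural audio codecs, the secret latent directions are random unit vectors, and a simple amplitude bound `eps` stands in for the paper's constraint that perturbations follow the natural audio manifold. Because the toy encoders are independent random maps, the sketch shows only the joint optimization loop, not zero-shot transfer to unseen codecs.

```python
import math
import random

def make_codec(rng, latent_dim, n_samples):
    """Hypothetical surrogate codec: encoder weights plus a secret unit direction."""
    scale = 1.0 / math.sqrt(n_samples)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_samples)]
               for _ in range(latent_dim)]
    raw = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    norm = math.sqrt(sum(c * c for c in raw))
    direction = [c / norm for c in raw]
    return weights, direction

def encode(weights, x):
    """Toy encoder z = tanh(W x); a stand-in for a real codec encoder."""
    return [math.tanh(sum(w * s for w, s in zip(row, x))) for row in weights]

def score(codecs, x):
    """Mean projection of each codec's latent onto its secret direction."""
    total = 0.0
    for weights, direction in codecs:
        z = encode(weights, x)
        total += sum(zi * di for zi, di in zip(z, direction))
    return total / len(codecs)

def score_gradient(codecs, x):
    """Analytic gradient of score() with respect to the waveform samples."""
    grad = [0.0] * len(x)
    for weights, direction in codecs:
        z = encode(weights, x)
        for row, zi, di in zip(weights, z, direction):
            # d tanh(u)/du = 1 - tanh(u)^2, and du/dx_j = row[j]
            coeff = di * (1.0 - zi * zi) / len(codecs)
            for j, w in enumerate(row):
                grad[j] += coeff * w
    return grad

def embed(codecs, x, eps=0.01, step=0.002, iters=20):
    """Projected sign-gradient ascent on the joint score, with |delta| <= eps."""
    delta = [0.0] * len(x)
    for _ in range(iters):
        marked = [s + d for s, d in zip(x, delta)]
        grad = score_gradient(codecs, marked)
        for j, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            # Clip each sample's perturbation back into [-eps, eps].
            delta[j] = max(-eps, min(eps, delta[j] + step * sign))
    return delta

rng = random.Random(0)
n_samples, latent_dim = 64, 8
# Three surrogate codecs optimized jointly (the cross-codec part of the sketch).
codecs = [make_codec(rng, latent_dim, n_samples) for _ in range(3)]
audio = [0.5 * math.sin(2.0 * math.pi * 3.0 * t / n_samples)
         for t in range(n_samples)]

delta = embed(codecs, audio)
watermarked = [s + d for s, d in zip(audio, delta)]
before = score(codecs, audio)
after = score(codecs, watermarked)
print(f"score before: {before:.4f}, after: {after:.4f}")
```

In a zero-bit setting, a detector along these lines would only decide whether a watermark is present, for example by comparing `score` against a threshold calibrated on unmarked audio. That threshold and the projection-based score are assumptions of this sketch, since the abstract does not specify the detection rule.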
