2604.07763v1 Apr 09, 2026 cs.CV

Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities

Chuancheng Shi (Citations: 25, h-index: 3)

Jingtong Dou (Citations: 16, h-index: 2)

Fei Shen (Citations: 29, h-index: 3)

T. Chua (Citations: 33, h-index: 3)

Jian Wang (Citations: 3, h-index: 1)

Zhiyong Wang (Citations: 72, h-index: 3)

Original Abstract

As generative artificial intelligence evolves, deepfake attacks have escalated from single-modality manipulations to complex, multimodal threats. Existing forensic techniques face a severe generalization bottleneck: by relying excessively on superficial, modality-specific artifacts, they neglect the shared latent forgery knowledge hidden beneath variable physical appearances. Consequently, these models suffer catastrophic performance degradation when confronted with unseen "dark modalities." To break this limitation, this paper introduces a paradigm shift that redefines multimodal forensics from conventional "feature fusion" to "modality generalization." We propose the first modality-agnostic forgery (MAF) detection framework. By explicitly decoupling modality-specific styles, MAF precisely extracts the essential, cross-modal latent forgery knowledge. Furthermore, we define two progressive dimensions to quantify model generalization: transferability toward semantically correlated modalities (Weak MAF), and robustness against completely isolated signals of "dark modality" (Strong MAF). To rigorously assess these generalization limits, we introduce the DeepModal-Bench benchmark, which integrates diverse multimodal forgery detection algorithms and adapts state-of-the-art generalized learning methods. This study not only empirically proves the existence of universal forgery traces but also achieves significant performance breakthroughs on unknown modalities via the MAF framework, offering a pioneering technical pathway for universal multimodal defense.
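The two generalization settings named in the abstract can be illustrated with a held-out-modality split. The following is only a minimal sketch under stated assumptions, not the paper's protocol or the DeepModal-Bench API: the function names, the sample tuple layout, and the `correlated` set that declares which modality pairs count as "semantically correlated" are all hypothetical and chosen purely for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical sample layout: (sample_id, modality, label), where label 1 = forged.
Sample = Tuple[str, str, int]


def split_by_modality(
    samples: List[Sample], held_out: str
) -> Tuple[List[Sample], List[Sample]]:
    """Train on every modality except `held_out`; test only on `held_out`."""
    train = [s for s in samples if s[1] != held_out]
    test = [s for s in samples if s[1] == held_out]
    return train, test


def maf_setting(
    train_modalities: Set[str],
    held_out: str,
    correlated: Set[FrozenSet[str]],
) -> str:
    """Label a split 'weak' or 'strong' in the sense sketched in the abstract.

    'weak'   : the unseen modality is declared correlated with at least one
               training modality.
    'strong' : the unseen modality is isolated from all training modalities.
    The `correlated` pairs are an assumption supplied by the caller.
    """
    if held_out in train_modalities:
        raise ValueError("held-out modality must not appear in training data")
    related = any(
        frozenset({held_out, m}) in correlated for m in train_modalities
    )
    return "weak" if related else "strong"


# Toy data; the modality names and the correlation below are illustrative only.
samples: List[Sample] = [
    ("a", "image", 0),
    ("b", "image", 1),
    ("c", "video", 1),
    ("d", "audio", 0),
]
correlated = {frozenset({"image", "video"})}

train, test = split_by_modality(samples, held_out="audio")
train_modalities = {s[1] for s in train}
setting = maf_setting(train_modalities, "audio", correlated)
```

In this toy setup, holding out `audio` leaves only `image` and `video` for training, and because no declared pair links `audio` to either of them, the split falls under the strong setting. Holding out `video` while training on `image` alone would instead count as the weak setting under the same assumed correlation.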
