2602.17645v1 Feb 19, 2026 cs.LG

Pushing the Frontier of Black-Box LVLM Attacks via Fine-Grained Detail Targeting

Xiaohan Zhao

Zhaoyi Li

Zhiqiang Shen

Yaxin Luo

Jiacheng Cui

Abstract

Black-box adversarial attacks on Large Vision-Language Models (LVLMs) are challenging due to missing gradients and complex multimodal boundaries. While prior state-of-the-art transfer-based approaches like M-Attack perform well using local crop-level matching between source and target images, we find this induces high-variance, nearly orthogonal gradients across iterations, violating coherent local alignment and destabilizing optimization. We attribute this to (i) ViT translation sensitivity that yields spike-like gradients and (ii) structural asymmetry between source and target crops. We reformulate local matching as an asymmetric expectation over source transformations and target semantics, and build a gradient-denoising upgrade to M-Attack. On the source side, Multi-Crop Alignment (MCA) averages gradients from multiple independently sampled local views per iteration to reduce variance. On the target side, Auxiliary Target Alignment (ATA) replaces aggressive target augmentation with a small auxiliary set from a semantically correlated distribution, producing a smoother, lower-variance target manifold. We further reinterpret momentum as Patch Momentum, replaying historical crop gradients; combined with a refined patch-size ensemble (PE+), this strengthens transferable directions. Together these modules form M-Attack-V2, a simple, modular enhancement over M-Attack that substantially improves transfer-based black-box attacks on frontier LVLMs: boosting success rates on Claude-4.0 from 8% to 30%, Gemini-2.5-Pro from 83% to 97%, and GPT-5 from 98% to 100%, outperforming prior black-box LVLM attacks. Code and data are publicly available at: https://github.com/vila-lab/M-Attack-V2.
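
The variance-reduction idea behind Multi-Crop Alignment can be illustrated with a toy sketch. This is not the authors' implementation: it replaces real crop gradients with synthetic vectors (a shared direction plus independent Gaussian noise), and every name and parameter here (`noisy_gradient`, `multi_crop_gradient`, `mean_consecutive_cosine`, `noise_scale`, `num_crops`) is hypothetical. Under those assumptions, it shows how averaging several independently sampled views per iteration makes consecutive gradient estimates less orthogonal.

```python
import numpy as np


def noisy_gradient(true_grad, noise_scale, rng):
    """Stand-in for one crop's gradient: shared signal plus independent noise.

    Assumption: real crop gradients are modeled here as i.i.d. Gaussian
    perturbations of a common direction; the paper's gradients come from a
    surrogate vision encoder, which this toy does not reproduce.
    """
    return true_grad + noise_scale * rng.standard_normal(true_grad.shape)


def multi_crop_gradient(true_grad, num_crops, noise_scale, rng):
    """Average the gradients of num_crops independently sampled views."""
    grads = [noisy_gradient(true_grad, noise_scale, rng) for _ in range(num_crops)]
    return np.mean(grads, axis=0)


def cosine(a, b):
    """Cosine similarity between two flat vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_consecutive_cosine(num_crops, steps=200, dim=1000, noise_scale=5.0, seed=0):
    """Mean cosine similarity between gradient estimates of adjacent iterations."""
    rng = np.random.default_rng(seed)
    true_grad = rng.standard_normal(dim)  # fixed underlying direction
    prev = multi_crop_gradient(true_grad, num_crops, noise_scale, rng)
    sims = []
    for _ in range(steps):
        cur = multi_crop_gradient(true_grad, num_crops, noise_scale, rng)
        sims.append(cosine(prev, cur))
        prev = cur
    return float(np.mean(sims))


if __name__ == "__main__":
    for k in (1, 4, 16):
        print(f"crops per iteration = {k:2d}, "
              f"mean consecutive cosine = {mean_consecutive_cosine(k):.3f}")
```

In this toy model, a single view per iteration yields nearly orthogonal consecutive estimates, while averaging more views raises their agreement, because the noise variance of the mean shrinks in proportion to the number of independent views.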
