2604.07802v1 Apr 09, 2026 cs.CV

Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models

Chuancheng Shi (Citations: 7, h-index: 1)
Wenhua Wu (Citations: 6, h-index: 1)
Fei Shen (Citations: 6, h-index: 1)
Shangze Li (Citations: 41, h-index: 3)
T. Chua (Citations: 12, h-index: 3)
Shaotian Li (Citations: 12, h-index: 2)
Yanqi Wu (Citations: 22, h-index: 2)
Xiaohan Yu (Citations: 0, h-index: 0)

Abstract

Large-scale vision-language models (VLMs) exhibit remarkable zero-shot capabilities, yet the internal mechanisms driving their anomaly detection (AD) performance remain poorly understood. Current methods predominantly treat VLMs as black-box feature extractors, assuming that anomaly-specific knowledge must be acquired through external adapters or memory banks. In this paper, we challenge this assumption by arguing that anomaly knowledge is intrinsically embedded within pre-trained models but remains latent and under-activated. We hypothesize that this knowledge is concentrated within a sparse subset of anomaly-sensitive neurons. To validate this, we propose latent anomaly knowledge excavation (LAKE), a training-free framework that identifies and elicits these critical neuronal signals using only a minimal set of normal samples. By isolating these sensitive neurons, LAKE constructs a highly compact normality representation that integrates visual structural deviations with cross-modal semantic activations. Extensive experiments on industrial AD benchmarks demonstrate that LAKE achieves state-of-the-art performance while providing intrinsic, neuron-level interpretability. Ultimately, our work advocates for a paradigm shift: redefining anomaly detection as the targeted activation of latent pre-trained knowledge rather than the acquisition of a downstream task.
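
The abstract does not say how the anomaly-sensitive neurons are chosen or how the normality representation is scored, so the following is only an illustrative sketch of the general idea, not LAKE's actual procedure. It assumes a hypothetical selection rule (keep the neurons whose activations vary least across a handful of normal samples), a hypothetical score (mean absolute z-score on those neurons), and synthetic activations in place of real VLM features. All function names are invented for the example.

```python
import numpy as np


def select_sensitive_neurons(normal_acts: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k neurons with the lowest variance on normal data.

    ASSUMPTION: low variance on normal samples is a stand-in criterion;
    the paper's real selection rule is not given in the abstract.
    normal_acts has shape (n_samples, n_neurons).
    """
    variances = normal_acts.var(axis=0)
    return np.argsort(variances)[:k]


def fit_normality(normal_acts: np.ndarray, idx: np.ndarray):
    """Summarize normal behaviour on the selected neurons as (mean, std)."""
    sub = normal_acts[:, idx]
    # Small constant avoids division by zero for near-constant neurons.
    return sub.mean(axis=0), sub.std(axis=0) + 1e-6


def anomaly_score(acts: np.ndarray, idx: np.ndarray,
                  mean: np.ndarray, std: np.ndarray) -> float:
    """Mean absolute z-score of one activation vector on the selected neurons."""
    z = (acts[idx] - mean) / std
    return float(np.abs(z).mean())


# Synthetic stand-in for VLM activations: 8 normal samples, 512 neurons,
# each neuron with its own noise level.
rng = np.random.default_rng(0)
n_neurons, k = 512, 32
noise_scale = rng.uniform(0.05, 1.0, size=n_neurons)
normal_acts = rng.normal(size=(8, n_neurons)) * noise_scale

idx = select_sensitive_neurons(normal_acts, k)
mean, std = fit_normality(normal_acts, idx)

# A held-out normal sample and a shifted copy acting as the "anomaly".
normal_test = rng.normal(size=n_neurons) * noise_scale
anomalous_test = normal_test + 3.0

score_norm = anomaly_score(normal_test, idx, mean, std)
score_anom = anomaly_score(anomalous_test, idx, mean, std)
print(f"normal: {score_norm:.2f}  anomalous: {score_anom:.2f}")
```

The sketch keeps only 32 of 512 neurons, which mirrors the abstract's claim of a compact normality representation built from a few normal samples without any training. A real pipeline would read activations from the VLM's layers and would also need the cross-modal semantic component, which is omitted here.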
