2603.25250v1 Mar 26, 2026 cs.CV

Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models

Yunhe Gao (Citations: 1,708; h-index: 17)

M. Varma (Citations: 748; h-index: 10)

C. Langlotz (Citations: 694; h-index: 13)

Yabin Zhang (Citations: 163; h-index: 5)

Jean-Benoit Delbrouck (Citations: 2,603; h-index: 22)

Jiaming Liu (Citations: 224; h-index: 3)

Chongli Wang (Citations: 1; h-index: 1)

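Out-of-distribution (OOD) detection aims to identify samples that differ from the in-distribution (ID) data. One popular approach introduces negative labels that lie far from the ID classes and detects OOD samples based on their distance to these labels. However, such labels may be only weakly activated by OOD samples, so they can fail to capture OOD characteristics. To address this, we propose Test-time Activated Negative Labels (TANL). TANL dynamically evaluates activation levels across the corpus dataset and mines candidate labels that show high activation responses during testing. Specifically, TANL identifies high-confidence test images online and accumulates their assignment probabilities over the corpus to build a label activation metric. This metric uses past test samples to align adaptively with the test distribution, which makes it possible to select activated negative labels suited to that distribution. By further exploiting the activation information within the current test batch, we add a more fine-grained, batch-adaptive variant. To make full use of label activation information, we propose an activation-aware score function that gives more weight to negative labels with stronger activations, improving performance and robustness to the number of labels. TANL requires no training, is efficient at test time, and has theoretical justification. Experiments across diverse backbones and a wide range of task settings validate its effectiveness. Notably, on the large-scale ImageNet benchmark, TANL reduces FPR95 from 17.5% to 9.8%. Code is available at: [https://github.com/YBZh/OpenOOD-VLM](https://github.com/YBZh/OpenOOD-VLM)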
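
The online procedure summarized above can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not give the exact formulas. The class `ActivationTracker`, the function `activation_aware_score`, the confidence threshold, the temperature, the reading of "high-confidence" as "most probability mass falls on corpus labels", and the weighted softmax-ratio form of the score are all assumptions made for illustration; the linked repository is the reference for the actual method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ActivationTracker:
    """Hypothetical helper: accumulates a per-label activation metric online."""

    def __init__(self, num_corpus_labels, confidence=0.9, temperature=0.01):
        self.activation = [0.0] * num_corpus_labels
        self.confidence = confidence    # assumed threshold, not from the paper
        self.temperature = temperature  # assumed softmax temperature

    def update(self, id_sims, corpus_sims):
        """Process one test image given lists of its similarities to the
        ID labels and to the corpus labels.

        Returns True if the image was confident enough to update the metric.
        """
        probs = softmax([s / self.temperature for s in id_sims + corpus_sims])
        corpus_probs = probs[len(id_sims):]
        # Assumption: "high-confidence" means most mass is on corpus labels.
        if sum(corpus_probs) < self.confidence:
            return False
        for j, p in enumerate(corpus_probs):
            self.activation[j] += p
        return True

    def select(self, k):
        """Return the k most activated corpus labels and weights with mean 1."""
        order = sorted(range(len(self.activation)),
                       key=lambda j: -self.activation[j])
        top = order[:k]
        total = sum(self.activation[j] for j in top)
        if total == 0:
            return top, [1.0] * len(top)
        return top, [self.activation[j] / total * len(top) for j in top]


def activation_aware_score(id_sims, neg_sims, neg_weights, temperature=0.01):
    """Weighted softmax ratio (an assumed form); higher means more ID-like."""
    logits = [s / temperature for s in id_sims + neg_sims]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    id_mass = sum(exps[:len(id_sims)])
    neg_mass = sum(w * e for w, e in zip(neg_weights, exps[len(id_sims):]))
    return id_mass / (id_mass + neg_mass)
```

In this sketch, each test image calls `update` once, `select(k)` is re-run whenever the negative label set should be refreshed, and the selected labels' similarities and weights are passed to `activation_aware_score`. Labels that past confident test images activated strongly pull the score down more, which is the qualitative behavior the abstract attributes to the activation-aware score.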

Original Abstract

Out-of-distribution (OOD) detection aims to identify samples that deviate from in-distribution (ID). One popular pipeline addresses this by introducing negative labels distant from ID classes and detecting OOD based on their distance to these labels. However, such labels may present poor activation on OOD samples, failing to capture the OOD characteristics. To address this, we propose Test-time Activated Negative Labels (TANL) by dynamically evaluating activation levels across the corpus dataset and mining candidate labels with high activation responses during the testing process. Specifically, TANL identifies high-confidence test images online and accumulates their assignment probabilities over the corpus to construct a label activation metric. Such a metric leverages historical test samples to adaptively align with the test distribution, enabling the selection of distribution-adaptive activated negative labels. By further exploring the activation information within the current testing batch, we introduce a more fine-grained, batch-adaptive variant. To fully utilize label activation knowledge, we propose an activation-aware score function that emphasizes negative labels with stronger activations, boosting performance and enhancing its robustness to the label number. Our TANL is training-free, test-efficient, and grounded in theoretical justification. Experiments on diverse backbones and wide task settings validate its effectiveness. Notably, on the large-scale ImageNet benchmark, TANL significantly reduces the FPR95 from 17.5% to 9.8%. Codes are available at [https://github.com/YBZh/OpenOOD-VLM](https://github.com/YBZh/OpenOOD-VLM).
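
FPR95, the metric quoted above, is the false positive rate on OOD samples at the score threshold that still accepts 95% of ID samples; lower is better. A minimal sketch of one common way to compute it follows. The function name `fpr_at_tpr` and the convention that a higher score means "more ID-like" are assumptions for illustration, and the paper's evaluation code may handle ties or thresholds differently.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted as ID at the threshold that keeps
    at least `tpr` of the ID samples.

    Assumed convention: a higher score means the sample looks more ID-like.
    """
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of top-scoring ID samples that reaches the target TPR.
    keep = math.ceil(tpr * len(ranked))
    threshold = ranked[keep - 1]
    # An OOD sample at or above the threshold is a false positive.
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)
```

For example, with 20 ID scores the threshold is the 19th-highest ID score, and the function returns the share of OOD scores at or above it. Multiplying the result by 100 gives a percentage comparable in form to the figures in the abstract.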
