2605.03410v1 May 05, 2026 cs.AI

Geometry over Density: Few-Shot Cross-Domain OOD Detection

Shawn Li

Citations: 4

h-index: 1

C. Peris

Citations: 19

h-index: 3

Roger Zimmermann

Citations: 69

h-index: 5

Jiate Li

Citations: 15

h-index: 3

Youxuan Qin

Citations: 124

h-index: 4

Lisa Bauer

Citations: 81

h-index: 2

Yue Zhao

Citations: 100

h-index: 6

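Out-of-distribution (OOD) detection identifies test samples that fall outside a model's training distribution, and it is essential for operating systems safely in high-stakes settings. Conventional OOD detectors are trained on a specific in-distribution (ID) dataset and detect deviations from that single domain. This work instead studies few-shot cross-domain OOD detection: given a single pre-trained model, can OOD detection be performed on arbitrary new ID-OOD task pairs using only a handful of ID samples at inference time, with no additional training? The authors propose **UFCOD**, a unified framework that achieves this through information-geometric analysis of diffusion trajectories. The key insight is that diffusion noise predictions are score functions (gradients of the log-density), from which two energy features are extracted: *Path Energy* (integrated score magnitude) and *Dynamics Energy* (score smoothness). Together these form a discrete Sobolev norm that captures how samples interact with the learned diffusion process. The central contribution is a **train-once, deploy-anywhere** paradigm: a diffusion model trained on a single dataset (e.g., CelebA) serves as a universal feature extractor for OOD detection across semantically unrelated domains (e.g., CIFAR-10, SVHN, Textures). At deployment, each new task needs only about 100 unlabeled ID samples at inference, with no retraining, fine-tuning, or task-specific adaptation. With 100 ID samples per task, UFCOD reaches 93.7% average AUROC across 12 cross-domain benchmarks, competitive with methods trained on 50k to 163k samples, a roughly 500× improvement in sample efficiency. Code is available at https://github.com/lili0415/UFCOD.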
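The abstract states the score-function connection and the two energy features only in words. The block below writes out one plausible reading. The first line is the standard identity for noise-prediction diffusion models. The remaining definitions, including the timestep grid $t_1 < \dots < t_K$ and the unweighted sum, are illustrative assumptions and not formulas taken from the paper.

```latex
% Standard identity, for x_t = \alpha_t x_0 + \sigma_t \epsilon with \epsilon \sim \mathcal{N}(0, I):
\epsilon_\theta(x_t, t) \approx -\sigma_t \, \nabla_{x_t} \log p_t(x_t)

% Assumed discretization over timesteps t_1 < \dots < t_K (illustrative notation):
E_{\text{path}}(x) = \sum_{k=1}^{K} \lVert \epsilon_\theta(x_{t_k}, t_k) \rVert_2^2

E_{\text{dyn}}(x) = \sum_{k=1}^{K-1} \lVert \epsilon_\theta(x_{t_{k+1}}, t_{k+1}) - \epsilon_\theta(x_{t_k}, t_k) \rVert_2^2

% A discrete H^1 (Sobolev-type) norm combines a magnitude term and a difference term:
\lVert \epsilon_\theta \rVert_{H^1}^2 = E_{\text{path}}(x) + E_{\text{dyn}}(x)
```

Under this reading, Path Energy plays the role of the squared $L^2$ part of the norm and Dynamics Energy the squared first-difference part; the paper may weight or normalize the terms differently.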

Original Abstract

Out-of-distribution (OOD) detection identifies test samples that fall outside a model's training distribution, a capability critical for safe deployment in high-stakes applications. Standard OOD detectors are trained on a specific in-distribution (ID) dataset and detect deviations from that single domain. In contrast, we study few-shot cross-domain OOD detection: given a *single* pre-trained model, can we perform OOD detection on *arbitrary* new ID-OOD task pairs using only a handful of ID samples at inference time, with no additional training? We propose **UFCOD**, a unified framework that achieves this goal through information-geometric analysis of diffusion trajectories. Our key insight is that diffusion noise predictions are score functions (gradients of log-density), and we extract two energy features: *Path Energy* (integrated score magnitude) and *Dynamics Energy* (score smoothness), that form a discrete Sobolev norm capturing how samples interact with the learned diffusion process. The central contribution is a **train-once, deploy-anywhere** paradigm: a diffusion model trained on a single dataset (e.g., CelebA) serves as a universal feature extractor for OOD detection across semantically unrelated domains (e.g., CIFAR-10, SVHN, Textures). At deployment, each new task requires only ~100 unlabeled ID samples for inference: no retraining, no fine-tuning, no task-specific adaptation. Using 100 ID samples per task, UFCOD achieves 93.7% average AUROC across 12 cross-domain benchmarks, competitive with methods trained on 50k–163k samples, demonstrating ~500× improvement in sample efficiency. See our code in https://github.com/lili0415/UFCOD.
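To make the few-shot workflow concrete, here is a minimal sketch of how two energy features could be computed from a trajectory of noise predictions and scored against about 100 ID reference samples. It is not the UFCOD implementation: the function names, the mean-squared feature definitions, and the Mahalanobis scoring rule are all assumptions, and random arrays stand in for the outputs of a pre-trained diffusion model.

```python
import numpy as np

def energy_features(eps_traj):
    """Two hypothetical energy features from noise predictions of shape (T, D)."""
    eps = np.asarray(eps_traj, dtype=float)
    # "Path energy": mean squared magnitude of the predictions over timesteps.
    path = np.mean(np.sum(eps ** 2, axis=1))
    # "Dynamics energy": mean squared change between consecutive timesteps.
    dynamics = np.mean(np.sum(np.diff(eps, axis=0) ** 2, axis=1))
    return np.array([path, dynamics])

def fit_id_reference(features):
    """Fit mean and precision matrix on the few-shot ID features (assumed scorer)."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mu, np.linalg.inv(cov)

def ood_scores(features, mu, prec):
    """Squared Mahalanobis distance to the ID reference; higher means more OOD."""
    d = features - mu
    return np.einsum("ij,jk,ik->i", d, prec, d)

# Synthetic stand-ins for diffusion noise predictions (assumption, not real data):
# ID trajectories use unit-variance noise, OOD trajectories use a larger scale.
rng = np.random.default_rng(0)
T, D = 10, 64

def fake_trajectories(n, scale):
    return rng.normal(0.0, scale, size=(n, T, D))

id_feats = np.stack([energy_features(x) for x in fake_trajectories(100, 1.0)])
id_test_feats = np.stack([energy_features(x) for x in fake_trajectories(50, 1.0)])
ood_feats = np.stack([energy_features(x) for x in fake_trajectories(50, 1.5)])

# "Deployment": only the 100 ID reference features are used, with no model training.
mu, prec = fit_id_reference(id_feats)
id_test_scores = ood_scores(id_test_feats, mu, prec)
ood_test_scores = ood_scores(ood_feats, mu, prec)

# AUROC as the fraction of (OOD, ID) pairs ranked correctly.
auroc = float(np.mean(ood_test_scores[:, None] > id_test_scores[None, :]))
print(f"AUROC on synthetic data: {auroc:.3f}")
```

In a real setting, `fake_trajectories` would be replaced by noise predictions collected from the frozen diffusion model at a fixed set of timesteps, and the per-task cost would be limited to fitting `mu` and `prec` on the reference samples.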

0 Citations

0 Influential

26.4657359028 Altmetric

132.3 Score
