2601.00327v1 Jan 01, 2026 cs.CV

HarmoniAD: Harmonizing Local Structure and Global Semantics to Improve Anomaly Detection Performance

HarmoniAD: Harmonizing Local Structures and Global Semantics for Anomaly Detection

Naiqi Zhang

Chuancheng Shi

Jingtong Dou

Wenhua Wu

Fei Shen

Jianhua Cao

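Anomaly detection is critical to quality inspection of industrial products, where a missed tiny defect can have serious consequences. Existing methods struggle to balance structure and semantics: structure-oriented models (e.g., frequency-based filters) are sensitive to noise, while semantics-oriented models (e.g., CLIP-based encoders) tend to miss fine details. To address this, we propose HarmoniAD, a dual-branch framework that exploits frequency information. Features are first extracted with the CLIP image encoder, transformed into the frequency domain, and then separated into high-frequency and low-frequency paths so that structure and semantics are modeled in a complementary way. The high-frequency path uses a fine-grained structural attention module (FSAM) that emphasizes textures and contours to detect tiny defects, and the low-frequency path uses a global structural context module (GSCM) to capture long-range dependencies and maintain semantic consistency. Together, the two paths balance fine detail against global semantics. HarmoniAD also adopts a multi-class joint training strategy, and experiments on the MVTec-AD, VisA, and BTAD datasets show state-of-the-art performance in terms of sensitivity and robustness.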

Original Abstract

Anomaly detection is crucial in industrial product quality inspection. Failing to detect tiny defects often leads to serious consequences. Existing methods face a structure-semantics trade-off: structure-oriented models (such as frequency-based filters) are noise-sensitive, while semantics-oriented models (such as CLIP-based encoders) often miss fine details. To address this, we propose HarmoniAD, a frequency-guided dual-branch framework. Features are first extracted by the CLIP image encoder, then transformed into the frequency domain, and finally decoupled into high- and low-frequency paths for complementary modeling of structure and semantics. The high-frequency branch is equipped with a fine-grained structural attention module (FSAM) to enhance textures and edges for detecting small anomalies, while the low-frequency branch uses a global structural context module (GSCM) to capture long-range dependencies and preserve semantic consistency. Together, these branches balance fine detail and global semantics. HarmoniAD further adopts a multi-class joint training strategy, and experiments on MVTec-AD, VisA, and BTAD show state-of-the-art performance with both sensitivity and robustness.
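
The frequency decoupling step described above can be illustrated with a minimal sketch. The abstract does not say which transform or band-splitting rule HarmoniAD uses, so everything here is an assumption made for illustration: a naive 2D discrete Fourier transform, a hard radial cutoff (the hypothetical `cutoff` parameter), and a toy single-channel grid standing in for a CLIP feature map. The function names `dft2` and `split_frequencies` are likewise hypothetical.

```python
import cmath
import math


def dft2(grid, inverse=False):
    """Naive 2D discrete Fourier transform of a grid given as a list of rows."""
    h, w = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * math.pi * (u * y / h + v * x / w)
                    total += grid[y][x] * cmath.exp(1j * angle)
            # The inverse transform carries the 1/(h*w) normalization.
            out[u][v] = total / (h * w) if inverse else total
    return out


def split_frequencies(feature_map, cutoff):
    """Split one feature channel into low- and high-frequency parts.

    `cutoff` is an assumed radius in frequency bins; the abstract does not
    specify how the two bands are actually separated.
    """
    h, w = len(feature_map), len(feature_map[0])
    spectrum = dft2(feature_map)
    low_spec = [[0j] * w for _ in range(h)]
    high_spec = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            # Wrapped distance from the zero-frequency (DC) bin.
            du, dv = min(u, h - u), min(v, w - v)
            if math.hypot(du, dv) <= cutoff:
                low_spec[u][v] = spectrum[u][v]
            else:
                high_spec[u][v] = spectrum[u][v]
    # The mask is symmetric, so both parts are real up to rounding error.
    low = [[c.real for c in row] for row in dft2(low_spec, inverse=True)]
    high = [[c.real for c in row] for row in dft2(high_spec, inverse=True)]
    return low, high


# Toy channel: a flat background with one small "defect" at row 2, column 3.
size = 6
feature_map = [[1.0] * size for _ in range(size)]
feature_map[2][3] = 5.0

low, high = split_frequencies(feature_map, cutoff=1.0)
```

In this toy case the two parts sum back to the input, the flat background ends up entirely in `low`, and the strongest response in `high` sits at the defect location. That is the intuition behind routing fine detail to one branch and smooth context to the other; the FSAM and GSCM modules themselves are not sketched here because the abstract gives no detail on their internals.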
