2602.08868v1 Feb 09, 2026 cs.LG

AnomSeer: Strengthening the Reasoning of Multimodal Large Language Models for Time-Series Anomaly Detection

AnomSeer: Reinforcing Multimodal LLMs to Reason for Time-Series Anomaly Detection

Lang Feng (citations: 729, h-index: 11)

Junru Zhang (citations: 171, h-index: 7)

Haoran Shi (citations: 3, h-index: 1)

Xu Guo (citations: 24, h-index: 3)

Han Yu (citations: 102, h-index: 4)

Yabo Dong (citations: 126, h-index: 6)

Duanqing Xu (citations: 33, h-index: 3)

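Time-series anomaly detection (TSAD) with multimodal large language models (MLLMs) is an emerging field, but open challenges remain. MLLMs typically apply simple rules to time-series data and struggle with the multi-dimensional, detailed reasoning that is essential for understanding complex time series. This work proposes AnomSeer, which reinforces the model to ground its reasoning in precise, structural details of the time series, with the goal of unifying anomaly classification, localization, and explanation. At the core of AnomSeer is the generation of an expert chain-of-thought trace that provides verifiable, fine-grained reasoning based on classical analyses such as statistical measures and frequency transforms. Building on this, the authors propose a new time-series grounded policy optimization (TimerPO) that adds two components to standard reinforcement learning: first, a time-series grounded advantage based on optimal transport; second, an orthogonal projection that keeps this auxiliary fine-grained signal from interfering with the primary detection objective. Across diverse anomaly detection scenarios, AnomSeer with Qwen2.5-VL-3B/7B-Instruct outperforms larger commercial models such as GPT-4o in classification and localization accuracy, with particularly strong results on point- and frequency-driven anomalies. AnomSeer also produces plausible time-series reasoning traces that support its conclusions.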

Original Abstract

Time-series anomaly detection (TSAD) with multimodal large language models (MLLMs) is an emerging area, yet a persistent challenge remains: MLLMs rely on coarse time-series heuristics but struggle with multi-dimensional, detailed reasoning, which is vital for understanding complex time-series data. We present AnomSeer to address this by reinforcing the model to ground its reasoning in precise, structural details of time series, unifying anomaly classification, localization, and explanation. At its core, an expert chain-of-thought trace is generated to provide a verifiable, fine-grained reasoning from classical analyses (e.g., statistical measures, frequency transforms). Building on this, we propose a novel time-series grounded policy optimization (TimerPO) that incorporates two additional components beyond standard reinforcement learning: a time-series grounded advantage based on optimal transport and an orthogonal projection to ensure this auxiliary granular signal does not interfere with the primary detection objective. Across diverse anomaly scenarios, AnomSeer, with Qwen2.5-VL-3B/7B-Instruct, outperforms larger commercial baselines (e.g., GPT-4o) in classification and localization accuracy, particularly on point- and frequency-driven exceptions. Moreover, it produces plausible time-series reasoning traces that support its conclusions.
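
The abstract names two ingredients of TimerPO, an optimal-transport-based signal and an orthogonal projection, without giving formulas. The sketch below is a generic illustration of those two ideas only, not the authors' implementation. The function names `wasserstein_1d` and `project_out`, the use of a 1-D Wasserstein distance between anomaly masks, and the treatment of the auxiliary and primary signals as plain vectors are all assumptions made for this example.

```python
import numpy as np


def wasserstein_1d(p: np.ndarray, q: np.ndarray) -> float:
    """1-D optimal-transport (W1) cost between two histograms on the same
    unit-spaced grid, computed as the summed absolute CDF difference.
    Hypothetical helper: the paper's actual transport cost is not given here."""
    p = p / p.sum()  # normalize both inputs to unit mass
    q = q / q.sum()
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def project_out(aux: np.ndarray, primary: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Remove from `aux` its component along `primary` (one Gram-Schmidt step),
    so that the result is orthogonal to `primary`.
    Hypothetical helper: assumes both signals are same-length vectors."""
    denom = float(primary @ primary)
    if denom < eps:  # primary is (numerically) zero: nothing to project out
        return aux.copy()
    return aux - (aux @ primary) / denom * primary


# A predicted anomaly mask that is off by one time step from the true mask.
predicted = np.array([0.0, 0.0, 1.0, 0.0])
actual = np.array([0.0, 0.0, 0.0, 1.0])
cost = wasserstein_1d(predicted, actual)  # moving unit mass one step

# An auxiliary signal with the part parallel to the primary signal removed.
primary_signal = np.array([1.0, 0.0])
auxiliary_signal = np.array([1.0, 1.0])
residual = project_out(auxiliary_signal, primary_signal)
print(cost, residual, residual @ primary_signal)
```

Under these assumptions, a transport cost penalizes a localization error in proportion to how far the predicted anomaly sits from the true one, and the projection leaves only the part of the auxiliary signal that cannot add to or cancel the primary one.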

Citations: 0 · Influential citations: 0 · Altmetric: 5.5 · Score: 27.5
