2602.17568v1 Feb 19, 2026 cs.LG

Be Wary of Your Time Series Preprocessing

Lele Cao

Citations: 27

h-index: 3

Sofiane Ennadir

Citations: 99

h-index: 5

Tianze Wang

Citations: 21

h-index: 3

Oleg Smirnov

Citations: 22

h-index: 2

Sahar Asadi

Citations: 254

h-index: 3

Abstract

Normalization and scaling are fundamental preprocessing steps in time series modeling, yet their role in Transformer-based models remains underexplored from a theoretical perspective. In this work, we present the first formal analysis of how different normalization strategies, specifically instance-based and global scaling, impact the expressivity of Transformer-based architectures for time series representation learning. We propose a novel expressivity framework tailored to time series, which quantifies a model's ability to distinguish between similar and dissimilar inputs in the representation space. Using this framework, we derive theoretical bounds for two widely used normalization methods: Standard and Min-Max scaling. Our analysis reveals that the choice of normalization strategy can significantly influence the model's representational capacity, depending on the task and data characteristics. We complement our theory with empirical validation on classification and forecasting benchmarks using multiple Transformer-based models. Our results show that no single normalization method consistently outperforms others, and in some cases, omitting normalization entirely leads to superior performance. These findings highlight the critical role of preprocessing in time series learning and motivate the need for more principled normalization strategies tailored to specific tasks and datasets.
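To make the distinction between instance-based and global scaling concrete, here is a minimal NumPy sketch of Standard and Min-Max scaling applied both ways. It is only an illustration and not the paper's implementation: the helper names `standard_scale` and `min_max_scale`, the `EPS` constant, and the toy batch are all assumptions introduced here. "Global" is taken to mean one set of statistics shared across the whole array; in practice those statistics would usually come from the training split only.

```python
import numpy as np

EPS = 1e-8  # assumed guard against division by zero for constant series


def standard_scale(x, axis=None):
    """Standard (z-score) scaling: subtract the mean, divide by the std."""
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / (std + EPS)


def min_max_scale(x, axis=None):
    """Min-Max scaling: map values linearly onto roughly [0, 1]."""
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    return (x - lo) / (hi - lo + EPS)


# Hypothetical toy batch of 2 univariate series, shape (n_series, n_timesteps).
# The second series is the first one multiplied by 10.
batch = np.array([[1.0, 2.0, 3.0, 4.0],
                  [10.0, 20.0, 30.0, 40.0]])

# Instance-based: statistics are computed per series, along the time axis.
inst_std = standard_scale(batch, axis=1)
inst_mm = min_max_scale(batch, axis=1)

# Global: a single mean/std (or min/max) is shared by every series.
glob_std = standard_scale(batch, axis=None)
glob_mm = min_max_scale(batch, axis=None)

# Instance scaling makes the two series (almost) indistinguishable,
# while global scaling preserves their difference in amplitude.
print(np.allclose(inst_std[0], inst_std[1]))
print(np.allclose(glob_std[0], glob_std[1]))
```

In this toy case, instance-based scaling maps both series to nearly the same values, so a downstream model can no longer separate them by amplitude, whereas global scaling keeps that difference. Whether that is helpful or harmful depends on whether amplitude carries signal for the task, which is in line with the abstract's point that no single strategy wins everywhere.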
