2603.23043v1 Mar 24, 2026 cs.LG

Assessing the Robustness of Climate Foundation Models under No-Analog Distribution Shifts

Theo Wolf (Citations: 11, h-index: 1)

Maria Conchita Agana Navarro (Citations: 1, h-index: 1)

M. Pérez-Ortiz (Citations: 13, h-index: 2)

Geng Li (Citations: 1, h-index: 1)

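Accelerating climate change introduces severe non-stationarity that undermines the generalization ability of machine-learning-based climate emulators. These emulators are far more computationally efficient than traditional Earth System Models, but their reliability may be a concern under "no-analog" future climate states, defined here as regimes in which external forcing drives the system into conditions outside the empirical range of the historical training data. A key challenge in evaluating this reliability is data contamination: because many models are trained on simulation data that already include future scenarios, true out-of-distribution (OOD) performance is often masked. To address this, we benchmark the OOD robustness of three state-of-the-art architectures (U-Net, ConvLSTM, and the ClimaX foundation model) trained only on historical data (1850-2014). We evaluate these models with two complementary strategies: (i) temporal extrapolation to the recent climate (2015-2023) and (ii) cross-scenario forcing shifts across divergent emission pathways. Within this experimental setup, the analysis reveals a trade-off between accuracy and stability. The ClimaX foundation model achieves the lowest absolute error but shows larger relative performance changes under scenarios with extreme forcing, with precipitation errors increasing by up to 8.44%. These results suggest that even high-capacity foundation models trained only on historical data can be sensitive to external forcing trajectories. Our results underscore the need for scenario-aware training and rigorous OOD evaluation protocols to ensure the robustness of climate emulators under a changing climate.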

Original Abstract

The accelerating pace of climate change introduces profound non-stationarities that challenge the ability of Machine Learning based climate emulators to generalize beyond their training distributions. While these emulators offer computationally efficient alternatives to traditional Earth System Models, their reliability remains a potential bottleneck under "no-analog" future climate states, which we define here as regimes where external forcing drives the system into conditions outside the empirical range of the historical training data. A fundamental challenge in evaluating this reliability is data contamination; because many models are trained on simulations that already encompass future scenarios, true out-of-distribution (OOD) performance is often masked. To address this, we benchmark the OOD robustness of three state-of-the-art architectures: U-Net, ConvLSTM, and the ClimaX foundation model specifically restricted to a historical-only training regime (1850-2014). We evaluate these models using two complementary strategies: (i) temporal extrapolation to the recent climate (2015-2023) and (ii) cross-scenario forcing shifts across divergent emission pathways. Our analysis within this experimental setup reveals an accuracy vs. stability trade-off: while the ClimaX foundation model achieves the lowest absolute error, it exhibits higher relative performance changes under distribution shifts, with precipitation errors increasing by up to 8.44% under extreme forcing scenarios. These findings suggest that when restricted to historical training dynamics, even high-capacity foundation models are sensitive to external forcing trajectories. Our results underscore the necessity of scenario-aware training and rigorous OOD evaluation protocols to ensure the robustness of climate emulators under a changing climate.
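
The evaluation setup described in the abstract can be illustrated with a minimal Python sketch. It shows a year-based split into the historical training period (1850-2014) and the recent-climate test period (2015-2023), plus a relative error change expressed in percent. The abstract does not state which error metric or which reference error the reported percentages use, so the RMSE metric, the percent-change formula, and all function names here (`split_by_year`, `rmse`, `relative_change_percent`) are assumptions for illustration, not the authors' implementation.

```python
import math

# Year ranges taken from the abstract; both periods are inclusive.
TRAIN_YEARS = range(1850, 2015)  # historical-only training, 1850-2014
TEST_YEARS = range(2015, 2024)   # temporal extrapolation, 2015-2023


def split_by_year(samples):
    """Split (year, value) pairs into historical-train and recent-test lists.

    Samples outside both periods are dropped, so no future-scenario data
    can leak into training.
    """
    train = [s for s in samples if s[0] in TRAIN_YEARS]
    test = [s for s in samples if s[0] in TEST_YEARS]
    return train, test


def rmse(predictions, targets):
    """Root-mean-square error (assumed metric; the abstract does not name one)."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def relative_change_percent(reference_error, shifted_error):
    """Percent change of the shifted-condition error relative to a reference error."""
    return 100.0 * (shifted_error - reference_error) / reference_error
```

Under these assumptions, a reference error of 1.0 that rises to 1.0844 under a forcing shift corresponds to `relative_change_percent(1.0, 1.0844)`, roughly 8.44. This is how a model with the lowest absolute error can still show the largest relative degradation.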
