2603.17680v1 Mar 18, 2026 cs.CV

WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models

Fucai Ke (Citations: 157, h-index: 6)

Wanjun Du (Citations: 7, h-index: 2)

Zifeng Yuan (Citations: 42, h-index: 3)

Tingting Chen (Citations: 56, h-index: 4)

Beibei Lin (Citations: 71, h-index: 5)

Shunli Zhang (Citations: 230, h-index: 5)

Original Abstract

Existing vision-language models (VLMs) have demonstrated impressive performance in reasoning-based segmentation. However, current benchmarks are primarily constructed from high-quality images captured under idealized conditions. This raises a critical question: when visual cues are severely degraded by adverse weather conditions such as rain, snow, or fog, can VLMs sustain reliable reasoning segmentation capabilities? In response to this challenge, we introduce WeatherReasonSeg, a benchmark designed to evaluate VLM performance in reasoning-based segmentation under adverse weather conditions. It consists of two complementary components. First, we construct a controllable reasoning dataset by applying synthetic weather with varying severity levels to existing segmentation datasets, enabling fine-grained robustness analysis. Second, to capture real-world complexity, we curate a real-world adverse-weather reasoning segmentation dataset with semantically consistent queries generated via mask-guided LLM prompting. We further broaden the evaluation scope across five reasoning dimensions, including functionality, application scenarios, structural attributes, interactions, and requirement matching. Extensive experiments across diverse VLMs reveal two key findings: (1) VLM performance degrades monotonically with increasing weather severity, and (2) different weather types induce distinct vulnerability patterns. We hope WeatherReasonSeg will serve as a foundation for advancing robust, weather-aware reasoning.
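
The controllable half of the benchmark (synthetic weather at several severity levels, followed by a check of how segmentation quality changes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not describe the synthesis pipeline, the number of severity levels, or the metric, so the sketch uses a uniform-transmission fog model `I = J*t + A*(1 - t)`, a hypothetical 0 to 5 severity scale, plain IoU, and a toy threshold segmenter (`toy_segment`) standing in for a VLM.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]
Mask = List[List[int]]     # binary mask, 1 = target region


def add_fog(image: Image, severity: int, airlight: float = 1.0) -> Image:
    """Blend each pixel toward a uniform airlight: I = J*t + A*(1 - t)."""
    if not 0 <= severity <= 5:
        raise ValueError("severity must be in 0..5")
    # Hypothetical mapping from severity to transmission t (not from the paper).
    t = 1.0 - 0.18 * severity
    return [[px * t + airlight * (1.0 - t) for px in row] for row in image]


def iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = union = 0
    for p_row, g_row in zip(pred, gt):
        for p, g in zip(p_row, g_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Convention chosen here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def toy_segment(image: Image) -> Mask:
    """Stand-in for a model: marks every pixel brighter than 0.5."""
    return [[1 if px > 0.5 else 0 for px in row] for row in image]


def severity_sweep(
    image: Image,
    gt: Mask,
    segment: Callable[[Image], Mask],
    levels: Sequence[int] = range(6),
) -> List[float]:
    """Score the segmenter on the same image at each fog severity level."""
    return [iou(segment(add_fog(image, s)), gt) for s in levels]


def is_monotonically_nonincreasing(scores: Sequence[float]) -> bool:
    """True if no score is higher than the one at the previous severity."""
    return all(a >= b for a, b in zip(scores, scores[1:]))
```

Calling `severity_sweep(image, gt, toy_segment)` on a small image and passing the result to `is_monotonically_nonincreasing` mirrors the kind of severity-versus-performance analysis the abstract reports; a real evaluation would replace `toy_segment` with a VLM-based reasoning segmenter and the fog model with the benchmark's own weather synthesis.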

Citations: 2 | Influential: 0 | Altmetric: 3 | Score: 17.0
