2606.05702v1 Jun 04, 2026 cs.AI

Seeing Time: Benchmarking Chronological Reasoning and Shortcut Biases in Vision-Language Models

Yongcheng Jing
Qixin Zhang
Ziqi Xu
Qing Qing
Juncheng Hu
Renqian Luo
Hao Zhou
Caichong Li
Xikun Zhang

Recent advances in Vision-Language Models (VLMs) have significantly improved their ability to interpret complex visual semantics, yet their capacity for chronological reasoning remains underexplored. In this paper, we introduce a benchmark designed to evaluate how VLMs perceive and reason about chronological information within and across images. Unlike existing video-based benchmarks that focus on frame sequencing, our work examines the underlying logic of chronological judgment and its extension to multimodal integration. To this end, we construct three specialized datasets: one containing visually similar objects spanning long historical periods, another categorized by diverse event and object types, and a third pairing images with time-sensitive news text for cross-modal alignment. Through extensive experiments, we analyze whether models exhibit performance disparities across categories and, crucially, whether they rely on "incorrect shortcuts", such as image color, rather than genuine chronological features. Our results reveal that while VLMs show promise, they frequently exploit superficial cues, such as grayscale versus color filters, to bypass authentic chronological reasoning. By providing these high-quality datasets and a rigorous evaluation framework, we offer a diagnostic tool to identify current limitations and guide the development of more robust, logically grounded multimodal models. The source code is available at https://github.com/LuoRenqiang/ChronoVision.
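
The abstract does not spell out how reliance on color is measured, so the following is only a minimal sketch of one plausible diagnostic, not the authors' protocol: score a predictor on the original images, score it again on grayscale copies, and report the accuracy gap. Every name here (`to_grayscale`, `accuracy`, `color_shortcut_gap`, `color_cue_predictor`), the flattened-pixel image representation, and the "old"/"new" labels are assumptions made for illustration. The grayscale conversion uses the standard BT.601 luma weights.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Image = List[Pixel]  # flattened RGB pixels; a stand-in for a real image type
Predictor = Callable[[Image], str]  # hypothetical model interface


def to_grayscale(image: Image) -> Image:
    """Replace each RGB pixel with its BT.601 luma, repeated across channels."""
    gray = []
    for r, g, b in image:
        y = int(round(0.299 * r + 0.587 * g + 0.114 * b))
        gray.append((y, y, y))
    return gray


def accuracy(predict: Predictor, images: Sequence[Image], labels: Sequence[str]) -> float:
    """Fraction of images whose predicted label matches the reference label."""
    correct = sum(1 for img, lab in zip(images, labels) if predict(img) == lab)
    return correct / len(labels)


def color_shortcut_gap(predict: Predictor, images: Sequence[Image], labels: Sequence[str]) -> float:
    """Accuracy on the original images minus accuracy on grayscale copies.

    A large positive gap suggests the predictor leans on color rather than
    on content that survives the grayscale conversion.
    """
    original = accuracy(predict, images, labels)
    filtered = accuracy(predict, [to_grayscale(img) for img in images], labels)
    return original - filtered


def color_cue_predictor(image: Image) -> str:
    """Toy predictor (hypothetical) that uses only the shortcut:
    a fully gray image is called "old", anything with color "new"."""
    is_gray = all(r == g == b for r, g, b in image)
    return "old" if is_gray else "new"
```

On a toy set where the shortcut happens to match the labels, this predictor is perfect on the originals but drops to chance once every image is converted to grayscale, which is the kind of disparity the benchmark is meant to expose. A real evaluation would replace `color_cue_predictor` with a call to the VLM under test and would also apply the reverse manipulation (for example, colorizing old photographs), which this sketch does not cover.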
