2606.16173v1 Jun 15, 2026 cs.AI

TimeVista: Exploring and Exploiting Vision-Language Models as Judges for Time Series Forecasting

Jialong Wu
Tsinghua University
Mingsheng Long
Jianmin Wang
Haoran Zhang
Xin Su
Yuxuan Wang
Zhi Chen
Yong Liu

High-quality time series forecasting is pivotal for real-world decision-making. However, traditional point-wise metrics often fail to capture complex temporal patterns and align poorly with human intuition. While the "LLM-as-a-Judge" paradigm has revolutionized text evaluation by providing flexible, human-aligned judgment, its application to time series remains largely unexplored. In this paper, we use Vision-Language Models (VLMs) as judges for time series forecasting, harnessing their ability to interpret time series plots grounded in textual information. Specifically, we propose a novel framework that integrates micro- and macro-level judgments, informed by contextual information, to evaluate forecasts. To this end, we introduce TimeVista, a comprehensive VLM-as-a-Judge benchmark comprising 5,563 time series samples paired with detailed evaluation rubrics. Extensive meta-evaluations demonstrate that VLMs are highly reliable judges, achieving significantly higher consistency with human preferences than conventional metrics. Building on our benchmark, we comprehensively assess recent Time Series Foundation Models (TSFMs) under the VLM-as-a-Judge paradigm. Our results demonstrate that VLMs serve as robust and interpretable judges, providing a comprehensive, human-aligned standard for evaluating time series models.
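
The claim that point-wise metrics can disagree with human intuition can be illustrated with a small toy case. The sketch below is not taken from the paper or its benchmark; the sine-wave series, the two hypothetical forecasts, and the helper name `mse` are all assumptions chosen for illustration. It compares a forecast that reproduces the periodic shape but is shifted by a quarter period with a flat forecast that ignores the pattern entirely.

```python
import math


def mse(actual, forecast):
    """Mean squared error, a standard point-wise metric."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical ground truth: two full periods of a sine wave (illustrative only).
n = 200
t = [2 * math.pi * 2 * i / n for i in range(n)]
actual = [math.sin(x) for x in t]

# Hypothetical forecast A: correct shape and amplitude, shifted by a quarter period.
shifted = [math.sin(x - math.pi / 2) for x in t]

# Hypothetical forecast B: a flat line at the series mean, with no periodic pattern.
flat = [0.0] * n

mse_shifted = mse(actual, shifted)
mse_flat = mse(actual, flat)

print(f"MSE of shifted forecast: {mse_shifted:.3f}")
print(f"MSE of flat forecast:    {mse_flat:.3f}")
```

In this toy setting the flat line receives the lower MSE even though a person looking at the plot would likely say the shifted forecast captured the periodic structure better. This is the kind of gap between point-wise scores and visual judgment that the abstract describes as motivation.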


