2606.05718v1 Jun 04, 2026 cs.CV

ViCuR: Visual Cues as Recoverable Privilege for Multimodal On-Policy Distillation

Siyuan Liu
Kanghui Tian
Ziang Yan
Sheng Xia
Shuai Dong
Yi Wang

On-policy distillation (OPD) improves reasoning by training a student on trajectories sampled from its own policy under supervision from a teacher. In multimodal reasoning, a common extension is to use a privileged teacher that observes training-time-only signals such as reference answers or rationales. However, such answer-side privilege creates a train-test mismatch: the teacher's supervision may depend on signals unavailable to the student, encouraging shortcut imitation rather than visually grounded reasoning. We propose ViCuR, a visually grounded privileged-teacher distillation framework that replaces answer-side privilege with visual cues (query-related evidence in the input). Because these cues are derived from the same visual input available at inference, their evidence is recoverable by the student. To support this, ViCuR introduces a lightweight cue recovery module that uses dedicated sink-token cross-attention during prefill to aggregate task-relevant visual evidence into an internal representation, without changing the inference interface or requiring auxiliary cue-generation losses. Across seven benchmarks with Qwen3-VL-2B and 8B students, ViCuR consistently improves over answer-based on-policy self-distillation by +1.19 and +1.24 on overall average performance. It also extends naturally to stronger-teacher OPD, surpassing OPD baselines by +0.64 and +1.08, with consistent out-of-domain gains at the 8B scale. These results show that, in multimodal on-policy distillation, the design of teacher privilege is as important as teacher strength.
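The abstract describes the cue recovery module only at a high level, so the following is a minimal, illustrative sketch of the general idea of sink-token cross-attention pooling, not the authors' implementation. It assumes a single attention head, identity key/value projections, and toy 2-dimensional features. The names `sink_cross_attention`, `sink_queries`, and `visual_tokens` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sink_cross_attention(sink_queries, visual_tokens):
    """Pool visual tokens into one vector per sink query.

    Assumption (not from the paper): single head, no learned key/value
    projections, scaled dot-product attention.
    """
    dim = len(visual_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    pooled, weights = [], []
    for query in sink_queries:
        # Attention weights of this sink query over every visual token.
        attn = softmax([dot(query, key) * scale for key in visual_tokens])
        # Weighted average of the visual token features.
        summary = [
            sum(w * tok[i] for w, tok in zip(attn, visual_tokens))
            for i in range(dim)
        ]
        pooled.append(summary)
        weights.append(attn)
    return pooled, weights


# Toy input: three visual tokens; one sink query aligned with feature 0.
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sink_queries = [[4.0, 0.0]]
pooled, weights = sink_cross_attention(sink_queries, visual_tokens)
```

In this toy case the sink query places most of its attention on the first and third tokens, which carry feature 0, and little on the second. The pooled vector stands in for the "internal representation" the abstract mentions; how ViCuR trains the sink tokens and feeds the result to the decoder is not specified here.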

