2606.12169v1 Jun 10, 2026 cs.CV

OpenMedReason: Scientific Reasoning Supervision for Medical Vision-Language Models

Abeer Badawi
Elham Dolatabadi
Negin Baghbanzadeh
Pritam Sarkar
Michael Colacci
Adibvafa Fallahpour
Arash Afkanpour
Leonid Sigal
Ali Etemad

High-stakes clinical use of large vision-language models (LVLMs) requires reasoning that is grounded in visual evidence and clinical knowledge, not just correct final answers. We introduce OpenMedReason, a large-scale, open multimodal medical reasoning corpus comprising approximately 450K image-question-answer instances whose reasoning traces are derived primarily from curated, human-authored biomedical scientific articles. OpenMedReason provides high-fidelity supervision beyond synthetic chains of thought and covers diverse medical visual modalities, including radiological scans, microscopic images, visible-light photographs, and charts, among others. We complement it with OpenMedReason-Bench, a held-out benchmark that supports fine-grained evaluation of LVLMs along three complementary axes of capability (perception, medical knowledge, and rationale), enabling diagnostic evaluation beyond final-answer accuracy. As a training resource, OpenMedReason is effective in both supervised fine-tuning (SFT) and reinforcement-based alignment. Training with OpenMedReason yields a 20% average improvement in VQA accuracy over the base model and achieves performance within 4.2% of the strongest comparable-scale medical LVLMs. Fine-grained performance analysis confirms that the gains are not concentrated in any single axis: OpenMedReason improves perception, medical knowledge, and rationale jointly, and its reasoning traces are preferred over those of the base model in 86.1% of pairwise comparisons. We release the code and dataset at huggingface.co/datasets/neginb/OpenMedReason.
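To make the two kinds of reported metrics concrete, the following is a minimal Python sketch of how per-axis scores and a pairwise preference rate could be aggregated. It is not the authors' evaluation code. The record keys, the judgment labels ("tuned", "base", "tie"), and the choice to exclude ties are all assumptions made for illustration, and the toy data is invented.

```python
from collections import defaultdict

# Axis names follow the abstract; the dictionary keys themselves are hypothetical.
AXES = ("perception", "medical_knowledge", "rationale")


def per_axis_mean(records):
    """Average each axis score over all records that report it."""
    totals, counts = defaultdict(float), defaultdict(int)
    for rec in records:
        for axis in AXES:
            if axis in rec:
                totals[axis] += rec[axis]
                counts[axis] += 1
    return {axis: totals[axis] / counts[axis] for axis in AXES if counts[axis]}


def pairwise_win_rate(judgments, model="tuned"):
    """Fraction of decided comparisons won by `model`.

    Ties are excluded from the denominator (an assumption, not a
    convention stated in the source).
    """
    decided = [j for j in judgments if j != "tie"]
    if not decided:
        return 0.0
    return sum(j == model for j in decided) / len(decided)


# Toy data, invented for illustration only.
records = [
    {"perception": 1.0, "medical_knowledge": 0.5, "rationale": 0.0},
    {"perception": 0.0, "medical_knowledge": 0.5, "rationale": 1.0},
]
judgments = ["tuned", "tuned", "base", "tie", "tuned"]

print(per_axis_mean(records))
print(pairwise_win_rate(judgments))
```

Reporting the three axes separately, rather than as one pooled score, is what lets a reader check whether gains are concentrated in a single axis.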
