2603.27460v1 Mar 29, 2026 cs.CV

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Jizhong Han (Citations: 0; h-index: 0)
Xiaohong Liu (Citations: 1,612; h-index: 24)
S. Yeung-Levy (Citations: 1,408; h-index: 21)
Mianxin Liu (Citations: 94; h-index: 6)
Yulong Li (Citations: 2; h-index: 1)
Chenglong Ma (Citations: 145; h-index: 6)
Angelica I. Avilés-Rivero (Citations: 2,511; h-index: 23)
Conghui He (Citations: 11,631; h-index: 33)
Haodong Duan (Citations: 59; h-index: 4)
Bohan Zhuang (Citations: 5,902; h-index: 36)
Haolin Yang (Citations: 31; h-index: 3)
Jiancheng Yang (Citations: 74; h-index: 1)
Zongwei Zhou (Citations: 627; h-index: 14)
Minghui Zhang (Citations: 437; h-index: 13)
Shaohao Rui (Citations: 45; h-index: 4)
Lequan Yu (Citations: 5; h-index: 1)
Shujun Wang (Citations: 46; h-index: 3)
Benyou Wang (Citations: 143; h-index: 3)
Ye Luo (Citations: 0; h-index: 0)
Wenqi Shao (Citations: 4,489; h-index: 21)
Yu Qiao (Citations: 1,411; h-index: 19)
Zhongying Deng (Citations: 73; h-index: 3)
Cheng Tang (Citations: 132; h-index: 5)
Ziyan Huang (Citations: 1,102; h-index: 16)
Jiashi Lin (Citations: 26; h-index: 2)
Ying Chen (Citations: 140; h-index: 6)
Jiyao Liu (Citations: 162; h-index: 7)
Wei Li (Citations: 170; h-index: 5)
Yinghao Zhu (Citations: 31; h-index: 4)
Shujian Gao (Citations: 31; h-index: 3)
Yanyan Huang (Citations: 21; h-index: 2)
Sibo Ju (Citations: 344; h-index: 3)
Yanzhou Su (Citations: 5; h-index: 1)
Pengcheng Chen (Citations: 230; h-index: 6)
Wenhao Tang (Citations: 14; h-index: 2)
Tian‐Hao Li (Citations: 38; h-index: 4)
Haoyu Wang (Citations: 392; h-index: 8)
Yuanfeng Ji (Citations: 137; h-index: 5)
Hui Sun (Citations: 6; h-index: 2)
Shaobo Min (Citations: 717; h-index: 15)
Liangchen Peng (Citations: 20; h-index: 3)
Feilong Tang (Citations: 354; h-index: 10)
Haochen Xue (Citations: 44; h-index: 3)
Rulin Zhou (Citations: 9; h-index: 2)
Chaoyang Zhang (Citations: 28; h-index: 3)
Wenjie Li (Citations: 159; h-index: 8)
Wei Ma (Citations: 1; h-index: 1)
Xingyue Zhao (Citations: 67; h-index: 5)
Yibin Wang (Citations: 257; h-index: 5)
Kun Yuan (Citations: 54; h-index: 2)
Zhaohui Lu (Citations: 10; h-index: 2)
Jinjie Wei (Citations: 181; h-index: 8)
Lihao Liu (Citations: 101; h-index: 6)
Di Yang (Citations: 46; h-index: 2)
Lin Wang (Citations: 8; h-index: 2)
Yi Shen (Citations: 24; h-index: 2)
Xiaowei Hu (Citations: 47; h-index: 4)
Yun Gu (Citations: 33; h-index: 3)
Yicheng Wu (Citations: 13; h-index: 2)
Qi Gao (Citations: 2; h-index: 1)
Hongming Shan (Citations: 28; h-index: 3)
Xiaoyu Ren (Citations: 8; h-index: 1)
Fang Yan (Citations: 36; h-index: 2)
Hongyu Zhou (Citations: 358; h-index: 10)
Maosong Cao (Citations: 932; h-index: 9)
Shan Wang (Citations: 11; h-index: 2)
Bin Fu (Citations: 214; h-index: 8)
Xiaomeng Li (Citations: 17; h-index: 1)
Zhi-Yan Hou (Citations: 287; h-index: 1)
Chunfeng Song (Citations: 121; h-index: 6)
Lei Bai (Citations: 89; h-index: 5)
Yuan Cheng (Citations: 234; h-index: 9)
Yuandong Pu (Citations: 186; h-index: 7)
Xiang Li (Citations: 10; h-index: 2)
Wenhai Wang (Citations: 32; h-index: 3)
Hao Chen (Citations: 1; h-index: 1)
Jiaxin Zhuang (Citations: 241; h-index: 7)
Songyang Zhang (Citations: 96; h-index: 4)
H. He (Citations: 7; h-index: 2)
Meng Li, Peking University (Citations: 26,570; h-index: 65)
Zhian Bai (Citations: 72; h-index: 3)
Rongshan Yu (Citations: 67; h-index: 3)
Liansheng Wang (Citations: 57; h-index: 2)
Xiaosong Wang (Citations: 27; h-index: 3)
Xin Guo (Citations: 30; h-index: 3)
Guanbin Li (Citations: 20; h-index: 2)
Xiangru Lin (Citations: 749; h-index: 11)
Dakai Jin (Citations: 135; h-index: 4)
Wenlong Zhang (Citations: 41; h-index: 4)
Qiaoli Qin (Citations: 4; h-index: 1)
Yuqiang Li (Citations: 10; h-index: 2)
Nanqing Dong (Citations: 8; h-index: 2)
Jie Xu (Citations: 29; h-index: 3)
Bo Zhang (Citations: 139; h-index: 4)
Q. Yan (Citations: 22; h-index: 2)
Yihao Liu (Citations: 123; h-index: 2)
Junying Ma (Citations: 502; h-index: 3)
Zhi Lu (Citations: 1,641; h-index: 9)
Yuewen Cao (Citations: 137; h-index: 6)
Jianming Liang (Citations: 25; h-index: 2)
Shixiang Tang (Citations: 30; h-index: 2)
Qi Duan (Citations: 257; h-index: 7)
Dong Zhou (Citations: 25; h-index: 3)
Chen Jiang (Citations: 17; h-index: 1)
Yuyin Zhou (Citations: 96; h-index: 2)
Yanwu Xu (Citations: 11; h-index: 2)
Shao-Bing Zhang (Citations: 22; h-index: 3)
S. Luo (Citations: 12; h-index: 2)
Yi Xin (Citations: 275; h-index: 7)
Chao Liu (Citations: 9; h-index: 1)
Hao Wen (Citations: 11; h-index: 1)
Xin Chen (Citations: 9; h-index: 2)
A. Lozano (Citations: 5; h-index: 2)
Mingrui Sun (Citations: 0; h-index: 0)
Yuhui Zhang (Citations: 44; h-index: 4)
Yue Yao (Citations: 18; h-index: 3)
Xiao-Xiao Sun (Citations: 274; h-index: 9)
Xia Li (Citations: 19; h-index: 1)
Jing Ke (Citations: 81; h-index: 6)
Chunhui Zhang (Citations: 220; h-index: 9)
Zongyuan Ge (Citations: 110; h-index: 4)
M. Hu (Citations: 5; h-index: 1)
Jin Ye (Citations: 39; h-index: 4)
Zhifeng Li (Citations: 8; h-index: 1)
Yirong Chen (Citations: 15; h-index: 2)
Junjun He (Citations: 29; h-index: 3)
Yukun Zhou (Citations: 100; h-index: 5)

Original Abstract

Foundation models have demonstrated remarkable success across diverse domains and tasks, primarily owing to the growth of large-scale, diverse, and high-quality datasets. In medical imaging, however, curating and assembling such datasets is highly challenging because of the reliance on clinical expertise and strict ethical and privacy constraints. The result is a scarcity of large-scale unified medical datasets, which hinders the development of powerful medical foundation models. In this work, we present the largest survey to date of medical image datasets, covering over 1,000 open-access datasets with a systematic catalog of their modalities, tasks, anatomies, annotations, limitations, and potential for integration. Our analysis exposes a landscape that is modest in scale, fragmented across narrowly scoped tasks, and unevenly distributed across organs and modalities, which in turn limits the utility of existing medical image datasets for developing versatile and robust medical foundation models. To turn fragmentation into scale, we propose a metadata-driven fusion paradigm (MDFP) that integrates public datasets with shared modalities or tasks, thereby transforming multiple small data silos into larger, more coherent resources. Building on MDFP, we release an interactive discovery portal that enables end-to-end, automated medical image dataset integration, and we compile all surveyed datasets into a unified, structured table that summarizes their key characteristics and provides reference links, offering the community an accessible and comprehensive repository. By charting the current terrain and offering a principled path to dataset consolidation, our survey provides a practical roadmap for scaling medical imaging corpora, supporting faster data discovery, more principled dataset creation, and more capable medical foundation models.
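
The abstract describes MDFP only at a high level: public datasets that share a modality or task are pooled into larger resources. As a rough illustration of that idea, and not the authors' implementation, the Python sketch below groups metadata records by a shared field and reports the pooled size of each group. The `DatasetRecord` schema, its field names, the helper functions, and the catalog entries are all hypothetical and invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata entry; the fields are illustrative, not the paper's schema."""
    name: str
    modality: str
    task: str
    num_images: int


def group_by_key(records, key):
    """Group records that share the same value of `key` ('modality' or 'task')."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, key)].append(record)
    return dict(groups)


def fused_sizes(records, key):
    """Total image count of each group that pools at least two datasets."""
    return {
        value: sum(r.num_images for r in members)
        for value, members in group_by_key(records, key).items()
        if len(members) > 1
    }


# Made-up catalog entries for illustration only; not real datasets.
catalog = [
    DatasetRecord("toy_ct_a", "CT", "segmentation", 120),
    DatasetRecord("toy_ct_b", "CT", "classification", 300),
    DatasetRecord("toy_mr_a", "MRI", "segmentation", 80),
]

print(fused_sizes(catalog, "modality"))
print(fused_sizes(catalog, "task"))
```

Grouping on metadata alone is the simplest possible reading of "metadata-driven"; a real pipeline would presumably also have to reconcile label vocabularies, file formats, and licenses, none of which the abstract specifies.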

