2603.12071v1 Mar 12, 2026 cs.CV

LoV3D: Grounding Cognitive Prognosis Reasoning in Longitudinal 3D Brain MRI via Regional Volume Assessments

Zhizhong Fu (Citations: 13, h-index: 2)
David McAllister (Citations: 208, h-index: 4)
Honghan Wu (Citations: 4, h-index: 2)
Zhaoyang Jiang (Citations: 8, h-index: 2)
Y. Kim (Citations: 219, h-index: 7)

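Longitudinal brain MRI is essential for tracking the progression of neurological diseases such as Alzheimer's disease. Current deep-learning tools, however, handle this process in fragments: classifiers reduce an image to a label, volumetric pipelines produce uninterpreted measurements, and vision-language models (VLMs) can generate fluent but potentially hallucinated conclusions. This paper introduces LoV3D, a pipeline for training 3D vision-language models. LoV3D takes longitudinal T1-weighted brain MRI as input, performs a region-level anatomical assessment, compares the scan with the prior one, and then outputs a three-class diagnosis (Cognitively Normal, Mild Cognitive Impairment, or Dementia) together with a synthesized diagnostic summary. This stepwise pipeline grounds the final diagnosis through label consistency, longitudinal coherence, and biological plausibility, reducing the risk of hallucination. During training, a clinically weighted Verifier automatically scores candidate outputs against references derived from standardized volume metrics, which drives Direct Preference Optimization without human annotation. On the ADNI test set (479 scans, 258 subjects), LoV3D reaches 93.7% three-class diagnostic accuracy (+34.8% over the baseline), 97.2% two-class diagnostic accuracy (+4% over the best prior model), and 82.6% region-level anatomical classification accuracy (+33.1% over VLM baselines). In zero-shot transfer, it reaches 95.4% accuracy on MIRIAD (100% Dementia recall) and 82.9% three-class accuracy on AIBL, indicating strong generalization across sites, scanners, and populations. Code and related materials are available at https://github.com/Anonymous-TEVC/LoV-3D.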

Original Abstract

Longitudinal brain MRI is essential for characterizing the progression of neurological diseases such as Alzheimer's disease. However, current deep-learning tools fragment this process: classifiers reduce a scan to a label, volumetric pipelines produce uninterpreted measurements, and vision-language models (VLMs) may generate fluent but potentially hallucinated conclusions. We present LoV3D, a pipeline for training 3D vision-language models, which reads longitudinal T1-weighted brain MRI, produces a region-level anatomical assessment, conducts longitudinal comparison with the prior scan, and finally outputs a three-class diagnosis (Cognitively Normal, Mild Cognitive Impairment, or Dementia) along with a synthesized diagnostic summary. The stepped pipeline grounds the final diagnosis by enforcing label consistency, longitudinal coherence, and biological plausibility, thereby reducing the risk of hallucinations. The training process introduces a clinically weighted Verifier that scores candidate outputs automatically against normative references derived from standardized volume metrics, driving Direct Preference Optimization without a single human annotation. On a subject-level held-out ADNI test set (479 scans, 258 subjects), LoV3D achieves 93.7% three-class diagnostic accuracy (+34.8% over the no-grounding baseline), 97.2% two-class diagnostic accuracy (+4% over the SOTA), and 82.6% region-level anatomical classification accuracy (+33.1% over VLM baselines). Zero-shot transfer yields 95.4% on MIRIAD (100% Dementia recall) and 82.9% three-class accuracy on AIBL, confirming high generalizability across sites, scanners, and populations. Code is available at https://github.com/Anonymous-TEVC/LoV-3D.
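
The abstract does not spell out how the Verifier turns volume metrics into preference pairs, so the following is only a minimal sketch of the general idea, not the authors' implementation. The region names, normative means and standard deviations, clinical weights, z-score threshold, and all function names are hypothetical placeholders. The sketch derives reference labels from z-scores against normative values, scores each candidate assessment by weighted agreement with those labels, keeps the best and worst candidates as a chosen/rejected pair, and evaluates the standard per-pair DPO loss.

```python
import math

# Hypothetical normative references: region -> (mean, std) of a standardized volume metric.
NORMS = {"hippocampus": (3.5, 0.4), "entorhinal": (1.8, 0.3), "amygdala": (1.6, 0.2)}
# Hypothetical clinical weights: regions assumed more diagnostic count more.
WEIGHTS = {"hippocampus": 3.0, "entorhinal": 2.0, "amygdala": 1.0}


def reference_labels(volumes, z_threshold=-1.5):
    """Label each region 'reduced' or 'normal' from its z-score (assumed rule)."""
    labels = {}
    for region, value in volumes.items():
        mean, std = NORMS[region]
        z = (value - mean) / std
        labels[region] = "reduced" if z < z_threshold else "normal"
    return labels


def verifier_score(candidate, reference):
    """Weighted fraction of regions where the candidate matches the reference."""
    total = sum(WEIGHTS[r] for r in reference)
    agree = sum(WEIGHTS[r] for r in reference if candidate.get(r) == reference[r])
    return agree / total


def preference_pair(candidates, reference):
    """Return (chosen, rejected), or None if all candidates score the same."""
    ranked = sorted(candidates, key=lambda c: verifier_score(c, reference))
    worst, best = ranked[0], ranked[-1]
    if verifier_score(best, reference) == verifier_score(worst, reference):
        return None
    return best, worst


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss, -log(sigmoid(margin)), from sequence log-probabilities."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # Numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Toy example with made-up volumes and three made-up candidate assessments.
volumes = {"hippocampus": 2.7, "entorhinal": 1.7, "amygdala": 1.5}
reference = reference_labels(volumes)
candidates = [
    {"hippocampus": "normal", "entorhinal": "normal", "amygdala": "normal"},
    {"hippocampus": "reduced", "entorhinal": "normal", "amygdala": "normal"},
    {"hippocampus": "reduced", "entorhinal": "reduced", "amygdala": "normal"},
]
chosen, rejected = preference_pair(candidates, reference)
```

In this toy setup the pair is built without any human label: the ranking comes entirely from the measured volumes. In practice the log-probabilities passed to `dpo_loss` would come from the policy and a frozen reference model, which are outside the scope of this sketch.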

0 Citations

0 Influential

23.5 Altmetric

117.5 Score
