2602.21657v1 Feb 25, 2026 cs.CV

Following the Diagnostic Trace: Visual Cognition-guided Cooperative Network for Chest X-Ray Diagnosis

Shaoxuan Wu (Citations: 56, h-index: 4)

Jingkun Chen (Citations: 30, h-index: 3)

Jun Feng (Citations: 30, h-index: 3)

Cong Shen (Citations: 387, h-index: 10)

Xiao Zhang (Citations: 21, h-index: 3)

Chong Ma (Citations: 1,734, h-index: 20)

Abstract

Computer-aided diagnosis (CAD) has significantly advanced automated chest X-ray diagnosis but remains isolated from clinical workflows and lacks reliable decision support and interpretability. Human-AI collaboration seeks to enhance the reliability of diagnostic models by integrating controllable radiologist behaviors. However, the absence of interactive tools seamlessly embedded within diagnostic routines impedes collaboration, while the semantic gap between radiologists' decision-making patterns and model representations further limits clinical adoption. To overcome these limitations, we propose a visual cognition-guided collaborative network (VCC-Net) to realize a cooperative diagnostic paradigm. VCC-Net centers on visual cognition (VC) and employs clinically compatible interfaces, such as eye tracking or a mouse, to capture radiologists' visual search traces and attention patterns during diagnosis. VCC-Net employs VC as a spatial cognition guide, learning hierarchical visual search strategies to localize diagnostically key regions. A cognition-graph co-editing module subsequently integrates radiologist VC with model inference to construct a disease-aware graph. The module captures dependencies among anatomical regions and aligns model representations with VC-driven features, mitigating radiologist bias and facilitating complementary, transparent decision-making. In experiments on the public datasets SIIM-ACR and EGD-CXR and the self-constructed TB-Mouse dataset, VCC-Net achieved classification accuracies of 88.40%, 85.05%, and 92.41%, respectively. The attention maps produced by VCC-Net exhibit strong concordance with radiologists' gaze distributions, demonstrating mutual reinforcement between radiologist and model inference. The code is available at https://github.com/IPMI-NWU/VCC-Net.
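
The abstract reports concordance between model attention maps and radiologists' gaze distributions but does not say how that concordance is measured. As a rough illustration only, the sketch below turns a list of fixations into a normalized heatmap and compares two maps with a Pearson correlation. The function names (`gaze_heatmap`, `pearson`), the Gaussian kernel, the `sigma` value, and the choice of Pearson correlation are all assumptions made for this example. They are not taken from the paper or from the VCC-Net repository.

```python
import math


def gaze_heatmap(fixations, height, width, sigma=1.5):
    """Build a normalized heatmap from (row, col, duration) fixations.

    Hypothetical helper: each fixation adds an isotropic Gaussian
    weighted by its duration; the map is scaled to sum to 1.
    """
    heat = [[0.0] * width for _ in range(height)]
    for row, col, duration in fixations:
        for r in range(height):
            for c in range(width):
                d2 = (r - row) ** 2 + (c - col) ** 2
                heat[r][c] += duration * math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(line) for line in heat)
    if total == 0:
        return heat  # no fixations: return the all-zero map
    return [[v / total for v in line] for line in heat]


def pearson(map_a, map_b):
    """Pearson correlation between two equally sized 2-D maps."""
    a = [v for line in map_a for v in line]
    b = [v for line in map_b for v in line]
    if len(a) != len(b):
        raise ValueError("maps must have the same shape")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant map
    return cov / math.sqrt(var_a * var_b)


# Toy data standing in for a radiologist's gaze and two model attention maps.
gaze = gaze_heatmap([(2, 2, 0.4), (5, 6, 0.6)], height=8, width=8)
attention_near = gaze_heatmap([(2, 3, 0.5), (5, 5, 0.5)], height=8, width=8)
attention_far = gaze_heatmap([(7, 0, 1.0)], height=8, width=8)

# An attention map focused near the fixations should correlate more strongly.
print(pearson(gaze, attention_near) > pearson(gaze, attention_far))
```

In this toy setup the attention maps are synthesized with the same helper purely for convenience. In practice they would come from the model, and a different similarity measure may be more appropriate.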
