2605.05706v1 May 07, 2026 cs.AI

Resolving the bias-precision paradox with stochastic causal representation learning for personalized medicine

K. Ngiam (Citations: 4,037; h-index: 27)

Ching-Yu Cheng (Citations: 49; h-index: 3)

Nan Liu (Citations: 83; h-index: 4)

Pei Zhang (Citations: 1; h-index: 1)

Manqiang Peng (Citations: 50; h-index: 5)

Yuxuan Wu (Citations: 9; h-index: 2)

P. Phadungsaksawasdi (Citations: 12; h-index: 1)

Wesley Yeung (Citations: 20; h-index: 2)

Trang Nguyen (Citations: 31; h-index: 3)

Qiang Zhang (Citations: 8; h-index: 2)

Meng Wang (Citations: 59; h-index: 3)

Y. Tham (Citations: 23,914; h-index: 56)

R. Ke (Citations: 87; h-index: 3)

Wenzhuo Yang (Citations: 78; h-index: 4)

Zheng Lu (Citations: 210; h-index: 4)

Yu Zhang (Citations: 2; h-index: 1)

Sheng Zhong (Citations: 29; h-index: 3)

Hao Deng (Citations: 65; h-index: 5)

Dianbo Liu (Citations: 67; h-index: 3)

Ye-Wang Zhang (Citations: 167; h-index: 7)

Qingyun Chen (Citations: 15; h-index: 3)

Changlan Li (Citations: 9; h-index: 1)

Chun-Huang Lai (Citations: 2; h-index: 1)

Tianfan Fu, Nanjing University (Citations: 4,989; h-index: 24)

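Estimating individualized treatment effects from longitudinal observational data is at the core of data-driven medicine, but existing methods face a fundamental limitation: reducing confounding bias suppresses clinically important heterogeneity, which degrades patient-level predictive performance. This study characterizes this phenomenon as a bias-precision paradox in causal representation learning and proposes sampling-based maximum mean discrepancy (sMMD), a stochastic alignment strategy that replaces global adversarial balancing with subset-level matching. The method is implemented in a counterfactual outcome prediction framework and provides attribution-based interpretation for explainability. In experiments on two large-scale ICU cohorts (n = 27,783), the proposed framework improved accuracy under distribution shift, reduced error by up to 11.5%, and substantially increased recall in high-risk tasks. Mechanistic analysis showed that sMMD selectively preserves clinically important variables. In human-AI evaluation, the method outperformed clinical trainees and large language models, improved clinician accuracy by 14.7%, and shortened decision time, enabling interpretable, real-time clinical decision support.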

Original Abstract

Estimating individualized treatment effects from longitudinal observational data is central to data-driven medicine, yet existing methods face a fundamental limitation: reducing confounding bias often suppresses clinically informative heterogeneity, degrading patient-specific predictions. Here, we identify this tension as a bias-precision paradox in causal representation learning and introduce sampling-based maximum mean discrepancy (sMMD), a stochastic alignment strategy that replaces global adversarial balancing with subset-level matching. We instantiate this approach in a framework for counterfactual outcome prediction with attribution-grounded interpretability. Across two large-scale ICU cohorts (n = 27,783), our framework improves accuracy under distribution shift, reducing error by up to 11.5% and substantially increasing recall in high-risk tasks. Mechanistic analyses show that sMMD selectively preserves clinically decisive variables. In human-AI evaluation, our method outperforms clinicians-in-training and large language models, and improves clinician accuracy by 14.7% while reducing decision time, enabling interpretable, real-time clinical decision support.
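The abstract describes sMMD only at a high level: a maximum mean discrepancy computed through subset-level matching rather than one global balancing term. As a rough illustration of that idea, and not the authors' implementation, the sketch below averages a standard biased MMD² estimate over randomly sampled subsets of treated and control representations. The RBF kernel, the subset size, the number of draws, and all function names (`rbf_kernel`, `mmd2`, `sampled_mmd2`) are assumptions introduced here; the abstract does not specify any of them, nor how the term enters the training loss.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


def sampled_mmd2(treated, control, subset_size=32, n_draws=8,
                 bandwidth=1.0, seed=0):
    """Average MMD^2 over random subsets of each group.

    Hypothetical stand-in for 'subset-level matching': each draw compares
    a random subset of treated rows with a random subset of control rows.
    """
    rng = np.random.default_rng(seed)
    m_t = min(subset_size, len(treated))
    m_c = min(subset_size, len(control))
    values = []
    for _ in range(n_draws):
        t_idx = rng.choice(len(treated), size=m_t, replace=False)
        c_idx = rng.choice(len(control), size=m_c, replace=False)
        values.append(mmd2(treated[t_idx], control[c_idx], bandwidth))
    return float(np.mean(values))
```

Under these assumptions, calling `sampled_mmd2(treated_repr, control_repr)` on two arrays of shape `(n_patients, n_features)` returns a scalar that grows as the two groups' representations drift apart, so it could serve as a penalty term in a balancing objective.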

0 Citations

0 Influential

28 Altmetric

140.0 Score
