2604.15456v1 Apr 16, 2026 cs.AI

DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

Zhizheng Wang

Citations: 334

h-index: 8

Robert Leaman

Citations: 657

h-index: 13

Lauren J. He

Citations: 32

h-index: 4

Nicholas Wan

Citations: 49

h-index: 4

Joey Chan

Citations: 24

h-index: 2

Zhiyong Lu

Citations: 4,994

h-index: 12

Chih-Hsuan Wei

Citations: 6,060

h-index: 35

Chi-Ping Day

Citations: 69

h-index: 2

Chuanhui Wu

Citations: 25

h-index: 3

M.A. Knepper

Citations: 14

h-index: 2

Antolin Serrano Farias

Citations: 0

h-index: 0

Jordina Rincon-Torroella

Citations: 1,475

h-index: 19

Hasan Slika

Citations: 716

h-index: 11

Betty Tyler

Citations: 768

h-index: 10

R. Nguyen

Citations: 0

h-index: 0

Asmita Indurkar

Citations: 31

h-index: 4

Mélanie Hébert

Citations: 0

h-index: 0

Shubo Tian

Citations: 82

h-index: 4

N. Naffakh

Citations: 12

h-index: 2

Aseem Aseem

Citations: 4

h-index: 1

Emily Y. Chew

Citations: 240

h-index: 11

T. Keenan

Citations: 5,136

h-index: 40

Abstract

Trustworthiness and transparency are essential for the clinical adoption of artificial intelligence (AI) in healthcare and biomedical research. Recent deep research systems aim to accelerate evidence-grounded scientific discovery by integrating AI agents with multi-hop information retrieval, reasoning, and synthesis. However, most existing systems lack explicit and inspectable criteria for evidence appraisal, creating a risk of compounding errors and making it difficult for researchers and clinicians to assess the reliability of their outputs. In parallel, current benchmarking approaches rarely evaluate performance on complex, real-world medical questions. Here, we introduce DeepER-Med, a Deep Evidence-based Research framework for Medicine with an agentic AI system. DeepER-Med frames deep medical research as an explicit and inspectable workflow of evidence-based generation, consisting of three modules: research planning, agentic collaboration, and evidence synthesis. To support realistic evaluation, we also present DeepER-MedQA, an evidence-grounded dataset comprising 100 expert-level research questions derived from authentic medical research scenarios and curated by a multidisciplinary panel of 11 biomedical experts. Expert manual evaluation demonstrates that DeepER-Med consistently outperforms widely used production-grade platforms across multiple criteria, including the generation of novel scientific insights. We further demonstrate the practical utility of DeepER-Med through eight real-world clinical cases. Human clinician assessment indicates that DeepER-Med's conclusions align with clinical recommendations in seven cases, highlighting its potential for medical research and decision support.

0 Citations

0 Influential

20 Altmetric

100.0 Score
