2605.26463v1 May 26, 2026 cs.CL

Towards Error-Free EHRs: Reasoning-Intensive Consistency Verification Between Clinical Notes and Structured Tables in Electronic Health Records

P. Rabaey (Citations: 61, h-index: 4)
Jiho Kim (Citations: 310, h-index: 9)
Yeonsu Kwon (Citations: 320, h-index: 5)
Jun-Min Lee (Citations: 4, h-index: 1)
Edward Choi (Citations: 8, h-index: 2)
Junseong Choi (Citations: 11, h-index: 2)
Sujeong Im (Citations: 108, h-index: 3)
Sangji Lee (Citations: 4, h-index: 1)
Hyunwook Kwon (Citations: 11, h-index: 1)
J. Kim (Citations: 448, h-index: 11)
Minseo Kim (Citations: 11, h-index: 1)
Jeewon Yang (Citations: 92, h-index: 3)
H. Yoon (Citations: 0, h-index: 0)

Data consistency between unstructured clinical notes and structured tables in Electronic Health Records (EHRs) is essential for patient safety and clinical decision-making. However, existing work on note-table consistency verification mainly relies on surface-level matching of numeric values or simple events. Such approaches fail to capture the reasoning underlying real-world EHR documentation, including clinical interpretation, event relations, and temporal changes. To address this gap, we introduce EHR-ReasonCon, a reasoning-intensive benchmark for note-table consistency verification. Built on MIMIC-III with expert-guided annotations, it comprises 8,048 entities derived from clinical notes and provides high-quality ground-truth labels. The annotation protocol is supported by specialized table-exploration tools to ensure systematic evidence retrieval and reliable consistency assessment. We also propose EHR-Inspector, an LLM-based framework that segments notes, extracts anchor entities and temporal references, and uses table-exploration tools to verify consistency against structured tables. Evaluated using expert-validated LLM-as-a-judge metrics under harsh and lenient criteria, EHR-Inspector achieves state-of-the-art performance across multiple model backbones. Analyses further demonstrate the effectiveness of its components and highlight differences from human verification.
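The abstract describes EHR-Inspector only at a high level (segment the note, extract anchor entities and temporal references, query the tables, judge consistency). The following is a minimal Python sketch of that control flow, not the authors' implementation. Everything in it is an assumption made for illustration: the `TableRow` schema, the toy `LAB_TABLE`, the regex-based extraction, the tolerance `tol`, and the three labels. In the framework as described, extraction and judgment are performed by an LLM with table-exploration tools; the regex here merely stands in for that step and amounts to the kind of surface-level matching the paper argues is insufficient.

```python
import re
from dataclasses import dataclass


@dataclass
class TableRow:
    item: str     # hypothetical anchor entity name, e.g. "potassium"
    day: int      # hypothetical temporal reference: hospital day
    value: float  # recorded value


# Toy structured table (assumption; real MIMIC-III tables use a different schema).
LAB_TABLE = [
    TableRow("potassium", 1, 3.1),
    TableRow("potassium", 2, 4.0),
    TableRow("creatinine", 1, 1.2),
]

# Stand-in for LLM extraction: matches claims like "potassium was 3.1 on day 1".
CLAIM = re.compile(
    r"(?P<item>[a-z]+) (?:was|of) (?P<value>\d+(?:\.\d+)?) on day (?P<day>\d+)"
)


def segment(note: str) -> list[str]:
    """Split a note into sentence-like segments at a period followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=\.)\s+", note) if s.strip()]


def lookup(item: str, day: int) -> list[TableRow]:
    """Hypothetical table-exploration tool: rows matching entity and day."""
    return [r for r in LAB_TABLE if r.item == item and r.day == day]


def verify(note: str, tol: float = 0.05) -> list[tuple[str, str]]:
    """Label each segment that contains a claim; segments without one are skipped."""
    results = []
    for seg in segment(note):
        m = CLAIM.search(seg.lower())
        if m is None:
            continue
        rows = lookup(m["item"], int(m["day"]))
        if not rows:
            label = "not_found"
        elif any(abs(r.value - float(m["value"])) <= tol for r in rows):
            label = "consistent"
        else:
            label = "inconsistent"
        results.append((seg, label))
    return results


note = (
    "Potassium was 3.1 on day 1. Potassium was 4.5 on day 2. "
    "Patient resting comfortably. Sodium was 140 on day 1."
)
for seg, label in verify(note):
    print(f"{label}: {seg}")
```

The sketch separates retrieval (`lookup`) from judgment (the comparison in `verify`) so that either piece could be swapped for a model call. Claims that need clinical interpretation, event relations, or reasoning over temporal change fall outside what this toy comparison can handle.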
