2605.28025v1 May 27, 2026 cs.AI

MIRA: A Bilingual Benchmark for Medical Information Response Audit

C. Gao
Qianqian Wang
Xiwei Dai
Weiyi Wu
Qiaoxin Yang
Mengyun Xu

Large language models (LLMs) are increasingly used to provide public-facing health information, yet existing safety evaluations overlook whether responses preserve comparable medical information across different user phrasings of the same question. To address this, we introduce the Medical Information Response Audit (MIRA), a bilingual, controlled benchmark that assesses whether LLMs provide comparable medical information across user-side language, register, and health literacy signals. MIRA contains 4,320 prompts built from 60 medically reviewed, low-risk health questions. Across five mainstream LLMs, models answered all medical questions, but responses to low health-literacy signals consistently omitted more key information, provided fewer concrete next steps, and offered less support for independent judgment. We term this pattern Differential Information Dilution (DID). Language effects are model-specific rather than uniformly worse for non-English prompts. A comparison with 300 real-world health queries provides preliminary evidence of rank-order validity. A knowledge-guided mitigation prompt reduces information dilution for most models, with the largest reductions in underinformative simplification observed for Claude (~8%) and Qwen (~6%).
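The kind of comparison the abstract describes can be illustrated with a toy calculation. The following is a minimal sketch, not the benchmark's actual scoring method: the `coverage` and `dilution_gap` functions, the substring matching, the key-point checklist, and both example responses are hypothetical stand-ins invented for illustration. Only the counts 4,320 and 60 come from the abstract.

```python
# Toy illustration of an information-dilution gap between two phrasings of
# the same question. Everything here is a hypothetical stand-in: the paper's
# real rubric, prompts, and scoring procedure are not given in the abstract.

TOTAL_PROMPTS = 4320   # from the abstract
BASE_QUESTIONS = 60    # from the abstract

# Implied number of prompt variants per base question.
variants_per_question = TOTAL_PROMPTS // BASE_QUESTIONS


def coverage(response: str, key_points: list[str]) -> float:
    """Fraction of key points that appear in the response.

    Uses case-insensitive substring matching, which is a crude assumption
    made for this sketch only.
    """
    if not key_points:
        raise ValueError("key_points must not be empty")
    text = response.lower()
    hits = sum(1 for point in key_points if point.lower() in text)
    return hits / len(key_points)


def dilution_gap(baseline: str, low_literacy: str, key_points: list[str]) -> float:
    """Coverage lost when the same question carries a low health-literacy signal.

    Positive values mean the low-literacy response omitted more key points.
    """
    return coverage(baseline, key_points) - coverage(low_literacy, key_points)


# Invented example data, not taken from the benchmark.
key_points = ["rest", "fluids", "see a doctor"]
baseline_response = (
    "Get rest, drink fluids, and see a doctor if the fever lasts "
    "more than three days."
)
low_literacy_response = "Just get some rest and you will be fine."

gap = dilution_gap(baseline_response, low_literacy_response, key_points)
```

In this toy case the baseline response covers all three key points and the low-literacy response covers only one, so the gap is positive. A real audit would need a validated rubric rather than substring matching, which misses paraphrases.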

