2604.19060v1 Apr 21, 2026 cs.AI

Reinforcement Learning Improves LLM Accuracy and Reasoning in Disease Classification from Radiology Reports

Yi Lin, Yishu Wei, Adam E. Flanders, G. Shih, Yifan Peng

Abstract

Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
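The second stage described above can be illustrated with a minimal sketch of how a GRPO-style reward and group-relative advantage might be computed. The abstract states only that accuracy and format are optimized without reasoning supervision; the `<think>`/`<answer>` template, the reward weights (0.5 and 1.0), and the function names below are assumptions made for illustration, not the authors' implementation.

```python
import re
from statistics import mean, pstdev

# Hypothetical output template (an assumption): reasoning inside <think>,
# the predicted disease label inside <answer>.
TEMPLATE = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL
)


def reward(completion: str, gold_label: str) -> float:
    """Format reward (0 or 0.5) plus accuracy reward (0 or 1).

    The weights are assumed. The reasoning text itself is never scored,
    which mirrors "without reasoning supervision".
    """
    match = TEMPLATE.match(completion.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    predicted = match.group(2).strip().lower()
    return 0.5 + (1.0 if predicted == gold_label.lower() else 0.0)


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One sampling group: several completions for the same report (toy data).
group = [
    "<think>Left lower lobe consolidation is described.</think>"
    "<answer>pneumonia</answer>",
    "<think>No acute findings.</think><answer>normal</answer>",
    "pneumonia",  # correct label, but it ignores the required format
]
rewards = [reward(c, "pneumonia") for c in group]
advantages = group_relative_advantages(rewards)
```

In this toy group, the well-formed correct completion receives a positive advantage and the other two receive negative ones, so a policy update would push the model toward outputs that are both correctly formatted and correctly labeled.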
