2604.23255v1 Apr 25, 2026 cs.HC

Scalable LLM-based Coding of Dialogue in Healthcare Simulation: Balancing Coding Performance, Processing Time, and Environmental Impact

Vanessa Echeverría

Citations: 977

h-index: 14

Kiyoshige Garcés

Citations: 6

h-index: 1

Linxuan Zhao

Citations: 1,450

h-index: 16

S. Samaraweera

Citations: 4

h-index: 2

Dragan Gašević

Citations: 280

h-index: 6

Roberto Martínez-Maldonado

Citations: 1,764

h-index: 22

Gloria Milena Fernández Nieto

Citations: 94

h-index: 4

Original Abstract

Research shows that dialogue, the interactive process through which participants articulate their thinking, plays a central role in constructing shared understanding, coordinating action, and shaping learning outcomes in teams. Analysing dialogue content has been central to advancing team learning theory and informing the design of computer-supported collaborative learning environments, yet this progress has depended on labour-intensive qualitative coding. LLMs offer new possibilities for automating and enhancing the dialogue layer within emerging multimodal learning analytics approaches, with recent studies showing that they can approximate human coding through few-shot prompting. However, prior work has focused on replicating human coding accuracy for research purposes, rather than addressing a more educationally consequential question: how can we design prompts that allow an LLM to label team dialogue accurately and fast enough to be useful in real settings, such as in-person healthcare simulations, where results must be returned quickly and computational cost and sustainability also matter? This paper investigates how prompt design and batching strategies can be optimised to balance coding accuracy, processing time, and environmental impact in team-based healthcare simulation debriefing. Using a dataset of 11,647 utterances coded across 6 dialogue constructs, we compared 4 prompt designs across varying batch sizes, evaluating coding performance, processing time, and energy consumption, as well as the trade-offs between these metrics. Results indicate that increasing batch size improves speed and reduces energy use, but negatively impacts coding performance. Beyond demonstrating the feasibility of LLM-based qualitative analysis, this study offers practical guidance for scaling dialogue analytics in contexts where timeliness, privacy, and sustainability are critical.
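The batching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`make_batches`, `build_prompt`, `count_requests`), the prompt wording, and the placeholder construct names are all assumptions made for illustration, since the abstract does not name the six dialogue constructs or give the prompt text. The sketch only shows how grouping utterances into one prompt reduces the number of LLM requests, which is the mechanism behind the reported speed and energy gains.

```python
from math import ceil
from typing import Iterator, List


def make_batches(utterances: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` utterances."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(utterances), batch_size):
        yield utterances[start:start + batch_size]


def build_prompt(batch: List[str], constructs: List[str]) -> str:
    """Assemble one hypothetical coding prompt covering a whole batch.

    The wording is illustrative only; the paper's actual prompt
    designs are not given in the abstract.
    """
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(batch, start=1))
    return (
        "Label each utterance with one or more of these constructs: "
        + ", ".join(constructs)
        + "\n\nUtterances:\n"
        + numbered
    )


def count_requests(n_utterances: int, batch_size: int) -> int:
    """Number of LLM requests needed to code every utterance once."""
    return ceil(n_utterances / batch_size)


# Placeholder names: the abstract reports 6 constructs but does not list them.
CONSTRUCTS = [f"construct_{i}" for i in range(1, 7)]

if __name__ == "__main__":
    sample = ["Can you check the airway?", "Airway is clear.", "Starting compressions."]
    for batch in make_batches(sample, batch_size=2):
        print(build_prompt(batch, CONSTRUCTS))
        print("---")
    print(count_requests(11_647, 10))
```

Under these assumptions, coding the 11,647 utterances one at a time takes 11,647 requests, while a batch size of 10 takes 1,165. The sketch says nothing about the accuracy cost of larger batches, which the abstract reports as the other side of the trade-off and which can only be measured against human-coded labels.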
