2604.01594v1 Apr 02, 2026 cs.AI

Do Large Language Models Mentalize When They Teach?

Ilia Sucholutsky (Citations: 1,329; h-index: 13)

S. Harootonian (Citations: 114; h-index: 5)

Mark K. Ho (Citations: 141; h-index: 3)

Thomas L. Griffiths (Citations: 657; h-index: 11)

Yael Niv (Citations: 14; h-index: 2)

Abstract

How do LLMs decide what to teach next: by reasoning about a learner's knowledge, or by using simpler rules of thumb? We test this in a controlled task previously used to study human teaching strategies. On each trial, a teacher LLM sees a hypothetical learner's trajectory through a reward-annotated directed graph and must reveal a single edge so the learner would choose a better path if they replanned. We run a range of LLMs as simulated teachers and fit their trial-by-trial choices with the same cognitive models used for humans: a Bayes-Optimal teacher that infers which transitions the learner is missing (inverse planning), weaker Bayesian variants, heuristic baselines (e.g., reward based), and non-mentalizing utility models. In a baseline experiment matched to the stimuli presented to human subjects, most LLMs perform well, show little change in strategy over trials, and their graph-by-graph performance is similar to that of humans. Model comparison (BIC) shows that Bayes-Optimal teaching best explains most models' choices. When given a scaffolding intervention, models follow auxiliary inference- or reward-focused prompts, but these scaffolds do not reliably improve later teaching on heuristic-incongruent test graphs and can sometimes reduce performance. Overall, cognitive model fits provide insight into LLM tutoring policies and show that prompt compliance does not guarantee better teaching decisions.
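The single-edge teaching task described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the toy graph, the function names `best_path` and `choose_edge_to_reveal`, and above all the simplification that the teacher already knows which edges the learner has. The Bayes-Optimal teacher in the abstract instead infers the missing transitions from the learner's trajectory. The sketch also assumes the graph is acyclic and that the learner replans by taking the highest-reward path it knows.

```python
from functools import lru_cache

# Hypothetical toy graph: (source, target) -> reward collected on that edge.
TRUE_EDGES = {
    ("S", "A"): 1, ("A", "G"): 1,
    ("S", "B"): 0, ("B", "G"): 5,
    ("S", "C"): 2, ("C", "G"): 9,
}

# Assumed learner knowledge: only the edges the learner is aware of.
KNOWN_EDGES = {
    ("S", "A"): 1, ("A", "G"): 1,
    ("S", "B"): 0,
}


def best_path(edges, start, goal):
    """Return (total_reward, path) for the best path in a DAG, or None."""
    successors = {}
    for (u, v), reward in edges.items():
        successors.setdefault(u, []).append((v, reward))

    @lru_cache(maxsize=None)
    def solve(node):
        if node == goal:
            return (0, (goal,))
        best = None
        for nxt, reward in successors.get(node, []):
            sub = solve(nxt)
            if sub is None:
                continue  # dead end: goal not reachable through nxt
            candidate = (reward + sub[0], (node,) + sub[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        return best

    return solve(start)


def choose_edge_to_reveal(true_edges, known_edges, start, goal):
    """Pick the one unknown edge that most improves the learner's replanned path.

    Returns (edge, replanned_reward); edge is None if no reveal helps.
    """
    baseline = best_path(known_edges, start, goal)
    best_reward = baseline[0] if baseline else float("-inf")
    best_edge = None
    for edge, reward in true_edges.items():
        if edge in known_edges:
            continue
        # Simulate the learner replanning with this single edge added.
        replanned = best_path({**known_edges, edge: reward}, start, goal)
        if replanned and replanned[0] > best_reward:
            best_edge, best_reward = edge, replanned[0]
    return best_edge, best_reward


def highest_reward_unknown_edge(true_edges, known_edges):
    """Naive stand-in for a reward-based heuristic: reveal the largest unknown reward."""
    unknown = {e: r for e, r in true_edges.items() if e not in known_edges}
    return max(unknown, key=unknown.get)
```

In this toy case the learner's own plan is S, A, G. Revealing `("B", "G")` lets it replan through B, whereas the naive heuristic would reveal `("C", "G")`, which the learner cannot reach because it does not know `("S", "C")`. This is only a loose analogue of the heuristic-incongruent graphs mentioned in the abstract.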

0 Citations

0 Influential

6.5 Altmetric

32.5 Score
