2605.08019v1 May 08, 2026 cs.AI

Reason to Play: A Study of Behavioral and Brain-Activity Alignment Between Frontier Large Reasoning Models (LRMs) and Human Learners

Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners

Christopher Summerfield

Citations: 136

h-index: 7

Marcelo G. Mattar

Citations: 218

h-index: 3

Botos Csaba

Citations: 50

h-index: 4

Sreejan Kumar

Citations: 453

h-index: 9

Austin Tudor David Andrews

Citations: 1

h-index: 1

Laurence T Hunt

Citations: 9

h-index: 2

Josh Tenenbaum

Citations: 74

h-index: 6

Rui Ponte Costa

Citations: 23

h-index: 1

Momchil S. Tomov

Citations: 425

h-index: 9

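When faced with a new environment, humans quickly learn abstract knowledge and use it to act efficiently and intelligently. Can modern AI systems learn and plan in a similar way? This study addresses the question using a dataset of complex human gameplay recorded concurrently with functional brain imaging (fMRI). Participants learn novel video games that require rule discovery, hypothesis revision, and multi-step planning. We compare a range of frontier Large Reasoning Models (LRMs), model-based and model-free deep reinforcement learning agents, and a Bayesian theory-based agent, evaluating them jointly on their ability to play the games, their match to human learning behavior, and their ability to predict brain activity during the same task. Frontier LRMs most closely match human behavioral patterns during game discovery and predict brain activity across cortical and subcortical regions an order of magnitude more accurately than the reinforcement learning alternatives. These effects were verified with permutation tests. Targeted manipulations further show that brain alignment reflects the model's in-context representation of the game state rather than its downstream planning or reasoning. The results show that LRMs are compelling computational models of human learning and decision making in complex, naturalistic environments. Project page (with interactive replays): https://botcs.github.io/reason-to-play/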

Original Abstract

Humans rapidly learn abstract knowledge when encountering novel environments and flexibly deploy this knowledge to guide efficient and intelligent action. Can modern AI systems learn and plan in a similar way? We study this question using a dataset of complex human gameplay with concurrent fMRI recordings, in which participants learn novel video games that require rule discovery, hypothesis revision, and multi-step planning. We jointly evaluate models by their ability to play the games, match human learning behavior, and predict brain activity during the same task, comparing a suite of frontier Large Reasoning Models (LRMs) against model-free and model-based deep reinforcement learning agents and a Bayesian theory-based agent. We find that frontier LRMs most closely match human behavioral patterns during game discovery and predict brain activity an order of magnitude better than both reinforcement learning alternatives across cortical and subcortical regions, with effects robust to permutation controls. Through targeted manipulations, we further show that brain alignment reflects the model's in-context representation of the game state rather than its downstream planning or reasoning. Our results establish LRMs as compelling computational accounts of human learning and decision making in complex, naturalistic environments. Project page with interactive replays: https://botcs.github.io/reason-to-play/
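The abstract mentions predicting brain activity from model representations and checking the effect with permutation controls, but gives no implementation details. The following is only a minimal sketch of how such a check is commonly set up, not the authors' pipeline: the ridge-regression encoding model, the split-half evaluation, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression; returns predictions for X_test."""
    n_features = X_train.shape[1]
    # Solve (X^T X + alpha * I) w = X^T y for the weight vector w.
    w = np.linalg.solve(
        X_train.T @ X_train + alpha * np.eye(n_features),
        X_train.T @ y_train,
    )
    return X_test @ w

def encoding_score(features, signal, alpha=1.0):
    """Pearson r between held-out predictions and the measured signal.

    Hypothetical evaluation scheme: train on the first half of the
    time points, test on the second half.
    """
    half = len(signal) // 2
    pred = ridge_fit_predict(features[:half], signal[:half], features[half:], alpha)
    return float(np.corrcoef(pred, signal[half:])[0, 1])

def permutation_p_value(features, signal, n_perm=200, seed=0):
    """Compare the observed score with scores from shuffled feature rows.

    Shuffling rows breaks the feature-to-signal alignment. Real fMRI data
    are autocorrelated, so a block-wise or circular shift would usually be
    preferred; a plain shuffle is used here only to keep the sketch short.
    """
    rng = np.random.default_rng(seed)
    observed = encoding_score(features, signal)
    null = np.array([
        encoding_score(features[rng.permutation(len(signal))], signal)
        for _ in range(n_perm)
    ])
    # Add-one correction so the p-value is never exactly zero.
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p)

# Synthetic stand-ins for model features and one voxel's signal (not real data).
demo_rng = np.random.default_rng(42)
features = demo_rng.standard_normal((200, 5))
true_weights = np.array([1.0, -0.5, 0.0, 0.3, 0.0])
signal = features @ true_weights + 0.5 * demo_rng.standard_normal(200)

observed, p = permutation_p_value(features, signal)
print(f"held-out r = {observed:.2f}, permutation p = {p:.3f}")
```

Because the synthetic signal is built from the features, the held-out correlation should sit well above the shuffled null. With real recordings, the feature matrix would come from a model's internal representation of each game state and the signal from a voxel or region of interest.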

1 Citations

0 Influential

4.5 Altmetric

23.5 Score
