2602.07414v1 Feb 07, 2026 cs.AI

Can LLMs Truly Embody Human Personality? Analyzing AI and Human Behavior Alignment in Dispute Resolution

Deuksin Kwon, Kaleen Shrestha, Bin Han, Spencer Lin, James Hale, Jonathan Gratch, Gale M. Lucas, Maja Matarić

Abstract

Large language models (LLMs) are increasingly used to simulate human behavior in social settings such as legal mediation, negotiation, and dispute resolution. However, it remains unclear whether these simulations reproduce the personality-behavior patterns observed in humans. Human personality, for instance, shapes how individuals navigate social interactions, including strategic choices and behaviors in emotionally charged interactions. This raises the question: Can LLMs, when prompted with personality traits, reproduce personality-driven differences in human conflict behavior? To explore this, we introduce an evaluation framework that enables direct comparison of human-human and LLM-LLM behaviors in dispute resolution dialogues with respect to Big Five Inventory (BFI) personality traits. This framework provides a set of interpretable metrics related to strategic behavior and conflict outcomes. We additionally contribute a novel dataset creation methodology for LLM dispute resolution dialogues with matched scenarios and personality traits with respect to human conversations. Finally, we demonstrate the use of our evaluation framework with three contemporary closed-source LLMs and show significant divergences in how personality manifests in conflict across different LLMs compared to human data, challenging the assumption that personality-prompted agents can serve as reliable behavioral proxies in socially impactful applications. Our work highlights the need for psychological grounding and validation in AI simulations before real-world use.
