2603.01438v1 Mar 02, 2026 cs.CL

Enhancing Persona Following at Decoding Time via Dynamic Importance Estimation for Role-Playing Agents

Siyuan Liu (Citations: 25, h-index: 3)
Lei Zhang (Citations: 61, h-index: 4)
Yuxin Liu (Citations: 9, h-index: 2)
Ming Zhu (Citations: 554, h-index: 8)
Boxiang Hu (Citations: 0, h-index: 0)

Original Abstract

The utility of Role-Playing Language Agents in sociological research is growing alongside the adoption of Large Language Models. For social simulations to be realistic, these agents must adhere to the personas defined in their character profiles, yet existing strategies (static prompt engineering or costly fine-tuning) fail to adapt personas to dynamic scenarios. Psychological theories such as the Cognitive-Affective Personality System explain this failure: a persona's influence on behavior is not static but varies with the scenario. This context dependence calls for adaptive persona management. To address this gap, we propose a theory-driven method that dynamically estimates context-dependent persona importance and integrates it into weighted reward-guided decoding, enabling persona following at inference time. Specifically, we introduce the Persona Dynamic Decoding (PDD) framework, which consists of two key components: (1) the Persona Importance Estimation (PIE) module, which dynamically quantifies the contextual importance of persona attributes without requiring ground-truth supervision; and (2) the Persona-Guided Inference-Time Alignment (PIA) paradigm, which uses these importance scores to construct weighted multi-objective rewards and modulate generation probabilities during inference. Extensive experiments demonstrate the method's effectiveness in terms of utterance consistency and behavioral fidelity.
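
As a rough illustration of the idea of importance-weighted, reward-guided decoding, the sketch below reweights a toy next-token distribution. It is not the paper's implementation: the function name `reweight_next_token`, the softmax normalization of importance scores, the exponential tilting of probabilities, the `beta` strength parameter, and all numbers are assumptions made for this example, since the abstract does not specify how PIE scores or PIA rewards are computed.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_next_token(base_logprobs, attribute_rewards, importance_scores, beta=1.0):
    """Tilt a next-token distribution by an importance-weighted reward.

    base_logprobs:     token -> log-probability under the base model
    attribute_rewards: attribute -> {token -> reward} (hypothetical per-attribute scorers)
    importance_scores: attribute -> raw contextual importance (stand-in for PIE output)
    beta:              assumed strength of the reward term; beta=0 recovers the base model
    """
    attributes = list(importance_scores)
    # Assumption: raw importance scores are normalized into weights with a softmax.
    weights = dict(zip(attributes, softmax([importance_scores[a] for a in attributes])))

    tokens = list(base_logprobs)
    adjusted = []
    for token in tokens:
        # Weighted multi-objective reward: one term per persona attribute.
        reward = sum(weights[a] * attribute_rewards[a].get(token, 0.0) for a in attributes)
        adjusted.append(base_logprobs[token] + beta * reward)

    # Renormalize so the adjusted scores form a probability distribution.
    return dict(zip(tokens, softmax(adjusted)))


# Toy inputs (illustrative only).
base = {"sure": math.log(0.5), "no": math.log(0.3), "hmm": math.log(0.2)}
rewards = {
    "introverted": {"hmm": 1.0, "sure": -0.5},
    "polite": {"sure": 0.5, "no": -1.0},
}
importance = {"introverted": 2.0, "polite": 0.0}

adjusted_probs = reweight_next_token(base, rewards, importance, beta=1.0)
```

In this toy setup the "introverted" attribute is scored as more important in the current context, so the hesitant token "hmm" gains probability relative to the base distribution, while setting `beta=0` leaves the base distribution unchanged.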

0 Citations

0 Influential

4 Altmetric

20.0 Score
