2603.18507v1 Mar 19, 2026 cs.AI

Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM

Jesse Thomason

Citations: 36

h-index: 4

Zizhao Hu

Citations: 53

h-index: 4

Mohammad Rostami

Citations: 28

h-index: 4

Original Abstract

Persona prompting can steer LLM generation towards a domain-specific tone and pattern. This behavior enables use cases in multi-agent systems where diverse interactions are crucial and human-centered tasks require high-level human alignment. Prior works provide mixed opinions on their utility: some report performance gains when using expert personas for certain domains and their contribution to data diversity in synthetic data creation, while others find near-zero or negative impact on general utility. To fully leverage the benefits of the LLM persona and avoid its harmfulness, a more comprehensive investigation of the mechanism is crucial. In this work, we study how model optimization, task type, prompt length, and placement can impact expert persona effectiveness across instruction-tuned and reasoning LLMs, and provide insight into conditions under which expert personas fail and succeed. Based on our findings, we developed a pipeline to fully leverage the benefits of an expert persona, named PRISM (Persona Routing via Intent-based Self-Modeling), which self-distills an intent-conditioned expert persona into a gated LoRA adapter through a bootstrapping process that requires no external data, models, or knowledge. PRISM enhances human preference and safety alignment on generative tasks while maintaining accuracy on discriminative tasks across all models, with minimal memory and computing overhead.

4 Citations

0 Influential

2 Altmetric

14.0 Score
