2602.05115v1 Feb 04, 2026 cs.AI

SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers

Tal August (Allen Institute for AI): 1,297 citations, h-index 18
Keyang Xuan: 98 citations, h-index 4
Jiaxuan You: 117 citations, h-index 6
Pengda Wang: 70 citations, h-index 3
Chongrui Ye: 3 citations, h-index 1
Haofei Yu: 136 citations, h-index 7

Abstract

Large language models (LLMs) are increasingly evaluated in interactive environments to test their social intelligence. However, existing benchmarks often assume idealized communication between agents, which limits our ability to diagnose whether LLMs can maintain and repair interactions in more realistic, imperfect settings. To close this gap, we present SocialVeil, a social learning environment that simulates social interaction under communication barriers induced by cognitive differences. Grounded in a systematic literature review of communication challenges in human interaction, SocialVeil introduces three representative types of disruption: semantic vagueness, sociocultural mismatch, and emotional interference. We also introduce two barrier-aware evaluation metrics, unresolved confusion and mutual understanding, to evaluate interaction quality under impaired communication. Experiments across 720 scenarios and four frontier LLMs show that barriers consistently impair performance: on average, mutual understanding drops by over 45% and confusion rises by nearly 50%. Human evaluations validate the fidelity of these simulated barriers (ICC ≈ 0.78, Pearson r ≈ 0.80). We further show that adaptation strategies (Repair Instruction and Interactive Learning) yield only modest gains, leaving performance far below the barrier-free level. This work takes a step toward bringing social interaction environments closer to real-world communication, opening opportunities for exploring the social intelligence of LLM agents.
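The abstract reports two kinds of numbers: percentage changes in the metrics and agreement statistics from the human evaluation. The sketch below shows one plausible way such figures could be computed. It assumes the percentages are relative changes against a barrier-free baseline, which the abstract does not state explicitly. The function names `relative_change` and `pearson_r` are hypothetical, the ratings are made-up illustrative values rather than data from the paper, and ICC is not covered here.

```python
import math


def relative_change(baseline: float, with_barrier: float) -> float:
    """Signed percentage change from the baseline to the barrier condition.

    Assumption: the abstract's percentages are relative to a barrier-free
    baseline; the paper may define them differently.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (with_barrier - baseline) / baseline


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up illustrative scores, NOT results from the paper.
    human_ratings = [1.0, 2.0, 2.0, 4.0, 5.0]
    automatic_scores = [1.5, 1.5, 3.0, 3.5, 5.0]
    print(f"Pearson r = {pearson_r(human_ratings, automatic_scores):.2f}")
    print(f"Relative change = {relative_change(4.0, 2.0):.1f}%")
```

A negative value from `relative_change` indicates a drop (as reported for mutual understanding), and a positive value indicates an increase (as reported for confusion).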
