2605.08060v1 May 08, 2026 cs.CL

The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents

Emanuel Tewolde (Citations: 245, h-index: 7)

Vincent Conitzer (Citations: 232, h-index: 7)

Jiayuan Liu (Citations: 12, h-index: 2)

Shiyi Du (Citations: 47, h-index: 2)

Tianqi Li (Citations: 22, h-index: 2)

Xinpeng Luo (Citations: 2, h-index: 1)

Hao Zeng (Citations: 9, h-index: 2)

Tai Sing Lee (Citations: 40, h-index: 3)

Tonghan Wang, Tsinghua University (Citations: 1,638, h-index: 15)

Carl Kingsford (Citations: 5, h-index: 1)

Xin Luo (Citations: 32, h-index: 4)

Original Abstract

Context window expansion is often treated as a straightforward capability upgrade for LLMs, but we find it systematically fails in multi-agent social dilemmas. Across 7 LLMs and 4 games over 500 rounds, expanding accessible history degrades cooperation in 18 of 28 model-game settings, a pattern we term the memory curse. We isolate the underlying mechanism through three analyses. First, lexical analysis of 378,000 reasoning traces associates this breakdown with eroding forward-looking intent rather than rising paranoia. We validate this using targeted fine-tuning as a cognitive probe: a LoRA adapter trained exclusively on forward-looking traces mitigates the decay and transfers zero-shot to distinct games. Second, memory sanitization holds prompt length fixed while replacing visible history with synthetic cooperative records, which restores cooperation substantially, proving the trigger is memory content, not length alone. Finally, ablating explicit Chain-of-Thought reasoning often reduces the collapse, showing that deliberation paradoxically amplifies the memory curse. Together, these results recast memory as an active determinant of multi-agent behavior: longer recall can either destabilize or support cooperation depending on the reasoning patterns it elicits.
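
The memory sanitization idea in the abstract (hold prompt length fixed, swap the visible history for synthetic cooperative records) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: the `Round` record, the `visible_history`, `sanitize`, and `render` helpers, the `"C"`/`"D"` action encoding, and the prompt line format are all hypothetical. The sketch keeps character length fixed; it does not guarantee equal token counts under any particular tokenizer.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Round:
    """One round of a two-player social dilemma (hypothetical record format)."""
    index: int
    my_action: str      # "C" = cooperate, "D" = defect
    other_action: str   # same encoding for the other agent


def visible_history(history: List[Round], window: int) -> List[Round]:
    """Return the most recent `window` rounds the agent is allowed to see."""
    # Guard window <= 0 explicitly: history[-0:] would return the full list.
    return history[-window:] if window > 0 else []


def sanitize(history: List[Round]) -> List[Round]:
    """Replace each visible round with a synthetic mutual-cooperation record.

    The number of rounds and their indices are preserved, so only the
    content of the memory changes, not its size.
    """
    return [Round(r.index, "C", "C") for r in history]


def render(history: List[Round]) -> str:
    """Render rounds as fixed-width prompt lines (assumed format)."""
    return "\n".join(
        f"Round {r.index:03d}: you={r.my_action} other={r.other_action}"
        for r in history
    )


if __name__ == "__main__":
    # Toy history with some defections mixed in.
    raw = [
        Round(i, "D" if i % 3 == 0 else "C", "D" if i % 2 == 0 else "C")
        for i in range(1, 11)
    ]
    seen = visible_history(raw, window=5)
    clean = sanitize(seen)
    print(render(seen))
    print("---")
    print(render(clean))
```

Comparing an agent's behavior on `render(seen)` against `render(clean)` is one way to separate the effect of what the history says from the effect of how long it is, which is the distinction the abstract draws.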

2 Citations

0 Influential

7.5 Altmetric

39.5 Score
