2602.08563v1 Feb 09, 2026 cs.LG

무상태이지만 기억력이 있는: LLM에서 암묵적 기억이 숨겨진 통로로 작용하는 현상

Stateless Yet Not Forgetful: Implicit Memory as a Hidden Channel in LLMs

Sahar Abdelnabi

Citations: 152

h-index: 5

Ahmed Salem

Citations: 47

h-index: 3

A. Paverd

Citations: 221

h-index: 6

대규모 언어 모델(LLM)은 일반적으로 무상태로 취급됩니다. 즉, 상호 작용이 종료되면 명시적으로 저장되고 다시 제공되지 않는 한 정보가 지속되지 않는다고 가정합니다. 본 연구에서는 이러한 가정에 도전하며, 모델이 독립적인 상호 작용 간에 상태를 유지하는 능력, 즉 '암묵적 기억'을 소개합니다. 암묵적 기억은 모델이 자신의 출력에 정보를 인코딩하고, 나중에 해당 출력이 입력으로 다시 제공될 때 이를 복구하는 메커니즘을 통해 작동합니다. 이 메커니즘은 명시적인 메모리 모듈을 필요로 하지 않지만, 추론 요청 간에 지속적인 정보 채널을 생성합니다. 구체적인 예시로, '타임 бомб(time bombs)'이라는 새로운 유형의 시간 기반 백도어를 소개합니다. 기존의 백도어는 단일 트리거 입력에 의해 활성화되는 반면, 타임 бомб는 암묵적 기억을 통해 축적된 숨겨진 조건이 충족된 후에만 활성화됩니다. 우리는 간단한 프롬프트 또는 미세 조정만으로도 이러한 동작을 유도할 수 있음을 보여줍니다. 이 사례 연구 외에도, 암묵적 기억의 더 광범위한 함의를 분석합니다. 여기에는 은밀한 에이전트 간 통신, 벤치마크 오염, 표적 조작, 그리고 학습 데이터 오염 등이 포함됩니다. 마지막으로, 탐지 과제를 논의하고 스트레스 테스트 및 평가를 위한 방향을 제시하여 미래의 발전을 예측하고 제어하는 것을 목표로 합니다. 향후 연구를 촉진하기 위해, 코드와 데이터를 다음 주소에서 공개합니다: https://github.com/microsoft/implicitMemory.

Original Abstract

Large language models (LLMs) are commonly treated as stateless: once an interaction ends, no information is assumed to persist unless it is explicitly stored and re-supplied. We challenge this assumption by introducing implicit memory-the ability of a model to carry state across otherwise independent interactions by encoding information in its own outputs and later recovering it when those outputs are reintroduced as input. This mechanism does not require any explicit memory module, yet it creates a persistent information channel across inference requests. As a concrete demonstration, we introduce a new class of temporal backdoors, which we call time bombs. Unlike conventional backdoors that activate on a single trigger input, time bombs activate only after a sequence of interactions satisfies hidden conditions accumulated via implicit memory. We show that such behavior can be induced today through straightforward prompting or fine-tuning. Beyond this case study, we analyze broader implications of implicit memory, including covert inter-agent communication, benchmark contamination, targeted manipulation, and training-data poisoning. Finally, we discuss detection challenges and outline directions for stress-testing and evaluation, with the goal of anticipating and controlling future developments. To promote future research, we release code and data at: https://github.com/microsoft/implicitMemory.

1 Citations

0 Influential

37.97866136777 Altmetric

190.9 Score

Original PDF

No Analysis Report Yet

This paper hasn't been analyzed by Gemini yet.

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!