2601.13352v1 Jan 19, 2026 cs.CL

LLM-as-RNN: A Recurrent Language Model for Memory Updates and Sequence Prediction

J. B. Tamo

Citations: 43

h-index: 4

Weichen Zhao

Citations: 58

h-index: 4

Nan Sun

Citations: 17

h-index: 3

Yishan Zhong

Citations: 183

h-index: 7

Wenqi Shi

Citations: 679

h-index: 13

Jinzhuo Wang

Citations: 384

h-index: 9

May D. Wang

Citations: 192

h-index: 7

Yuxing Lu

Peking University

Citations: 751

h-index: 15

Original Abstract

Large language models are strong sequence predictors, yet standard inference relies on immutable context histories. After making an error at generation step t, the model lacks an updatable memory mechanism that improves predictions for step t+1. We propose LLM-as-RNN, an inference-only framework that turns a frozen LLM into a recurrent predictor by representing its hidden state as natural-language memory. This state, implemented as a structured system-prompt summary, is updated at each timestep via feedback-driven text rewrites, enabling learning without parameter updates. Under a fixed token budget, LLM-as-RNN corrects errors and retains task-relevant patterns, effectively performing online learning through language. We evaluate the method on three sequential benchmarks in healthcare, meteorology, and finance across Llama, Gemma, and GPT model families. LLM-as-RNN significantly outperforms zero-shot, full-history, and MemPrompt baselines, improving predictive accuracy by 6.5% on average, while producing interpretable, human-readable learning traces absent in standard context accumulation.
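The recurrence described in the abstract can be sketched as a predict, feedback, rewrite loop over a text memory. The following is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable interface, the prompt wording, the whitespace-based token budget, and the `toy_llm` stand-in are all hypothetical and chosen only so the loop runs without a real model.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping (system_prompt, user_prompt) -> text.
# A real setup would wrap a frozen LLM behind this signature.
LLM = Callable[[str, str], str]


def truncate_to_budget(text: str, max_tokens: int) -> str:
    """Keep the last `max_tokens` whitespace-separated tokens.

    Assumption: whitespace splitting is a crude stand-in for a real tokenizer.
    """
    tokens = text.split()
    return " ".join(tokens[max(0, len(tokens) - max_tokens):])


def run_recurrent(
    llm: LLM,
    inputs: List[str],
    targets: List[str],
    memory: str = "",
    max_tokens: int = 64,
) -> Tuple[List[str], List[str]]:
    """Run a predict -> feedback -> rewrite loop with a natural-language memory.

    Returns the predictions and the memory after each step (the learning trace).
    """
    predictions: List[str] = []
    trace: List[str] = []
    for x_t, y_t in zip(inputs, targets):
        # 1) Predict using only the current memory (the "hidden state") and the input.
        y_hat = llm(f"Memory:\n{memory}", f"Predict the outcome for: {x_t}")
        predictions.append(y_hat)

        # 2) Build textual feedback once the true outcome is observed.
        feedback = f"input={x_t}; predicted={y_hat}; actual={y_t}"

        # 3) Ask the same frozen model to rewrite the memory; no weights change.
        memory = llm(
            "Rewrite the memory to incorporate the feedback. Keep it concise.",
            f"Memory:\n{memory}\nFeedback:\n{feedback}",
        )

        # 4) Enforce the fixed token budget on the memory.
        memory = truncate_to_budget(memory, max_tokens)
        trace.append(memory)
    return predictions, trace


def toy_llm(system: str, user: str) -> str:
    """Deterministic stand-in for an LLM (hypothetical, for illustration only)."""
    if system.startswith("Rewrite"):
        # Rewrite call: keep only the latest observed outcome as the memory.
        actual = user.rsplit("actual=", 1)[1].strip()
        return f"last_outcome={actual}"
    # Prediction call: repeat the last outcome stored in memory, if any.
    if "last_outcome=" in system:
        return system.rsplit("last_outcome=", 1)[1].strip()
    return "unknown"
```

With `toy_llm`, calling `run_recurrent(toy_llm, ["day1", "day2", "day3"], ["rain", "rain", "sun"])` yields a first prediction of `"unknown"` and later predictions that reuse the most recent outcome stored in memory. The returned trace is the sequence of memory strings, which is the human-readable record of what changed at each step.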

Citations: 1

Influential citations: 0

Altmetric: 7.5

Score: 38.5
