2603.09892v1 Mar 10, 2026 cs.LG

MSSR: Memory-Aware Adaptive Replay for Continual LLM Fine-Tuning

Yujia Liu

Citations: 0

h-index: 0

Hongyuan Zha

Citations: 98

h-index: 4

Jianlong Chen

Citations: 18

h-index: 3

Yiyang Lu

Citations: 5

h-index: 2

Original Abstract

Continual fine-tuning of large language models (LLMs) is becoming increasingly crucial as these models are deployed in dynamic environments where tasks and data distributions evolve over time. While strong adaptability enables rapid acquisition of new knowledge, it also exposes LLMs to catastrophic forgetting, where previously learned skills degrade during sequential training. Existing replay-based strategies, such as fixed interleaved replay, accuracy-supervised, and loss-driven scheduling, remain limited: some depend on heuristic rules and provide only partial mitigation of forgetting, while others improve performance but incur substantial computational overhead. Motivated by retention dynamics under sequential fine-tuning, we propose Memory-Inspired Sampler and Scheduler Replay (MSSR), an experience replay framework that estimates sample-level memory strength and schedules rehearsal at adaptive intervals to mitigate catastrophic forgetting while maintaining fast adaptation. Extensive experiments across three backbone models and 11 sequential tasks show that MSSR consistently outperforms state-of-the-art replay baselines, with particularly strong gains on reasoning-intensive and multiple-choice benchmarks.
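The abstract describes two components: a per-sample estimate of memory strength and a scheduler that rehearses samples at adaptive intervals. The sketch below is only an illustration of that general idea, not the authors' method. It assumes an exponential forgetting curve, a fixed retention threshold, and a multiplicative strength update after each rehearsal; none of these are stated in the abstract, and every name (`MemoryItem`, `retention`, `next_interval`, `select_for_replay`, `rehearse`) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    """One replay-buffer entry (hypothetical structure, not from the paper)."""
    sample_id: str
    strength: float = 1.0  # assumed memory strength; larger means slower forgetting
    last_seen: int = 0     # training step at which the sample was last rehearsed


def retention(item: MemoryItem, step: int) -> float:
    """Assumed exponential forgetting curve: exp(-elapsed / strength)."""
    elapsed = step - item.last_seen
    return math.exp(-elapsed / item.strength)


def next_interval(item: MemoryItem, threshold: float = 0.5) -> float:
    """Steps until retention decays to `threshold` under the assumed curve."""
    return item.strength * math.log(1.0 / threshold)


def select_for_replay(buffer, step, threshold=0.5, k=2):
    """Return up to k items whose retention is below threshold, weakest first."""
    due = [it for it in buffer if retention(it, step) < threshold]
    due.sort(key=lambda it: retention(it, step))
    return due[:k]


def rehearse(item: MemoryItem, step: int, growth: float = 2.0) -> None:
    """Assumed update: rehearsal multiplies strength and resets the clock."""
    item.strength *= growth
    item.last_seen = step


if __name__ == "__main__":
    buffer = [MemoryItem("a", strength=1.0), MemoryItem("b", strength=10.0)]
    for step in (2, 3, 10):
        due = select_for_replay(buffer, step)
        print(step, [it.sample_id for it in due])
        for it in due:
            rehearse(it, step)
```

Under these assumptions the interval between rehearsals grows with each successful replay, so frequently rehearsed samples consume less of the replay budget over time. A real system would have to estimate strength from model signals such as loss or accuracy, which this sketch leaves out.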

3 Citations

0 Influential

2 Altmetric

13.0 Score
