2601.21682v1 Jan 29, 2026 cs.CL

FIT: Overcoming Catastrophic Forgetting in Continual LLM Unlearning

FIT: Defying Catastrophic Forgetting in Continual LLM Unlearning

XiaoYu Xu (Citations: 98, h-index: 3)

Minxin Du (Citations: 200, h-index: 4)

Zi Liang (Citations: 62, h-index: 4)

Qingqing Ye (Citations: 1,346, h-index: 18)

Kun Fang (Citations: 0, h-index: 0)

Yaxin Xiao (Citations: 52, h-index: 4)

Zhicong Huang (Citations: 159, h-index: 5)

Cheng Hong (Citations: 22, h-index: 3)

Haibo Hu (Citations: 611, h-index: 13)

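Large language models (LLMs) perform well across a wide range of tasks but raise concerns about privacy, copyright, and harmful content. Existing LLM unlearning methods do not adequately account for the continual, high-volume deletion requests that arise in practice, so model utility can degrade and catastrophic forgetting can set in as requests accumulate. To address this, we propose FIT, a continual unlearning framework that handles large numbers of deletion requests while remaining robust against both catastrophic forgetting and post-unlearning recovery attempts. FIT mitigates degradation through rigorous data filtering, importance-aware updates, and targeted layer attribution, keeping performance stable over long sequences of unlearning operations and balancing forgetting effectiveness against utility retention. To support realistic evaluation, we present PCH, a benchmark covering sequential deletion scenarios that involve personal information, copyright, and harmful content, together with two symmetric metrics, Forget Degree (F.D.) and Retain Utility (R.U.), which jointly assess forgetting quality and utility preservation. Extensive experiments on four open-source LLMs show that FIT achieves the best trade-off between F.D. and R.U., outperforms existing methods on MMLU, CommonsenseQA, and GSM8K, and remains resistant to relearning and quantization recovery attacks.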

Original Abstract

Large language models (LLMs) demonstrate impressive capabilities across diverse tasks but raise concerns about privacy, copyright, and harmful materials. Existing LLM unlearning methods rarely consider the continual and high-volume nature of real-world deletion requests, which can cause utility degradation and catastrophic forgetting as requests accumulate. To address this challenge, we introduce FIT, a framework for continual unlearning that handles large numbers of deletion requests while maintaining robustness against both catastrophic forgetting and post-unlearning recovery. FIT mitigates degradation through rigorous data Filtering, Importance-aware updates, and Targeted layer attribution, enabling stable performance across long sequences of unlearning operations and achieving a favorable balance between forgetting effectiveness and utility retention. To support realistic evaluation, we present PCH, a benchmark covering Personal information, Copyright, and Harmful content in sequential deletion scenarios, along with two symmetric metrics, Forget Degree (F.D.) and Retain Utility (R.U.), which jointly assess forgetting quality and utility preservation. Extensive experiments on four open-source LLMs with hundreds of deletion requests show that FIT achieves the strongest trade-off between F.D. and R.U., surpasses existing methods on MMLU, CommonsenseQA, and GSM8K, and remains resistant against both relearning and quantization recovery attacks.
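
The abstract names data filtering and importance-aware updates but does not say how they are computed. The sketch below is a generic illustration of those two ideas, not the paper's algorithm: the function names, the token-level Jaccard filter, the 0.8 threshold, and the linear damping rule are all assumptions made for this example.

```python
from typing import List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def filter_requests(requests: Sequence[str], seen: List[str],
                    threshold: float = 0.8) -> List[str]:
    """Hypothetical filter: drop requests that nearly duplicate earlier ones.

    `seen` holds requests already processed in previous rounds.
    """
    kept: List[str] = []
    for req in requests:
        if all(jaccard(req, old) < threshold for old in seen + kept):
            kept.append(req)
    return kept


def importance_aware_step(params: Sequence[float], grads: Sequence[float],
                          importance: Sequence[float],
                          lr: float = 0.1) -> List[float]:
    """Hypothetical update: damp each step by the parameter's importance.

    `importance` is assumed to be a non-negative score per parameter that
    estimates how much retained capabilities depend on it. The most
    important parameter is frozen; unimportant ones take the full step.
    """
    top = max(importance) or 1.0  # avoid division by zero when all scores are 0
    return [p - lr * (1.0 - imp / top) * g
            for p, g, imp in zip(params, grads, importance)]
```

In a continual setting, a loop would call `filter_requests` on each incoming batch, apply `importance_aware_step` for the surviving requests, and append them to `seen`, so that repeated requests do not trigger repeated updates to the same parameters.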

0 Citations

0 Influential

9 Altmetric

45.0 Score
