2602.23798v1 Feb 27, 2026 cs.LG

MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

Wei Yang Bryan Lim (Citations: 51; h-index: 3)
Tiantong Wang (Citations: 112; h-index: 4)
Xinyu Yan (Citations: 0; h-index: 0)
Tiantong Wu (Citations: 0; h-index: 0)
Yurong Hao (Citations: 6; h-index: 2)
Yong Jiang (Citations: 1,704; h-index: 22)
Fei Huang (Citations: 1,783; h-index: 23)

Abstract

Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU, an algorithm-agnostic privacy-preserving Multiple Perturbed Copies Unlearning framework that primarily introduces two server-side modules: Pre-Process for randomized copy generation and Post-Process for update aggregation. In Pre-Process, the server distributes multiple perturbed and reparameterized model instances, allowing the client to execute unlearning locally on its private forget set without accessing the server's exact original parameters. After local unlearning, the server performs Post-Process by inverting the reparameterization and aggregating updates with a harmonic denoising procedure to alleviate the impact of perturbation. Experiments with seven unlearning algorithms show that MPU achieves comparable unlearning performance to noise-free baselines, with most algorithms' average degradation well below 1% under 10% noise, and can even outperform the noise-free baseline for some algorithms under 1% noise. Code is available at https://github.com/Tristan-SHU/MPU.
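The Pre-Process / local unlearning / Post-Process round described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the noise model, the reparameterization, or the harmonic denoising procedure, so the sketch assumes bounded noise relative to each weight, uses a secret coordinate permutation as a stand-in reparameterization, and substitutes plain averaging for harmonic denoising. All function names (`pre_process`, `client_unlearn`, `post_process`, `run_round`) are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]


def pre_process(theta: Vector, num_copies: int, noise_level: float,
                rng: random.Random) -> Tuple[List[Vector], List[List[int]]]:
    """Server side: build perturbed, permuted copies; keep the permutations secret."""
    copies, perms = [], []
    for _ in range(num_copies):
        # Assumed noise model: bounded noise proportional to each weight's magnitude.
        noisy = [t + rng.uniform(-1.0, 1.0) * noise_level * abs(t) for t in theta]
        # Assumed reparameterization: a secret permutation of the coordinates.
        perm = list(range(len(theta)))
        rng.shuffle(perm)
        copies.append([noisy[j] for j in perm])
        perms.append(perm)
    return copies, perms


def client_unlearn(copy: Vector, lr: float = 0.1) -> Vector:
    """Client side: toy stand-in for an unlearning algorithm; returns an update."""
    # A real client would compute this update from its private forget set.
    return [-lr * w for w in copy]


def post_process(theta: Vector, updates: List[Vector],
                 perms: List[List[int]]) -> Vector:
    """Server side: undo each permutation, then average the updates."""
    total = [0.0] * len(theta)
    for update, perm in zip(updates, perms):
        for pos, j in enumerate(perm):
            total[j] += update[pos]  # position pos held original coordinate j
    # Plain mean as a stand-in for the paper's harmonic denoising.
    return [t + s / len(updates) for t, s in zip(theta, total)]


def run_round(theta: Vector, num_copies: int = 4,
              noise_level: float = 0.10, seed: int = 0) -> Vector:
    """One full round: Pre-Process, local unlearning per copy, Post-Process."""
    rng = random.Random(seed)
    copies, perms = pre_process(theta, num_copies, noise_level, rng)
    updates = [client_unlearn(c) for c in copies]
    return post_process(theta, updates, perms)


if __name__ == "__main__":
    print(run_round([1.0, -2.0, 0.5, 4.0]))
```

In this toy setting the client only ever sees noisy, permuted weights, and the server never sees the forget set. Because the stand-in client update is linear, averaging over copies shrinks the perturbation's effect on the final parameters; real unlearning algorithms are nonlinear, which is presumably why the paper uses a dedicated denoising step.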

