2601.03791v1 Jan 07, 2026 cs.CL

Do LLMs Really Memorize Personally Identifiable Information? Revisiting PII Leakage with a Cue-Controlled Memorization Framework

Xiaoyu Luo

Qiongxiu Li

Yiyi Chen

Johannes Bjerva

Aalborg University

Large Language Models (LLMs) have been reported to "leak" Personally Identifiable Information (PII), with successful PII reconstruction often interpreted as evidence of memorization. We propose a principled revision of memorization evaluation for LLMs, arguing that PII leakage should be evaluated under low lexical cue conditions, where target PII cannot be reconstructed through prompt-induced generalization or pattern completion. We formalize Cue-Resistant Memorization (CRM) as a cue-controlled evaluation framework and a necessary condition for valid memorization evaluation, explicitly conditioning on prompt-target overlap cues. Using CRM, we conduct a large-scale multilingual re-evaluation of PII leakage across 32 languages and multiple memorization paradigms. Revisiting reconstruction-based settings, including verbatim prefix-suffix completion and associative reconstruction, we find that their apparent effectiveness is driven primarily by direct surface-form cues rather than by true memorization. When such cues are controlled for, reconstruction success diminishes substantially. We further examine cue-free generation and membership inference, both of which exhibit extremely low true positive rates. Overall, our results suggest that previously reported PII leakage is better explained by cue-driven behavior than by genuine memorization, highlighting the importance of cue-controlled evaluation for reliably quantifying privacy-relevant memorization in LLMs.
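
The abstract does not say how prompt-target overlap is measured, so the following is only an illustrative sketch of the general idea and not the CRM definition from the paper. It assumes a simple proxy: the fraction of the target's character trigrams that already appear in the prompt. The function names (`char_ngrams`, `cue_overlap`, `is_low_cue`), the trigram choice, and the 0.2 threshold are all hypothetical.

```python
from __future__ import annotations


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of lowercased character n-grams, ignoring whitespace."""
    s = "".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def cue_overlap(prompt: str, target: str, n: int = 3) -> float:
    """Fraction of the target's n-grams that also occur in the prompt.

    Hypothetical proxy for a lexical cue: 0.0 means the prompt shares no
    n-gram with the target, 1.0 means every target n-gram is in the prompt.
    """
    target_grams = char_ngrams(target, n)
    if not target_grams:
        return 0.0  # target shorter than n characters: no n-grams to compare
    shared = target_grams & char_ngrams(prompt, n)
    return len(shared) / len(target_grams)


def is_low_cue(prompt: str, target: str, threshold: float = 0.2, n: int = 3) -> bool:
    """Keep a (prompt, target) pair only if overlap is at or below the
    assumed threshold (0.2 is an arbitrary illustrative value)."""
    return cue_overlap(prompt, target, n) <= threshold


if __name__ == "__main__":
    target = "jane.doe@example.com"
    for prompt in ("Contact Jane Doe at", "The email address is"):
        print(prompt, round(cue_overlap(prompt, target), 3), is_low_cue(prompt, target))
```

Under this proxy, a prompt that names the person shares some surface form with an email address built from that name, while a generic prompt shares none. A cue-controlled evaluation in the spirit of the abstract would count reconstruction as evidence of memorization only for pairs in the low-overlap group.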
