2604.01904v1 Apr 02, 2026 cs.CR

Preventing Illegal Data Use in LLM Training: Countering Data Laundering

Combating Data Laundering in LLM Training

Sharon Li (Citations: 278, h-index: 5)

Zesheng Ye (Citations: 47, h-index: 3)

Feng Liu (Citations: 109, h-index: 1)

Muxing Li (Citations: 5, h-index: 1)

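Data rights owners can use their own samples to detect unauthorized data use that may occur during large language model (LLM) training. In general, when a model shows higher performance (e.g., higher confidence or lower loss) on a particular sample than on data it was not trained on, that sample was likely part of the training dataset, because LLMs generally perform better on data they learned during training. However, a practice known as data laundering can weaken this detection. Data laundering transforms the style of the data to hide its provenance while keeping the important information intact. When an LLM is trained only on such transformed data, its performance on the original data no longer improves, which erases the signal that existing detection methods rely on. In this work, we infer the unknown laundering transformation through black-box access to the target LLM and use an auxiliary LLM to generate queries that mimic the laundered data, even when rights owners hold only the original data. Because the search space of laundering transformations is effectively infinite, we abstract this process into a high-level transformation goal (e.g.,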

Original Abstract

Data rights owners can detect unauthorized data use in large language model (LLM) training by querying with proprietary samples. Often, superior performance (e.g., higher confidence or lower loss) on a sample relative to the untrained data implies it was part of the training corpus, as LLMs tend to perform better on data they have seen during training. However, this detection becomes fragile under data laundering, a practice of transforming the stylistic form of proprietary data, while preserving critical information to obfuscate data provenance. When an LLM is trained exclusively on such laundered variants, it no longer performs better on originals, erasing the signals that standard detections rely on. We counter this by inferring the unknown laundering transformation from black-box access to the target LLM and, via an auxiliary LLM, synthesizing queries that mimic the laundered data, even if rights owners have only the originals. As the search space of finding true laundering transformations is infinite, we abstract such a process into a high-level transformation goal (e.g., "lyrical rewriting") and concrete details (e.g., "with vivid imagery"), and introduce synthesis data reversion (SDR) that instantiates this abstraction. SDR first identifies the most probable goal for synthesis to narrow the search; it then iteratively refines details so that synthesized queries gradually elicit stronger detection signals from the target LLM. Evaluated on the MIMIR benchmark against diverse laundering practices and target LLM families (Pythia, Llama2, and Falcon), SDR consistently strengthens data misuse detection, providing a practical countermeasure to data laundering.
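The two-stage search described in the abstract (pick the most probable goal, then iteratively refine details so that queries elicit a stronger signal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `sdr_sketch`, the greedy refinement rule, and the toy stand-ins `toy_synthesize` (in place of the auxiliary LLM) and `toy_signal` (in place of a black-box detection signal such as negative loss from the target LLM) are all assumptions introduced here for illustration.

```python
from typing import Callable, Sequence, Tuple

def sdr_sketch(
    original: str,
    goals: Sequence[str],
    detail_pool: Sequence[str],
    synthesize: Callable[[str, str, Tuple[str, ...]], str],
    signal: Callable[[str], float],
    rounds: int = 3,
) -> Tuple[str, Tuple[str, ...], float]:
    """Hypothetical two-stage search: goal selection, then greedy detail refinement."""
    # Stage 1: choose the goal whose synthesized query yields the strongest signal.
    best_goal = max(goals, key=lambda g: signal(synthesize(original, g, ())))
    details: Tuple[str, ...] = ()
    best = signal(synthesize(original, best_goal, details))

    # Stage 2: add one detail per round, but only if it strengthens the signal.
    for _ in range(rounds):
        candidates = [d for d in detail_pool if d not in details]
        if not candidates:
            break
        scored = [
            (signal(synthesize(original, best_goal, details + (d,))), d)
            for d in candidates
        ]
        top_score, top_detail = max(scored)
        if top_score <= best:
            break  # no candidate detail improves the signal; stop refining
        details += (top_detail,)
        best = top_score
    return best_goal, details, best

# Toy stand-ins (assumptions): a hidden laundering recipe the search should recover.
HIDDEN_GOAL = "lyrical rewriting"
HIDDEN_DETAILS = {"with vivid imagery"}

def toy_synthesize(text: str, goal: str, details: Tuple[str, ...]) -> str:
    # A real auxiliary LLM would rewrite `text`; here we only tag it.
    return f"<{goal}; {', '.join(details)}> {text}"

def toy_signal(query: str) -> float:
    # A real signal would come from querying the target LLM (e.g., negative loss).
    score = 2.0 if HIDDEN_GOAL in query else 0.0
    score += sum(1.0 for d in HIDDEN_DETAILS if d in query)
    return score

result = sdr_sketch(
    original="a proprietary sample",
    goals=["formal paraphrase", "lyrical rewriting", "summarization"],
    detail_pool=["in short sentences", "with vivid imagery"],
    synthesize=toy_synthesize,
    signal=toy_signal,
)
```

In this toy setup the search stops as soon as no remaining detail raises the signal, which keeps the number of black-box queries bounded by the number of goals plus `rounds` times the size of the detail pool.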

0 Citations

0 Influential

2.5 Altmetric

12.5 Score
