2601.21898v1 Jan 29, 2026 cs.AI

Making Models Unmergeable via Scaling-Sensitive Loss Landscape

Minwoo Jang (Citations: 68, h-index: 2)

Hoyoung Kim (Citations: 59, h-index: 4)

Jabin Koo (Citations: 21, h-index: 1)

Jungseul Ok (Citations: 1,015, h-index: 16)

Original Abstract

The rise of model hubs has made it easier to access reusable model components, making model merging a practical tool for combining capabilities. Yet this modularity also creates a governance gap: downstream users can recompose released weights into unauthorized mixtures that bypass safety alignment or licensing terms. Because existing defenses are largely post-hoc and architecture-specific, they provide inconsistent protection across diverse architectures and release formats in practice. To close this gap, we propose Trap², an architecture-agnostic protection framework that encodes protection into the update during fine-tuning, regardless of whether the weights are released as adapters or full models. Instead of relying on architecture-dependent approaches, Trap² uses weight re-scaling as a simple proxy for the merging process. It keeps released weights effective in standalone use but degrades them under the re-scaling that often arises in merging, undermining unauthorized merging.
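
The abstract does not spell out the training objective, so the following is only a minimal sketch of why re-scaling can stand in for merging. It assumes a task-arithmetic-style merge (base weights plus a coefficient times the sum of fine-tuning updates), which is one common merging scheme and not necessarily the one evaluated in the paper. All function names, variable names, and toy vectors are hypothetical.

```python
# Illustrative sketch only: hypothetical names and toy numbers,
# not the paper's implementation.

def scale(update, alpha):
    """Return the update with every entry multiplied by alpha."""
    return [alpha * u for u in update]

def add(a, b):
    """Element-wise sum of two equal-length weight vectors."""
    return [x + y for x, y in zip(a, b)]

def merge(base, updates, coeff):
    """Assumed merge rule: base + coeff * (sum of updates)."""
    merged = list(base)
    for update in updates:
        merged = add(merged, scale(update, coeff))
    return merged

base = [1.0, -2.0, 0.5]          # toy pretrained weights
update_a = [0.5, 0.25, -1.0]     # toy fine-tuning update (the released one)
update_b = [-0.25, 1.0, 0.75]    # toy update taken from another model

# Standalone use applies the released update at scale 1.0.
standalone = add(base, update_a)

# Averaging two updates applies each of them at scale 0.5.
merged = merge(base, [update_a, update_b], 0.5)

# Subtract everything except update_a's share of the merged weights.
contribution_a = [m - b - 0.5 * ub for m, b, ub in zip(merged, base, update_b)]
```

Under this assumed merge rule, the released update enters the merged model only as `0.5 * update_a`. An update that works at scale 1.0 but degrades at other scales would therefore be effective on its own and harmed by merging, which is the behaviour the abstract describes.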
