2604.10023v1 Apr 11, 2026 cs.CV

FREE-Switch: Frequency-based Dynamic LoRA Switch for Style Transfer

Minyu Zhang

Hongzhi Wang

Tianhao Liu

Shenghe Zheng

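With the growing use of open-source adapters trained on the same diffusion backbone for diverse scenes and objects, combining these pretrained weights enables customized generation at low cost. However, most existing model merging methods were designed for classification or text generation; when they are applied to image generation, errors that accumulate across diffusion steps break content consistency. Among image-oriented methods, training-based approaches are computationally expensive and ill-suited to edge environments, while training-free approaches use uniform fusion strategies that ignore differences between adapters, causing loss of detail. We find that the importance of each diffusion step differs depending on the type of content an adapter generates. Accordingly, we propose a dynamic LoRA switching method driven by frequency-domain importance. We also observe that maintaining semantic consistency across adapters effectively mitigates detail loss, and we design an automatic Generation Alignment mechanism that aligns generation intent at the semantic level. Experiments show that our FREE-Switch (Frequency-based Efficient and Dynamic LoRA Switch) framework efficiently combines adapters for different objects and styles, substantially reducing the training cost of high-quality customized generation.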

Original Abstract

With the growing availability of open-sourced adapters trained on the same diffusion backbone for diverse scenes and objects, combining these pretrained weights enables low-cost customized generation. However, most existing model merging methods are designed for classification or text generation, and when applied to image generation, they suffer from content drift due to error accumulation across multiple diffusion steps. For image-oriented methods, training-based approaches are computationally expensive and unsuitable for edge deployment, while training-free ones use uniform fusion strategies that ignore inter-adapter differences, leading to detail degradation. We find that since different adapters are specialized for generating different types of content, the contribution of each diffusion step carries different significance for each adapter. Accordingly, we propose a frequency-domain importance-driven dynamic LoRA switch method. Furthermore, we observe that maintaining semantic consistency across adapters effectively mitigates detail loss; thus, we design an automatic Generation Alignment mechanism to align generation intents at the semantic level. Experiments demonstrate that our FREE-Switch (Frequency-based Efficient and Dynamic LoRA Switch) framework efficiently combines adapters for different objects and styles, substantially reducing the training cost of high-quality customized generation.
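
The abstract does not say how frequency-domain importance is computed, so the following is only a toy illustration of the general idea of switching on spectral content, not the FREE-Switch algorithm. It assumes a 1-D signal standing in for a diffusion-step update and a hypothetical preferred frequency band per adapter; the names `band_energy_share`, `select_adapter`, and the band values are all invented for this sketch.

```python
import cmath
import math


def band_energy_share(signal, lo, hi):
    """Share of spectral energy at normalized frequencies in [lo, hi) cycles/sample."""
    n = len(signal)
    in_band = 0.0
    total = 0.0
    for k in range(n):
        # Naive DFT coefficient for frequency bin k (O(n^2) overall, fine for a toy).
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        freq = min(k, n - k) / n  # fold negative frequencies onto [0, 0.5]
        total += power
        if lo <= freq < hi:
            in_band += power
    return in_band / total if total > 0.0 else 0.0


def select_adapter(signal, bands):
    """Return the adapter whose (hypothetical) preferred band holds the most energy."""
    return max(bands, key=lambda name: band_energy_share(signal, *bands[name]))


# Hypothetical assumption: an "object" adapter matters for coarse structure,
# a "style" adapter for fine detail. These bands are illustrative only.
bands = {"object": (0.0, 0.1), "style": (0.1, 0.6)}

n = 32
coarse = [math.sin(2 * math.pi * t / n) for t in range(n)]      # 1 cycle over the signal
fine = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]    # 8 cycles over the signal

print(select_adapter(coarse, bands))
print(select_adapter(fine, bands))
```

In this sketch a step dominated by low-frequency energy activates the "object" adapter and a step dominated by high-frequency energy activates the "style" adapter. A real pipeline would presumably score 2-D latents per step rather than a 1-D stand-in.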
