2604.26348v1 Apr 29, 2026 cs.CV

ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance

Han Fang

Citations: 1,398

h-index: 18

Fei Meng

Citations: 3

h-index: 1

Yang Yang

Citations: 29

h-index: 3

Weiming Zhang

Citations: 26

h-index: 2

Diffusion models have achieved remarkable success in image generation, yet their training is predominantly driven by full-reference objectives that enforce pixel-wise similarity to ground-truth images. Such supervision, while effective for fidelity, may be insufficient in terms of subjective perceptual quality and text-image semantic consistency. In this work, we investigate the problem of incorporating no-reference perceptual quality into diffusion training. A key challenge is that directly optimizing perceptual signals, such as those provided by no-reference image quality assessment (NR-IQA) models, introduces a mismatch with the original diffusion objective, leading to training instability and distributional drift during fine-tuning. To address this issue, we propose an anchor-constrained optimization framework that enables stable perceptual adaptation. Specifically, we leverage a learned NR-IQA model as a perceptual guidance signal, while introducing an anchor-based regularization that enforces consistency with the base diffusion model in terms of noise prediction. This design balances perceptual quality improvement against generative fidelity, allowing controlled adaptation toward perceptually favorable outputs without compromising the original generative behavior. Extensive experiments demonstrate that our method consistently enhances perceptual quality while preserving generation diversity and training stability, highlighting the effectiveness of anchor-constrained perceptual optimization for diffusion models.
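
The abstract does not state the training objective, so the following is only a sketch of one form that an anchor-constrained perceptual loss could take, not the authors' formulation. All symbols are assumptions introduced for illustration: \epsilon_\theta is the fine-tuned noise predictor, \epsilon_{\theta_0} is the frozen base ("anchor") model, Q_\phi is a frozen NR-IQA scorer where higher means better, \lambda is a hypothetical trade-off weight, and \hat{x}_0 is the usual DDPM one-step estimate of the clean image.

```latex
% Assumed form of the objective; not taken from the paper.
\begin{aligned}
\hat{x}_0(x_t, t, c; \theta)
  &= \frac{x_t - \sqrt{1 - \bar{\alpha}_t}\,\epsilon_\theta(x_t, t, c)}{\sqrt{\bar{\alpha}_t}}, \\[4pt]
\mathcal{L}(\theta)
  &= \underbrace{-\,\mathbb{E}_{x_t,\,t,\,c}\!\left[ Q_\phi\!\big(\hat{x}_0(x_t, t, c; \theta)\big) \right]}_{\text{no-reference perceptual guidance}}
   \;+\; \lambda\,
     \underbrace{\mathbb{E}_{x_t,\,t,\,c}\!\left[ \big\lVert \epsilon_\theta(x_t, t, c) - \epsilon_{\theta_0}(x_t, t, c) \big\rVert_2^2 \right]}_{\text{anchor regularization on noise prediction}}
\end{aligned}
```

Under these assumptions, \lambda = 0 reduces to unconstrained optimization of the quality score, which is the setting the abstract associates with instability and distributional drift, while a very large \lambda pins the fine-tuned model to the base model's noise predictions.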

0 Citations

0 Influential

9 Altmetric

45.0 Score
