2605.04453v1 May 06, 2026 cs.CV

StableI2I: Spotting Unintended Changes in Image-to-Image Transition

Jian Zhang (Citations: 16; h-index: 2)
Jiayang Li (Citations: 75; h-index: 4)
Shuo Cao (Citations: 101; h-index: 6)
Xiaohui Li (Citations: 111; h-index: 5)
Zhizheng Zhang (Citations: 56; h-index: 4)
Kaiwen Zhu (Citations: 245; h-index: 7)
Yule Duan (Citations: 175; h-index: 6)
Yu Qiao (Citations: 916; h-index: 12)
Yihao Liu (Citations: 8,383; h-index: 31)

Original Abstract

In most real-world image-to-image (I2I) scenarios, existing evaluations primarily focus on instruction following and the perceptual quality or aesthetics of the generated images. However, they largely fail to assess whether the output image preserves the semantic correspondence and spatial structure of the input image. To address this limitation, we propose StableI2I, a unified and dynamic evaluation framework that explicitly measures content fidelity and pre-post consistency across a wide range of I2I tasks, including image editing and image restoration, without requiring reference images. In addition, we construct StableI2I-Bench, a benchmark designed to systematically evaluate the accuracy of multimodal large language models (MLLMs) on such fidelity and consistency assessment tasks. Extensive experimental results demonstrate that StableI2I provides accurate, fine-grained, and interpretable evaluations of content fidelity and consistency, with strong correlations to human subjective judgments. Our framework serves as a practical and reliable evaluation tool for diagnosing content consistency and benchmarking model performance in real-world I2I systems.
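The abstract reports strong correlations with human subjective judgments but does not say on this page which statistic is used. As a minimal sketch, assuming Spearman rank correlation is an acceptable stand-in, the agreement between an evaluator's per-image scores and human ratings could be checked as follows. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(metric_scores, human_scores):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(metric_scores) != len(human_scores) or len(metric_scores) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))


if __name__ == "__main__":
    # Hypothetical consistency scores from an automatic evaluator and
    # hypothetical mean human ratings for the same five edited images.
    metric = [0.91, 0.40, 0.75, 0.10, 0.66]
    human = [4.5, 2.0, 4.0, 1.5, 3.0]
    print(f"Spearman correlation: {spearman(metric, human):.3f}")
```

A rank-based statistic is used here because it only asks whether the evaluator orders images the way human raters do, which avoids assuming the two score scales are linearly related.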
