2602.10042v2 Feb 10, 2026 cs.CV

Fake-HR1: Rethinking Reasoning of Vision Language Model for Synthetic Image Detection

Fengchang Yu

Citations: 85

h-index: 5

Changjiang Jiang

Citations: 39

h-index: 3

Xinkuan Sha

Citations: 5

h-index: 1

Mingqi Fang

Citations: 29

h-index: 4

Chenfeng Zhang

Citations: 7

h-index: 1

Wei Lu

Citations: 220

h-index: 9

Jingjing Liu

Citations: 12

h-index: 2

Jian Liu

Citations: 27

h-index: 3

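Recent studies show that integrating Chain-of-Thought (CoT) reasoning into the detection process can improve a model's ability to detect synthetic images. However, excessively long reasoning incurs substantial resource overhead in token consumption and latency, and is often unnecessary, particularly when handling obviously generated forgeries. To address this, we propose Fake-HR1, the first large-scale hybrid-reasoning model that adaptively decides whether reasoning is needed based on the characteristics of the generative detection task. To this end, we design a two-stage training framework: Hybrid Fine-Tuning (HFT) is performed in the initialization stage, followed by Hybrid-Reasoning Grouped Policy Optimization (HGRPO), which implicitly learns when to select the appropriate reasoning mode. Experimental results show that Fake-HR1 reasons adaptively across different types of queries, outperforms existing LLMs in both reasoning ability and generative detection performance, and substantially improves response efficiency.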

Original Abstract

Recent studies have demonstrated that incorporating Chain-of-Thought (CoT) reasoning into the detection process can enhance a model's ability to detect synthetic images. However, excessively lengthy reasoning incurs substantial resource overhead, including token consumption and latency, which is particularly redundant when handling obviously generated forgeries. To address this issue, we propose Fake-HR1, a large-scale hybrid-reasoning model that, to the best of our knowledge, is the first to adaptively determine whether reasoning is necessary based on the characteristics of the generative detection task. To achieve this, we design a two-stage training framework: we first perform Hybrid Fine-Tuning (HFT) for cold-start initialization, followed by online reinforcement learning with Hybrid-Reasoning Grouped Policy Optimization (HGRPO) to implicitly learn when to select an appropriate reasoning mode. Experimental results show that Fake-HR1 adaptively performs reasoning across different types of queries, surpassing existing LLMs in both reasoning ability and generative detection performance, while significantly improving response efficiency.
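
The abstract does not spell out how HGRPO scores responses, so the following is only a minimal sketch of the general idea behind group-based policy optimization: several responses are sampled for one query, and each reward is normalized against the group's mean and standard deviation. The `reward` function, its `token_cost` parameter, and the sample values are hypothetical and are not taken from the paper; they only illustrate how a per-token cost could make a short correct answer preferable to a long correct one.

```python
from statistics import mean, pstdev


def reward(correct: bool, num_reasoning_tokens: int, token_cost: float = 0.001) -> float:
    """Hypothetical reward (an assumption, not the paper's definition):
    1 for a correct real/fake verdict, minus a small cost per reasoning token."""
    return (1.0 if correct else 0.0) - token_cost * num_reasoning_tokens


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and standard deviation of its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every response in the group scores the same.
    return [(r - mu) / (sigma + eps) for r in rewards]


# One query, four sampled responses: (verdict correct?, reasoning tokens used).
# A token count of 0 stands for answering directly without a reasoning trace.
samples = [(True, 0), (True, 400), (False, 0), (False, 400)]
rewards = [reward(c, n) for c, n in samples]
advantages = group_advantages(rewards)

for (correct, tokens), adv in zip(samples, advantages):
    print(f"correct={correct!s:5} tokens={tokens:3d} advantage={adv:+.3f}")
```

Under these assumed numbers, the direct correct answer receives the largest advantage and the long incorrect answer the smallest, which is one way a policy could be nudged toward skipping reasoning on easy inputs.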

5 Citations

0 Influential

4.5 Altmetric

27.5 Score
