2602.04340v1 Feb 04, 2026 cs.CV

Explicit Uncertainty Modeling for Active CLIP Adaptation with Dual Prompt Tuning

Qian-Wei Wang


Yaguang Song


Shu-Tao Xia


Abstract

Pre-trained vision-language models such as CLIP exhibit strong transferability, yet adapting them to downstream image classification tasks under limited annotation budgets remains challenging. In active learning settings, the model must select the most informative samples for annotation from a large pool of unlabeled data. Existing approaches typically estimate uncertainty via entropy-based criteria or representation clustering, without explicitly modeling uncertainty from the model's perspective. In this work, we propose a robust uncertainty modeling framework for active CLIP adaptation based on dual-prompt tuning. We introduce two learnable prompts in the textual branch of CLIP. The positive prompt enhances the discriminability of task-specific textual embeddings corresponding to lightweight tuned visual embeddings, improving classification reliability. Meanwhile, the negative prompt is trained in a reversed manner to explicitly model the probability that the predicted label is correct, providing a principled uncertainty signal for guiding active sample selection. Extensive experiments across different fine-tuning paradigms demonstrate that our method consistently outperforms existing active learning methods under the same annotation budget.
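To make the contrast in the abstract concrete, the following is a minimal sketch of two selection rules under a fixed annotation budget: an entropy-based baseline and a rule driven by an estimated probability that the predicted label is correct. It is not the authors' implementation. The function names and the `p_correct` input are hypothetical; in particular, `p_correct` only stands in for whatever score the negative prompt produces, which the abstract does not specify.

```python
import math


def softmax(logits):
    """Convert raw class logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool_logits, budget):
    """Baseline: pick the `budget` pool indices with the highest predictive entropy."""
    scores = [entropy(softmax(logits)) for logits in pool_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]


def select_by_correctness(p_correct, budget):
    """Pick the `budget` pool indices whose prediction is least likely to be correct.

    `p_correct[i]` is an assumed per-sample estimate of P(predicted label is correct);
    how it is obtained is outside the scope of this sketch.
    """
    order = sorted(range(len(p_correct)), key=lambda i: p_correct[i])
    return order[:budget]


if __name__ == "__main__":
    # Three unlabeled samples, three classes (illustrative numbers only).
    pool_logits = [[5.0, 0.0, 0.0], [1.0, 0.9, 0.8], [2.0, 1.9, -3.0]]
    p_correct = [0.95, 0.40, 0.10]
    print(select_by_entropy(pool_logits, budget=2))
    print(select_by_correctness(p_correct, budget=2))
```

The two rules can rank the same pool differently: entropy depends only on the shape of the class distribution, whereas a correctness estimate can flag a sample whose prediction looks fairly peaked but is still likely wrong.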

