2601.20791v1 Jan 28, 2026 cs.CV

FAIRT2V: Training-Free Debiasing for Text-to-Video Diffusion Models

Wei Song (Citations: 19; h-index: 3)

Haonan Zhong (Citations: 61; h-index: 3)

Tingxu Han (Citations: 362; h-index: 8)

M. Pagnucco (Citations: 2,324; h-index: 25)

Jingling Xue (Citations: 1; h-index: 1)

Yang Song (Citations: 678; h-index: 5)

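Text-to-video (T2V) diffusion models have advanced rapidly, but their demographic biases, gender bias in particular, have not yet been studied in depth. This paper proposes FairT2V, a training-free debiasing framework for text-to-video generation. FairT2V mitigates encoder-induced bias without fine-tuning. We first analyze demographic bias in T2V models and show that it stems mainly from the pretrained text encoder, which carries implicit gender associations even for neutral prompts. We quantify this effect with a gender-leaning score that correlates with the bias in generated videos. Building on this insight, FairT2V neutralizes prompt embeddings through an anchor-based spherical geodesic transformation, mitigating bias while preserving semantics. To maintain temporal coherence, FairT2V applies debiasing only during the early identity-forming steps, using a dynamic denoising schedule. We also propose a video-level fairness evaluation protocol that combines VideoLLM-based reasoning with human verification. Experiments on Open-Sora, a recent T2V model, show that FairT2V substantially reduces demographic bias across occupations while keeping the impact on video quality minimal.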

Original Abstract

Text-to-video (T2V) diffusion models have achieved rapid progress, yet their demographic biases, particularly gender bias, remain largely unexplored. We present FairT2V, a training-free debiasing framework for text-to-video generation that mitigates encoder-induced bias without finetuning. We first analyze demographic bias in T2V models and show that it primarily originates from pretrained text encoders, which encode implicit gender associations even for neutral prompts. We quantify this effect with a gender-leaning score that correlates with bias in generated videos. Based on this insight, FairT2V mitigates demographic bias by neutralizing prompt embeddings via anchor-based spherical geodesic transformations while preserving semantics. To maintain temporal coherence, we apply debiasing only during early identity-forming steps through a dynamic denoising schedule. We further propose a video-level fairness evaluation protocol combining VideoLLM-based reasoning with human verification. Experiments on the modern T2V model Open-Sora show that FairT2V substantially reduces demographic bias across occupations with minimal impact on video quality.
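The abstract names three mechanisms without giving formulas: a gender-leaning score, an anchor-based spherical geodesic transformation, and debiasing restricted to early denoising steps. The sketch below is one possible reading of those ideas, not the authors' method. Everything in it is an assumption made for illustration: the score is taken to be a difference of cosine similarities to two anchor embeddings, the geodesic transformation is taken to be spherical linear interpolation (slerp) toward the opposite anchor until the score reaches zero, and the "dynamic" schedule is simplified to a fixed cutoff. The function names and the `early_fraction` parameter are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def gender_leaning(prompt_emb, male_anchor, female_anchor):
    """Assumed score: > 0 leans toward the male anchor, < 0 toward the female one."""
    return cosine(prompt_emb, male_anchor) - cosine(prompt_emb, female_anchor)

def slerp(a, b, t):
    """Spherical linear interpolation along the geodesic from a (t=0) to b (t=1).
    Antipodal inputs are not handled in this sketch."""
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)
    if omega < 1e-8:  # vectors already coincide
        return a
    s = math.sin(omega)
    wa = math.sin((1.0 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

def neutralize(prompt_emb, male_anchor, female_anchor, iters=40):
    """Move the embedding along the geodesic toward the anchor it leans away
    from, using bisection to find a point where the leaning score is ~0."""
    score = gender_leaning(prompt_emb, male_anchor, female_anchor)
    if abs(score) < 1e-9:
        return normalize(prompt_emb)
    target = female_anchor if score > 0 else male_anchor
    lo, hi = 0.0, 1.0  # score keeps its sign at lo and has flipped at hi
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        s = gender_leaning(slerp(prompt_emb, target, mid), male_anchor, female_anchor)
        if s * score > 0:
            lo = mid
        else:
            hi = mid
    return slerp(prompt_emb, target, hi)

def embedding_for_step(step, total_steps, original, debiased, early_fraction=0.3):
    """Fixed-cutoff stand-in for a dynamic schedule: use the debiased embedding
    only during the first early_fraction of denoising steps."""
    return debiased if step < early_fraction * total_steps else original

# Toy 3-D example with made-up vectors (real text-encoder embeddings are much larger).
male, female = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
prompt = [0.8, 0.2, 0.5]
neutral = neutralize(prompt, male, female)
```

In this toy setup the interpolation stays on the unit sphere, so the embedding norm is preserved while the component separating the two anchors is balanced. Whether the paper uses a single anchor pair, a different stopping criterion, or a step-dependent cutoff cannot be determined from the abstract alone.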

1 Citation

0 Influential

12.5 Altmetric

63.5 Score
