2602.07311v1 Feb 07, 2026 cs.CV

LUCID-SAE: Learning Unified Vision-Language Sparse Codes for Interpretable Concept Discovery

Difei Gu, Yunhe Gao, Gerasimos Chatzoudis, Zihan Dong, Bangwei Guo, Yang Zhou, Mu Zhou, Dimitris N. Metaxas, Guoning Zhang

Abstract

Sparse autoencoders (SAEs) offer a natural path toward comparable explanations across different representation spaces. However, current SAEs are trained per modality, producing dictionaries whose features are not directly understandable and whose explanations do not transfer across domains. In this study, we introduce LUCID (Learning Unified vision-language sparse Codes for Interpretable concept Discovery), a unified vision-language sparse autoencoder that learns a shared latent dictionary for image patch and text token representations, while reserving private capacity for modality-specific details. We achieve feature alignment by coupling the shared codes with a learned optimal transport matching objective, without requiring labels. LUCID yields interpretable shared features that support patch-level grounding, establish cross-modal neuron correspondence, and enhance robustness against the concept clustering problem in similarity-based evaluation. Leveraging these alignment properties, we develop an automated dictionary interpretation pipeline based on term clustering that requires no manual inspection. Our analysis reveals that LUCID's shared features capture diverse semantic categories beyond objects, including actions, attributes, and abstract concepts, demonstrating a comprehensive approach to interpretable multimodal representations.
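The abstract gives no architectural details, so the following is only a minimal sketch of the two ideas it names: a sparse code split into shared and private units, and an optimal transport coupling between the shared codes of image patches and text tokens. Everything concrete here is an assumption made for illustration, not the authors' method: the TopK sparsity rule, the per-modality encoder matrices `W_img` and `W_txt`, the fixed cosine cost (the paper describes a learned matching objective), the Sinkhorn solver with uniform marginals, and all sizes and function names.

```python
import math
import random

def topk_relu(values, k):
    """Keep the k largest ReLU activations and zero the rest (assumed sparsity rule)."""
    relu = [max(v, 0.0) for v in values]
    keep = set(sorted(range(len(relu)), key=lambda i: relu[i], reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(relu)]

def encode(x, weights, n_shared, k):
    """Linear encoder followed by TopK; returns (shared_code, private_code)."""
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    code = topk_relu(pre, k)
    return code[:n_shared], code[n_shared:]

def cosine_cost(a, b):
    """1 - cosine similarity; an all-zero code is treated as maximally distant."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - sum(p * q for p, q in zip(a, b)) / (na * nb)

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized transport plan between uniform marginals."""
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

def alignment_loss(shared_patches, shared_tokens):
    """Transport cost of softly matching patch codes to token codes; no labels used."""
    cost = [[cosine_cost(p, t) for t in shared_tokens] for p in shared_patches]
    plan = sinkhorn(cost)
    loss = sum(plan[i][j] * cost[i][j]
               for i in range(len(cost)) for j in range(len(cost[0])))
    return loss, plan

# Toy data: random "patch" and "token" embeddings with hypothetical sizes.
random.seed(0)
D_IMG, D_TXT, N_SHARED, N_PRIVATE, TOP_K = 8, 6, 5, 3, 3
n_latent = N_SHARED + N_PRIVATE
W_img = [[random.gauss(0, 1) for _ in range(D_IMG)] for _ in range(n_latent)]
W_txt = [[random.gauss(0, 1) for _ in range(D_TXT)] for _ in range(n_latent)]
patches = [[random.gauss(0, 1) for _ in range(D_IMG)] for _ in range(4)]
tokens = [[random.gauss(0, 1) for _ in range(D_TXT)] for _ in range(3)]

# Only the shared units enter the alignment term; private units stay per modality.
shared_p = [encode(p, W_img, N_SHARED, TOP_K)[0] for p in patches]
shared_t = [encode(t, W_txt, N_SHARED, TOP_K)[0] for t in tokens]
loss, plan = alignment_loss(shared_p, shared_t)
print(f"OT alignment loss: {loss:.4f}")
```

In a trained model, a term like `loss` would presumably be minimized together with each modality's reconstruction error, so that shared units fire on matching patches and tokens while private units absorb modality-specific detail. The sketch omits decoders and training entirely.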
