2606.05843v1 Jun 04, 2026 cs.CL

Mechanistic Insights into Functional Sparsity in Multimodal LLMs via CoRe Heads

Ruoxi Sun
Juntao Li
Quantong Qiu
Yihang Lou
Zecheng Tang
Min Zhang

While Multimodal Large Language Models (MLLMs) demonstrate remarkable proficiency on complex vision-language tasks, the mechanisms by which they extract query-relevant visual features from complex, noisy contexts remain opaque. In this paper, we present an interpretability study that uncovers a structural property of MLLMs: functional sparsity in cross-modal retrieval. Using a token-level metric termed Retrieval Attention Mass (RAM), we identify and characterize a highly specialized subset of attention heads, which we call Context-aware Retrieval (CoRe) heads. Across diverse visual domains and model scales, we observe a clear functional division: CoRe heads act as dedicated information extractors, while most other heads distribute attention over broader contextual regions. Causal interventions show that these specialized heads are necessary: ablating only the top 5% of CoRe heads causes significant degradation in multimodal reasoning performance, whereas ablating lower-ranked heads has minimal effect. Acceleration experiments further show that exploiting this localized sparsity significantly speeds up inference while maintaining robust task performance. Our findings reveal a structural principle of functional sparsity within MLLMs, refine the current mechanistic understanding of these models, and lay a foundation that can inform future architecture design and model optimization.
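The abstract does not give the formula for RAM, so the following is only a minimal sketch under a stated assumption: that a head's RAM for one query token is the share of its attention weight that lands on the query-relevant visual tokens. The function names, the toy attention rows, and the use of a single query token are all hypothetical and are not taken from the paper. The sketch shows how such a score could rank heads and pick a top 5% set as candidates for ablation.

```python
# Hypothetical sketch: the RAM definition below is an assumption, not the paper's formula.

def retrieval_attention_mass(attn_row, relevant_idx):
    """Share of one attention row's weight that falls on the relevant token positions."""
    total = sum(attn_row)
    if total == 0:
        return 0.0
    return sum(attn_row[i] for i in relevant_idx) / total

def rank_heads(attn, relevant_idx):
    """attn maps (layer, head) -> attention row over context tokens for one query token.
    Returns the head ids sorted by descending score, plus the score table."""
    scores = {h: retrieval_attention_mass(row, relevant_idx) for h, row in attn.items()}
    ranked = sorted(scores, key=lambda h: scores[h], reverse=True)
    return ranked, scores

def top_fraction(ranked, fraction=0.05):
    """Keep the top `fraction` of ranked heads, and always at least one."""
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

# Toy data: 2 layers x 10 heads, 8 context tokens, tokens 2 and 3 are "relevant".
relevant_idx = [2, 3]
attn = {(layer, head): [0.125] * 8 for layer in range(2) for head in range(10)}
# One head concentrates its attention on the relevant tokens.
attn[(1, 3)] = [0.01, 0.01, 0.45, 0.45, 0.02, 0.02, 0.02, 0.02]

ranked, scores = rank_heads(attn, relevant_idx)
candidate_heads = top_fraction(ranked, fraction=0.05)
print(candidate_heads)
```

In this toy setup the uniform heads place a quarter of their attention on the two relevant tokens, while the concentrated head places most of its attention there, so it is the only head selected. In a real model the attention rows would come from the model's attention outputs, and ablation would mean masking the selected heads during the forward pass.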

