2601.02415v1 Jan 03, 2026 cs.CV

Multimodal Sentiment Analysis based on Multi-channel and Symmetric Mutual Promotion Feature Fusion

Abstract

Multimodal sentiment analysis is a key technology in human-computer interaction and affective computing. Accurately recognizing human emotional states is crucial for smooth communication between humans and machines. Despite some progress in multimodal sentiment analysis research, numerous challenges remain. First, the features extracted from single-modality data are limited and insufficiently rich. Second, most studies focus only on the consistency of inter-modal feature information and neglect the differences between features, which results in inadequate feature fusion. In this paper, we first extract multi-channel features to obtain more comprehensive feature information, employing dual-channel features in both the visual and auditory modalities to enhance intra-modal feature representation. We then propose a symmetric mutual promotion (SMP) inter-modal feature fusion method. This method combines symmetric cross-modal attention with self-attention: the cross-modal attention mechanism captures useful information from the other modalities, and the self-attention mechanism models contextual information. This promotes the exchange of useful information between modalities and thereby strengthens inter-modal interaction. Finally, we integrate the intra-modal features with the inter-modal fused features, leveraging the complementarity of inter-modal feature information while still accounting for the differences between features. Experiments on two benchmark datasets demonstrate the effectiveness and superiority of the proposed method.
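
The abstract does not give the architecture in detail, so the following is only a minimal sketch of the general idea it describes: two modalities attend to each other symmetrically, each enriched sequence then passes through self-attention, and the result is concatenated with the original intra-modal features. The function names (`attention`, `smp_block`, `fuse`), the mean pooling, the absence of learned projections, multi-head splitting, residual connections, and normalization, and the requirement that both modalities share one feature width are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, key, value):
    """Scaled dot-product attention; inputs are lists of equal-width rows."""
    scale = math.sqrt(len(query[0]))
    out = []
    for q in query:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in key])
        # Each output row is a weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, value))
                    for j in range(len(value[0]))])
    return out


def smp_block(x_a, x_b):
    """Symmetric exchange (assumed form): query the other modality, then self."""
    a_from_b = attention(x_a, x_b, x_b)  # modality A attends to B
    b_from_a = attention(x_b, x_a, x_a)  # modality B attends to A
    a_ctx = attention(a_from_b, a_from_b, a_from_b)  # self-attention for context
    b_ctx = attention(b_from_a, b_from_a, b_from_a)
    return a_ctx, b_ctx


def mean_pool(x):
    """Average a sequence of feature rows into one vector."""
    return [sum(col) / len(x) for col in zip(*x)]


def fuse(x_a, x_b):
    """Concatenate pooled intra-modal and inter-modal fused features."""
    a_ctx, b_ctx = smp_block(x_a, x_b)
    return mean_pool(x_a) + mean_pool(x_b) + mean_pool(a_ctx) + mean_pool(b_ctx)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical inputs: 3 visual frames and 5 audio frames, 4 features each.
    visual = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    audio = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    print(len(fuse(visual, audio)))
```

In this sketch the two sequences may have different lengths, since each cross-attention output keeps the length of its query sequence. Keeping the pooled original features alongside the fused ones mirrors the abstract's point that intra-modal differences should be preserved rather than averaged away during fusion.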
