2603.12721v1 Mar 13, 2026 cs.CV

CMHANet: A Cross-Modal Hybrid Attention Network for Point Cloud Registration

Yiding Sun (Citations: 51, h-index: 5)
Jihua Zhu (Citations: 28, h-index: 3)
Dongxu Zhang (Citations: 65, h-index: 6)
Yingsen Wang (Citations: 6, h-index: 1)
Haoran Xu (Citations: 7, h-index: 1)
Peilin Fan (Citations: 13, h-index: 2)

Abstract

Robust point cloud registration is a fundamental task in 3D computer vision and geometric deep learning, essential for applications such as large-scale 3D reconstruction, augmented reality, and scene understanding. However, established learning-based methods often degrade in complex real-world scenarios characterized by incomplete data, sensor noise, and low-overlap regions. To address these limitations, we propose CMHANet, a novel Cross-Modal Hybrid Attention Network. Our method fuses rich contextual information from 2D images with the geometric detail of 3D point clouds, yielding a comprehensive and resilient feature representation. Furthermore, we introduce an optimization function based on contrastive learning, which enforces geometric consistency and significantly improves the model's robustness to noise and partial observations. We evaluate CMHANet on the 3DMatch and the challenging 3DLoMatch datasets. Additionally, zero-shot evaluations on the TUM RGB-D SLAM dataset verify the model's generalization capability to unseen domains. The experimental results demonstrate that our method achieves substantial improvements in both registration accuracy and overall robustness, outperforming current techniques. Our code is available at https://github.com/DongXu-Zhang/CMHANet.
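The abstract names two ingredients without giving their form: attention-based fusion of 2D image context into 3D point features, and a contrastive objective over corresponding features. The sketch below illustrates the generic versions of both ideas. It is not the authors' implementation. The function names, the single-head attention without learned projections, the residual fusion, and the InfoNCE form of the loss are all assumptions made for illustration. It uses plain Python to stay self-contained; a real model would use a tensor library.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross_attention(point_feats, image_feats):
    """Hypothetical single-head cross-attention: each 3D point feature is a
    query that attends over all 2D image features (used as keys and values).
    Assumption: no learned projections, both modalities share dimension d."""
    d = len(point_feats[0])
    fused = []
    for q in point_feats:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in image_feats])
        context = [sum(w * v[i] for w, v in zip(weights, image_feats))
                   for i in range(d)]
        # Residual fusion: keep the geometric feature, add image context.
        fused.append([qi + ci for qi, ci in zip(q, context)])
    return fused

def info_nce(src_feats, tgt_feats, temperature=0.1):
    """Generic InfoNCE loss (an assumed stand-in for the paper's objective):
    src_feats[i] and tgt_feats[i] form a positive pair, and every other
    target feature acts as a negative."""
    def normalize(v):
        n = math.sqrt(dot(v, v)) or 1.0
        return [x / n for x in v]
    src = [normalize(v) for v in src_feats]
    tgt = [normalize(v) for v in tgt_feats]
    loss = 0.0
    for i, s in enumerate(src):
        probs = softmax([dot(s, t) / temperature for t in tgt])
        loss -= math.log(probs[i])
    return loss / len(src)

# Toy data: two point features and three image features, all 2-dimensional.
points = [[1.0, 0.0], [0.0, 1.0]]
pixels = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

fused = cross_attention(points, pixels)
aligned = info_nce(points, points)         # correct correspondences
shuffled = info_nce(points, points[::-1])  # swapped correspondences
print(fused, aligned, shuffled)
```

On this toy input the loss is lower for correctly paired features than for swapped ones, which is the behavior a contrastive objective is meant to reward during training.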
