2602.17124v1 Feb 19, 2026 cs.CV

3D Scene Rendering with Multimodal Gaussian Splatting

Chi-Shiang Gau

Konstantinos D. Polyzos

Athanasios Bacharis

Saketh Madhuvarasu

Tara Javidi

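3D scene reconstruction and rendering are core tasks in computer vision, with broad applications including industrial monitoring, robotics, and autonomous driving. Recent advances in 3D Gaussian Splatting (GS) and its variants have achieved excellent rendering fidelity while maintaining high computational and memory efficiency. However, conventional vision-based GS pipelines typically rely on a sufficient number of camera views to initialize the Gaussian primitives and train their parameters. This incurs additional processing cost during initialization and degrades performance in conditions where visual cues are unreliable, such as adverse weather, low illumination, or partial occlusion. To overcome these limitations, and motivated by the robustness of radio-frequency (RF) signals to weather, lighting, and occlusion, the paper proposes a multimodal framework that integrates RF sensing, such as automotive radar, with GS-based rendering. This offers a more efficient and robust alternative to vision-only GS rendering. The proposed approach enables efficient depth prediction from only sparse RF-based depth measurements, producing a high-quality 3D point cloud for initializing the Gaussian functions of diverse GS architectures. Numerical experiments confirm the benefits of judiciously integrating RF sensing into GS pipelines and demonstrate high-fidelity 3D scene rendering driven by the structural accuracy that RF information provides.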

Original Abstract

3D scene reconstruction and rendering are core tasks in computer vision, with applications spanning industrial monitoring, robotics, and autonomous driving. Recent advances in 3D Gaussian Splatting (GS) and its variants have achieved impressive rendering fidelity while maintaining high computational and memory efficiency. However, conventional vision-based GS pipelines typically rely on a sufficient number of camera views to initialize the Gaussian primitives and train their parameters, typically incurring additional processing cost during initialization while falling short in conditions where visual cues are unreliable, such as adverse weather, low illumination, or partial occlusions. To cope with these challenges, and motivated by the robustness of radio-frequency (RF) signals to weather, lighting, and occlusions, we introduce a multimodal framework that integrates RF sensing, such as automotive radar, with GS-based rendering as a more efficient and robust alternative to vision-only GS rendering. The proposed approach enables efficient depth prediction from only sparse RF-based depth measurements, yielding a high-quality 3D point cloud for initializing Gaussian functions across diverse GS architectures. Numerical tests demonstrate the merits of judiciously incorporating RF sensing into GS pipelines, achieving high-fidelity 3D scene rendering driven by RF-informed structural accuracy.
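The abstract describes turning predicted depth into a 3D point cloud that seeds the Gaussian primitives. The paper's depth-prediction model is not described here, so the sketch below only illustrates the generic last step, under assumptions that are not from the paper: a dense depth map is already available, the camera follows a pinhole model, and the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[Optional[float]]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift every pixel with a valid depth to a 3D point in the camera frame.

    Assumes a pinhole camera: focal lengths (fx, fy) and principal point
    (cx, cy) are given in pixels; depth is measured along the optical axis.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z is None or z <= 0.0:
                continue                   # skip pixels without usable depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map (illustrative values only, not data from the paper).
depth_map = [[2.0, None], [0.0, 4.0]]
points = backproject_depth(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points)
```

In a GS pipeline, points like these would typically serve as the initial Gaussian centers, taking the place of a point cloud recovered from many camera views. How the paper's framework assigns the remaining Gaussian parameters is not stated in the abstract.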
