2601.10168v1 Jan 15, 2026 cs.CV

RAG-3DSG: Enhancing 3D Scene Graphs with Re-Shot Guided Retrieval-Augmented Generation

Sihong Xie (citations: 3, h-index: 1)
Yue Chang (citations: 37, h-index: 4)
Rufeng Chen (citations: 17, h-index: 2)
Zhaofan Zhang (citations: 26, h-index: 2)
Yi Chen (citations: 4, h-index: 2)

Original Abstract

Open-vocabulary 3D Scene Graph (3DSG) generation can enhance various downstream tasks in robotics, such as manipulation and navigation, by leveraging structured semantic representations. A 3DSG is constructed from multiple images of a scene, where objects are represented as nodes and relationships as edges. However, existing methods for open-vocabulary 3DSG generation suffer from both low object-level recognition accuracy and slow processing, mainly due to constrained viewpoints, occlusions, and redundant surface density. To address these challenges, we propose RAG-3DSG, which mitigates aggregation noise through re-shot guided uncertainty estimation and supports object-level Retrieval-Augmented Generation (RAG) via reliable low-uncertainty objects. Furthermore, we propose a dynamic downsample-mapping strategy that accelerates cross-image object aggregation with adaptive granularity. Experiments on the Replica dataset demonstrate that RAG-3DSG significantly improves node captioning accuracy in 3DSG generation while reducing mapping time by two-thirds compared to the vanilla version.
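The abstract names two mechanisms without giving their details: downsampling with adaptive granularity, and retrieval that draws only on low-uncertainty objects. The sketch below is one possible reading of those two ideas, not the paper's implementation. Everything in it is an assumption made for illustration: the function names, the rule that ties voxel size to an object's bounding-box extent, the `target_cells`, `max_uncertainty`, and `k` parameters, the dictionary layout for objects, and the use of cosine similarity over embeddings.

```python
import math
from collections import defaultdict


def adaptive_voxel_downsample(points, target_cells=8, min_voxel=0.01):
    """Downsample one object's points with a voxel size tied to its extent.

    Assumption (not from the paper): voxel size is the largest bounding-box
    side divided by target_cells, so small objects keep fine detail while
    large objects are sampled more coarsely.
    """
    if not points:
        return []
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    extent = max(maxs[i] - mins[i] for i in range(3))
    voxel = max(extent / target_cells, min_voxel)

    buckets = defaultdict(list)
    for p in points:
        key = tuple(int(math.floor((p[i] - mins[i]) / voxel)) for i in range(3))
        buckets[key].append(p)
    # Keep one centroid per occupied voxel.
    return [
        tuple(sum(q[i] for q in pts) / len(pts) for i in range(3))
        for pts in buckets.values()
    ]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_context(query, objects, max_uncertainty=0.3, k=2):
    """Return captions of the k most similar low-uncertainty objects.

    Assumption: each object is a dict with hypothetical keys "caption",
    "embedding", and "uncertainty"; only objects at or below the
    uncertainty threshold are eligible as retrieval context.
    """
    reliable = [o for o in objects if o["uncertainty"] <= max_uncertainty]
    ranked = sorted(
        reliable, key=lambda o: cosine(query, o["embedding"]), reverse=True
    )
    return [o["caption"] for o in ranked[:k]]
```

In this reading, the retrieved captions would be passed as extra context to whatever model captions an uncertain node. How the uncertainty score itself is computed from re-shot views is not described in the abstract and is left out here.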
