2601.09248v1 Jan 14, 2026 cs.CV

Hybrid guided variational autoencoder for visual place recognition

Zihan You (citations: 17, h-index: 3)

Ni Wang (citations: 13, h-index: 2)

Emre Neftci (citations: 2, h-index: 1)

Thorben Schoepe (citations: 126, h-index: 6)

Original Abstract

Autonomous agents such as cars, robots, and drones need to localize themselves precisely in diverse environments, including GPS-denied indoor environments. One approach to precise localization is visual place recognition (VPR), which estimates the place of an image based on previously seen places. State-of-the-art VPR models require large amounts of memory, making them unwieldy for mobile deployment, while more compact models lack robustness and generalization capabilities. This work overcomes these limitations for robotics by combining event-based vision sensors with a novel event-based guided variational autoencoder (VAE). The encoder of our model is based on a spiking neural network, which is compatible with power-efficient, low-latency neuromorphic hardware. The VAE successfully disentangles the visual features of 16 distinct places in our new indoor VPR dataset, with a classification performance comparable to other state-of-the-art approaches, while also showing robust performance under various illumination conditions. When tested with novel visual inputs from unknown scenes, our model can distinguish between these places, which demonstrates a high generalization capability achieved by learning the essential features of location. Our compact, robust guided VAE with generalization capabilities is a promising model for visual place recognition that can significantly enhance mobile robot navigation in known and unknown indoor environments.
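To give a feel for what a spiking encoder does with event-camera input, here is a minimal sketch of a single leaky integrate-and-fire (LIF) neuron. The abstract does not say which neuron model the paper uses, so the LIF dynamics, the function name `lif_spike_train`, and the `decay`, `threshold`, and `weight` values are all illustrative assumptions rather than the authors' implementation.

```python
def lif_spike_train(input_events, decay=0.9, threshold=1.0, weight=0.4):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    All parameter values are hypothetical and chosen only for illustration.
    input_events: sequence of 0/1 values (1 = an input event arrived).
    Returns a list of 0/1 output spikes, one per time step.
    """
    membrane = 0.0
    spikes = []
    for event in input_events:
        # Leak the previous potential, then integrate the weighted input event.
        membrane = decay * membrane + weight * event
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0  # hard reset after emitting a spike
        else:
            spikes.append(0)
    return spikes


# A dense event stream drives the neuron over threshold repeatedly,
# while a sparse stream leaks away before it can accumulate.
dense = lif_spike_train([1] * 10)
sparse = lif_spike_train([1, 0, 0, 0, 0] * 2)
```

With these assumed parameters the dense stream produces a spike on every third step and the sparse stream produces none, which illustrates why spiking models do little work when the sensor reports few events.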
