2603.12215v1 Mar 12, 2026 cs.CV

RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images

Bin Wan (Citations: 289, h-index: 9)

Runmin Cong (Citations: 954, h-index: 18)

Xiaofei Zhou (Citations: 163, h-index: 8)

Hao Fang, Shandong University (Citations: 269, h-index: 6)

Yaoqi Sun (Citations: 1,506, h-index: 17)

S. Kwong (Citations: 1,789, h-index: 24)

Abstract

Salient object detection (SOD) in remote sensing images faces significant challenges due to large variations in object sizes, the computational cost of self-attention mechanisms, and the limitations of CNN-based extractors in capturing global context and long-range dependencies. Existing methods that rely on fixed convolution kernels often struggle to adapt to diverse object scales, leading to detail loss or irrelevant feature aggregation. To address these issues, this work aims to enhance robustness to scale variations and achieve precise object localization. We propose the Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network (RDNet), which replaces the CNN backbone with the SwinTransformer for global context modeling and introduces three key modules: (1) the Dynamic Adaptive Detail-aware (DAD) module, which applies varied convolution kernels guided by object region proportions; (2) the Frequency-matching Context Enhancement (FCE) module, which enriches contextual information through wavelet interactions and attention; and (3) the Region Proportion-aware Localization (RPL) module, which employs cross-attention to highlight semantic details and integrates a Proportion Guidance (PG) block to assist the DAD module. By combining these modules, RDNet achieves robustness against scale variations and accurate localization, delivering superior detection performance compared with state-of-the-art methods.
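The abstract describes the DAD module as choosing convolution kernels according to the proportion of the image occupied by the salient region. The sketch below illustrates only that general idea and is not the authors' implementation. The function names, the threshold table, the small-object-to-small-kernel mapping, and the use of a plain box filter in place of learned convolutions are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical (upper bound on proportion, kernel size) pairs.
# These values are illustrative assumptions, not taken from the paper.
KERNEL_BY_PROPORTION: Tuple[Tuple[float, int], ...] = (
    (0.05, 3),  # small salient region: small kernel to preserve detail
    (0.25, 5),  # medium salient region
    (1.00, 7),  # large salient region: wider receptive field
)


def region_proportion(mask: Sequence[Sequence[int]]) -> float:
    """Return the fraction of non-zero (salient) pixels in a binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    salient = sum(1 for row in mask for value in row if value)
    return salient / total


def select_kernel_size(
    proportion: float,
    table: Sequence[Tuple[float, int]] = KERNEL_BY_PROPORTION,
) -> int:
    """Pick the kernel size of the first bucket whose bound covers the proportion."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    for upper_bound, kernel_size in table:
        if proportion <= upper_bound:
            return kernel_size
    return table[-1][1]


def box_filter(feature: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Mean filter with a k x k window, clipped at the borders.

    Stands in for a learned convolution so the example stays dependency-free.
    """
    height, width = len(feature), len(feature[0])
    radius = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                feature[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            out[y][x] = sum(window) / len(window)
    return out


def adaptive_filter(
    feature: Sequence[Sequence[float]], mask: Sequence[Sequence[int]]
) -> List[List[float]]:
    """Filter a feature map with a kernel size chosen from the mask's region proportion."""
    return box_filter(feature, select_kernel_size(region_proportion(mask)))
```

In a trained network the proportion would presumably come from a predicted coarse saliency map rather than a ground-truth mask, and the selection might be a soft weighting over parallel branches rather than a hard lookup. The hard lookup is used here only to keep the sketch short.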

4 Citations

0 Influential

12 Altmetric

64.0 Score
