2602.22740v1 Feb 26, 2026 cs.CV

AMLRIS: Alignment-aware Masked Learning for Referring Image Segmentation

Shuo Yang (Citations: 9, h-index: 2)

Tongfei Chen (Citations: 9, h-index: 2)

Linlin Yang (Citations: 7, h-index: 1)

Runtang Guo (Citations: 6, h-index: 1)

He Long (Citations: 13, h-index: 1)

Chunyu Xie (Citations: 162, h-index: 6)

D. Leng (Citations: 457, h-index: 12)

Changbai Li (Citations: 18, h-index: 2)

Baochang Zhang (Citations: 14, h-index: 2)

Yuguang Yang (Citations: 888, h-index: 11)

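Referring image segmentation (RIS) aims to segment the object in an image that a natural language expression specifies. This paper introduces a training strategy called Alignment-Aware Masked Learning (AML). AML explicitly estimates pixel-level vision-language alignment, filters out poorly aligned regions during optimization, and focuses on reliable cues to improve RIS performance. The approach achieves state-of-the-art performance on the RefCOCO datasets and also improves robustness to diverse descriptions and scenarios.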

Original Abstract

Referring Image Segmentation (RIS) aims to segment an object in an image identified by a natural language expression. The paper introduces Alignment-Aware Masked Learning (AML), a training strategy that enhances RIS by explicitly estimating pixel-level vision-language alignment, filtering out poorly aligned regions during optimization, and focusing on trustworthy cues. This approach achieves state-of-the-art performance on the RefCOCO datasets and also enhances robustness to diverse descriptions and scenarios.
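
The abstract does not say how alignment is estimated or how regions are filtered, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes cosine similarity between per-pixel features and a single text embedding as the alignment score, a fixed threshold for filtering, and a binary cross-entropy segmentation loss averaged over the kept pixels. The names `alignment_map`, `masked_bce`, `aml_style_loss`, and `threshold` are hypothetical.

```python
import numpy as np

def alignment_map(pixel_feats, text_emb, eps=1e-8):
    """Assumed alignment score: cosine similarity per pixel.

    pixel_feats: (H, W, C) visual features; text_emb: (C,) sentence embedding.
    Returns an (H, W) array with values in [-1, 1].
    """
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + eps)
    t = text_emb / (np.linalg.norm(text_emb) + eps)
    return p @ t

def masked_bce(logits, target, keep, eps=1e-8):
    """Binary cross-entropy with logits, averaged over kept pixels only."""
    # Numerically stable form of BCE-with-logits.
    loss = np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    keep = keep.astype(loss.dtype)
    return float((loss * keep).sum() / (keep.sum() + eps))

def aml_style_loss(pixel_feats, text_emb, logits, target, threshold=0.0):
    """Drop pixels whose assumed alignment score falls below `threshold`,
    then compute the segmentation loss on the remaining pixels."""
    align = alignment_map(pixel_feats, text_emb)
    keep = align >= threshold  # hypothetical filtering rule
    return masked_bce(logits, target, keep), keep
```

In this sketch, poorly aligned pixels contribute nothing to the loss and therefore nothing to the gradient, which is one simple way to read "filtering out poorly aligned regions during optimization". A real implementation would likely use a deep learning framework and might use a soft weighting rather than a hard threshold.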

0 Citations

0 Influential

6 Altmetric

30.0 Score
