2602.16719v1 Feb 10, 2026 cs.DB

그래프 기반 벡터 검색을 위한 GPU 가속 알고리즘: 분류, 실험적 연구, 및 연구 방향

GPU-Accelerated Algorithms for Graph Vector Search: Taxonomy, Empirical Study, and Research Directions

Anxin Tian

Citations: 161

h-index: 3

Yaowen Liu

Citations: 14

h-index: 2

Xuejia Chen

Citations: 150

h-index: 2

Haoyang Li

Citations: 1

h-index: 1

Qinbin Li

Citations: 98

h-index: 2

Alexander Zhou

Citations: 9

h-index: 1

C. Zhang

Citations: 44

h-index: 4

Qing Li

Citations: 151

h-index: 2

Xin Zhang

Citations: 37

h-index: 2

Lei Chen

Citations: 49

h-index: 4

근사 최근접 이웃 탐색(ANNS)은 대규모 데이터 마이닝 및 머신러닝 애플리케이션의 기반이 되며, 데이터 세트 크기가 증가함에 따라 효율적인 검색은 GPU 가속에 점점 더 의존하게 됩니다. 그래프 기반 접근 방식은 근사 최근접 이웃 탐색 분야에서 최첨단을 차지하지만, 최신 GPU 아키텍처에 대한 최적화 및 실제 시나리오에서의 전체적인 효율성에 대한 체계적인 이해는 부족합니다. 본 연구에서는 GPU 가속 그래프 기반 벡터 검색 알고리즘에 대한 종합적인 조사 및 실험적 연구를 제시합니다. GPU 최적화 전략에 대한 상세한 분류를 확립하고, 알고리즘 작업과 GPU 내 하드웨어 실행 장치 간의 관계를 명확히 합니다. 8개의 대규모 벤치마크 데이터 세트에 대해 6개의 주요 알고리즘을 평가하여 그래프 인덱스 생성 및 쿼리 검색 성능을 모두 평가했습니다. 분석 결과, 거리 계산이 주요 계산 병목 지점이며, 호스트 CPU와 GPU 간의 데이터 전송이 대규모 환경에서 실제 지연 시간에 가장 큰 영향을 미치는 요인으로 나타났습니다. 또한, 다양한 시스템 설계에서 확장성과 메모리 사용량 간의 주요 트레이드오프를 강조합니다. 본 연구의 결과는 확장 가능하고 강력한 GPU 기반 근사 최근접 이웃 탐색 시스템을 설계하기 위한 명확한 지침을 제공하며, 지식 발견 및 데이터 마이닝 커뮤니티를 위한 종합적인 벤치마크를 제공합니다.

Original Abstract

Approximate Nearest Neighbor Search (ANNS) underpins many large-scale data mining and machine learning applications, with efficient retrieval increasingly hinging on GPU acceleration as dataset sizes grow. Although graph-based approaches represent the state of the art in approximate nearest neighbor search, there is a lack of systematic understanding regarding their optimization for modern GPU architectures and their end-to-end effectiveness in practical scenarios. In this work, we present a comprehensive survey and experimental study of GPU-accelerated graph-based vector search algorithms. We establish a detailed taxonomy of GPU optimization strategies and clarify the mapping between algorithmic tasks and hardware execution units within GPUs. Through a thorough evaluation of six leading algorithms on eight large-scale benchmark datasets, we assess both graph index construction and query search performance. Our analysis reveals that distance computation remains the primary computational bottleneck, while data transfer between the host CPU and GPU emerges as the dominant factor influencing real-world latency at large scale. We also highlight key trade-offs in scalability and memory usage across different system designs. Our findings offer clear guidelines for designing scalable and robust GPU-powered approximate nearest neighbor search systems, and provide a comprehensive benchmark for the knowledge discovery and data mining community.

1 Citations

0 Influential

2 Altmetric

11.0 Score

Original PDF

No Analysis Report Yet

This paper hasn't been analyzed by Gemini yet.

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!