2604.06566v1 Apr 08, 2026 cs.DB

AI-Driven Research for Databases

Audrey Cheng

Citations: 120

h-index: 4

Harald Ng

Citations: 16

h-index: 1

Aaron N. Kabcenell

Citations: 17

h-index: 2

Peter Bailis

Citations: 146

h-index: 4

Matei Zaharia

Citations: 5

h-index: 1

Xiao Shi

Citations: 137

h-index: 7

Ion Stoica

Citations: 23

h-index: 3

Lin Ma

Citations: 693

h-index: 11

Original Abstract

As the complexity of modern workloads and hardware increasingly outpaces human research and engineering capacity, existing methods for database performance optimization struggle to keep pace. To address this gap, a new class of techniques, termed AI-Driven Research for Systems (ADRS), uses large language models to automate solution discovery. This approach shifts optimization from manual system design to automated code generation. The key obstacle, however, in applying ADRS is the evaluation pipeline. Since these frameworks rapidly generate hundreds of candidates without human supervision, they depend on fast and accurate feedback from evaluators to converge on effective solutions. Building such evaluators is especially difficult for complex database systems. To enable the practical application of ADRS in this domain, we propose automating the design of evaluators by co-evolving them with the solutions. We demonstrate the effectiveness of this approach through three case studies optimizing buffer management, query rewriting, and index selection. Our automated evaluators enable the discovery of novel algorithms that outperform state-of-the-art baselines (e.g., a deterministic query rewrite policy that achieves up to 6.8x lower latency), demonstrating that addressing the evaluation bottleneck unlocks the potential of ADRS to generate highly optimized, deployable code for next-generation data systems.
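The co-evolution idea in the abstract can be pictured with a deliberately tiny sketch. This is not the paper's method: the abstract gives no implementation details, so every name here (`true_latency`, `ProxyEvaluator`, `co_evolve`) is hypothetical, the "solution" is a single integer knob, and a simple neighbor search stands in for LLM-driven candidate generation. The sketch only shows the loop shape: candidates are ranked by a cheap evaluator, and the evaluator itself is revised each round so that it agrees better with a slow, trusted benchmark.

```python
from dataclasses import dataclass


def true_latency(knob: int) -> float:
    """Hypothetical stand-in for a slow, trusted benchmark (lowest at knob == 7)."""
    return float((knob - 7) ** 2)


@dataclass
class ProxyEvaluator:
    """Cheap evaluator with one tunable parameter: its guess at the best knob."""
    guess: int = 0

    def score(self, knob: int) -> float:
        # Lower is better, mirroring a latency estimate.
        return float((knob - self.guess) ** 2)

    def error(self, samples: list[int]) -> float:
        # Total disagreement with the trusted benchmark on a few sample points.
        return sum(abs(self.score(k) - true_latency(k)) for k in samples)


def evolve_solution(best: int, evaluator: ProxyEvaluator) -> int:
    # Propose neighbors of the current best and keep the proxy's favorite.
    candidates = [best - 1, best, best + 1]
    return min(candidates, key=evaluator.score)


def evolve_evaluator(evaluator: ProxyEvaluator, samples: list[int]) -> ProxyEvaluator:
    # Propose neighboring evaluators and keep the one closest to ground truth.
    variants = [ProxyEvaluator(evaluator.guess + delta) for delta in (-1, 0, 1)]
    return min(variants, key=lambda e: e.error(samples))


def co_evolve(rounds: int = 20) -> tuple[int, ProxyEvaluator]:
    best, evaluator = 0, ProxyEvaluator(guess=0)
    for _ in range(rounds):
        best = evolve_solution(best, evaluator)
        # Run the expensive benchmark on only three points per round:
        # two fixed anchors plus the current best candidate.
        evaluator = evolve_evaluator(evaluator, samples=[0, best, 10])
    return best, evaluator


if __name__ == "__main__":
    knob, evaluator = co_evolve()
    print(knob, evaluator.guess, true_latency(knob))
```

In this toy setting the candidate search never calls `true_latency` directly. It improves only because the evaluator is pulled toward the benchmark between rounds, which is the dependency the abstract describes as the evaluation bottleneck.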

2 Citations

1 Influential

5.5 Altmetric

31.5 Score
