2606.16497v1 Jun 15, 2026 cs.LG

daVinci-kernel: Co-Evolving Skill Selection, Summarization, and Utilization via RL for GPU Kernel Optimization

Mohan Jiang
Dayuan Fu
Jinlong Hou
Jiarui Hu
Liming Liu
Pengfei Li
Tong Wang
Dian Yang

GPU kernel optimization is a setting where functional correctness is assumed and execution efficiency is the objective. We present daVinci-kernel, a reinforcement learning framework that couples skill discovery with skill exploitation through a dynamically evolving skill library. daVinci-kernel jointly trains three agents: a Skill Selection Agent that retrieves relevant techniques via BM25 and LLM reranking, a Policy Agent that generates CUDA/Triton kernels over multiple turns conditioned on the selected skills, and a Skill Summary Agent that distills successful rollouts into reusable skills. Candidate skills are added to the library only after execution-based verification confirms reproducible speedups. All three agents share a single LLM backbone, are initialized via a structured SFT cold start on diversity-filtered data, and are then jointly optimized end-to-end with multi-turn REINFORCE and per-agent advantage estimation. On KernelBench, daVinci-kernel-14B achieves 37.2%, 70.6%, and 32.2% on Level 1, Level 2, and Level 3, respectively, under the Fast$_1$ threshold, outperforming the strongest prior RL-trained model, Dr.Kernel-14B.
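
The Fast$_1$ figures above can be read as follows. This is a minimal sketch, assuming the usual KernelBench-style definition in which Fast$_p$ is the fraction of tasks whose generated kernel is both correct and more than $p$ times faster than the reference. The names `KernelResult`, `speedup`, and `fast_p` are hypothetical and are not taken from the paper or from any library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KernelResult:
    """Outcome of one benchmark task (hypothetical record layout)."""
    correct: bool        # did the generated kernel pass the correctness check?
    baseline_ms: float   # runtime of the reference implementation
    kernel_ms: float     # runtime of the generated kernel


def speedup(result: KernelResult) -> float:
    """Speedup of the generated kernel relative to the reference."""
    return result.baseline_ms / result.kernel_ms


def fast_p(results: Sequence[KernelResult], p: float = 1.0) -> float:
    """Fraction of tasks that are correct AND strictly faster than p x baseline.

    Assumption: the threshold is strict (speedup > p), so a kernel that only
    matches the reference does not count toward Fast_1.
    """
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.correct and speedup(r) > p)
    return hits / len(results)


results = [
    KernelResult(correct=True, baseline_ms=2.0, kernel_ms=1.0),   # 2.0x, counts
    KernelResult(correct=True, baseline_ms=1.0, kernel_ms=2.0),   # 0.5x, slower
    KernelResult(correct=False, baseline_ms=4.0, kernel_ms=1.0),  # fast but wrong
    KernelResult(correct=True, baseline_ms=3.0, kernel_ms=1.5),   # 2.0x, counts
]
print(fast_p(results, p=1.0))
```

Under this reading, a fast but incorrect kernel contributes nothing to the score, which matches the abstract's framing of correctness as a given and efficiency as the objective.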
