2603.19173v1 Mar 19, 2026 cs.LG

SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits

Eric Chung (Citations: 359; h-index: 8)
Zihao Ye (Citations: 18; h-index: 3)
Tianqi Chen (Citations: 772; h-index: 5)
Sahil Modi (Citations: 240; h-index: 7)
Edward Lin (Citations: 88; h-index: 4)
S. Hari (Citations: 19; h-index: 3)
Christos Kozyrakis (Citations: 11; h-index: 2)
S. Damani (Citations: 143; h-index: 7)
Wei Liu (Citations: 247; h-index: 3)
Fengzhe Zhou (Citations: 38; h-index: 3)
Vinod Grover (Citations: 241; h-index: 3)
Roger A. Bringmann (Citations: 1,477; h-index: 11)
L. Ceze (Citations: 279; h-index: 4)
Humphrey Shi (Citations: 63; h-index: 4)
Qijing Huang (Citations: 837; h-index: 11)
Zhifan Ye (Citations: 409; h-index: 10)
N. Qin (Citations: 12; h-index: 2)
Dheeraj Peri (Citations: 97; h-index: 5)
Ouye Xie (Citations: 79; h-index: 3)
Aditya Kane (Citations: 65; h-index: 3)
Moshe Maor (Citations: 34; h-index: 3)
M. Behar (Citations: 86; h-index: 5)
Triston Cao (Citations: 10; h-index: 2)
Rishabh Mehta (Citations: 292; h-index: 4)
Vartika Singh (Citations: 210; h-index: 3)
Vikram Sharma Mailthody (Citations: 3; h-index: 1)
Te-Wei Chen (Citations: 7; h-index: 2)
Hanfeng Chen (Citations: 12; h-index: 3)
Wei Chen (Citations: 283; h-index: 3)
C. Zeller (Citations: 685; h-index: 9)
M.F. Lightstone (Citations: 16; h-index: 3)
Yuan Zhang (Citations: 38; h-index: 3)
Jingquan Wang (Citations: 17; h-index: 2)

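As agentic AI systems grow more capable of generating and optimizing GPU kernels, progress is held back by benchmarks that reward speedup over software baselines rather than closeness to hardware-efficient execution. This paper introduces SOL-ExecBench, a benchmark of 235 CUDA kernel optimization problems extracted from 124 production and emerging AI models (language, diffusion, vision, audio, video, and hybrid architectures) and targeting NVIDIA Blackwell GPUs. The benchmark covers forward and backward workloads in BF16, FP8, and NVFP4, including kernels whose best performance is expected to depend on Blackwell-specific capabilities. Whereas prior benchmarks evaluate kernels mainly against software implementations, SOL-ExecBench measures performance against analytically derived Speed-of-Light (SOL) bounds. These hardware-grounded bounds are computed by the SOLAR pipeline and give optimization a fixed target. The SOL Score quantifies how much of the gap between a release-defined scoring baseline and the hardware SOL bound a candidate kernel closes. To support robust evaluation of agentic optimizers, the benchmark also provides a sandboxed harness with GPU clock locking, L2 cache clearing, isolated subprocess execution, and static-analysis-based checks against common reward-hacking strategies. SOL-ExecBench shifts the goal of GPU kernel benchmarking from beating a mutable software baseline to closing the remaining gap to the hardware Speed-of-Light.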

Original Abstract

As agentic AI systems become increasingly capable of generating and optimizing GPU kernels, progress is constrained by benchmarks that reward speedup over software baselines rather than proximity to hardware-efficient execution. We present SOL-ExecBench, a benchmark of 235 CUDA kernel optimization problems extracted from 124 production and emerging AI models spanning language, diffusion, vision, audio, video, and hybrid architectures, targeting NVIDIA Blackwell GPUs. The benchmark covers forward and backward workloads across BF16, FP8, and NVFP4, including kernels whose best performance is expected to rely on Blackwell-specific capabilities. Unlike prior benchmarks that evaluate kernels primarily relative to software implementations, SOL-ExecBench measures performance against analytically derived Speed-of-Light (SOL) bounds computed by SOLAR, our pipeline for deriving hardware-grounded SOL bounds, yielding a fixed target for hardware-efficient optimization. We report a SOL Score that quantifies how much of the gap between a release-defined scoring baseline and the hardware SOL bound a candidate kernel closes. To support robust evaluation of agentic optimizers, we additionally provide a sandboxed harness with GPU clock locking, L2 cache clearing, isolated subprocess execution, and static analysis based checks against common reward-hacking strategies. SOL-ExecBench reframes GPU kernel benchmarking from beating a mutable software baseline to closing the remaining gap to hardware Speed-of-Light.
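
The abstract does not give the formula for the SOL Score or the internals of SOLAR, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a simple roofline-style bound (the larger of compute time and memory-traffic time) and a linear gap-closure ratio; the function names `roofline_sol_seconds` and `gap_closed`, and all hardware numbers, are hypothetical and are not taken from the paper or from Blackwell specifications.

```python
def roofline_sol_seconds(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Assumed roofline-style lower bound on runtime, in seconds.

    The kernel cannot finish faster than the slower of its two limits:
    arithmetic throughput or memory bandwidth.
    """
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / peak_bytes_per_s
    return max(compute_time, memory_time)


def gap_closed(t_candidate, t_baseline, t_sol):
    """Assumed gap-closure ratio (not the paper's exact SOL Score).

    0.0 means the candidate matches the baseline, 1.0 means it reaches the
    bound, and negative values mean it is slower than the baseline.
    """
    if t_candidate <= 0:
        raise ValueError("candidate time must be positive")
    if not 0 < t_sol < t_baseline:
        raise ValueError("expected 0 < t_sol < t_baseline")
    return (t_baseline - t_candidate) / (t_baseline - t_sol)


if __name__ == "__main__":
    # Made-up kernel and hardware figures, for illustration only.
    t_sol = roofline_sol_seconds(
        flops=2e12, bytes_moved=4e9,
        peak_flops_per_s=1e15, peak_bytes_per_s=8e12,
    )
    score = gap_closed(t_candidate=4e-3, t_baseline=6e-3, t_sol=t_sol)
    print(f"SOL bound: {t_sol * 1e3:.3f} ms, gap closed: {score:.2f}")
```

Under this sketch a ratio above 1.0 would mean the candidate ran faster than the analytical bound, which would more plausibly indicate a measurement problem or a reward-hacking attempt than a real speedup; that is the kind of case the sandboxed harness described in the abstract is meant to catch.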

3 Citations

0 Influential

5.5 Altmetric

30.5 Score
