2605.29357v1 May 28, 2026 cs.AI

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

Jingjing Wu
Sijun He
Siqi Bao
Yiqun Liu
Enrong Zheng
Honglei Qiu
Tai Liang
Yuhang Zhou
Yiwei Zhang
Weihan Yi
Yingsheng Wu
Ruqing Yang
Dongyang Chen
Xinqi Li

Modern tensor compilers such as TorchInductor deliver substantial speedups on mainstream models, yet face a systematic performance ceiling on long-tail workloads: our profiling shows that 43% of real-world subgraphs experience end-to-end slowdowns under default compilation. While LLMs offer a path toward automated optimization, existing efforts focus on standalone kernel generation. We argue that pass generation, where LLMs author structured graph transformations that integrate directly into compiler pipelines, is the more appropriate abstraction. We propose PassNet, the first large-scale ecosystem for LLM-based compiler pass generation, comprising: (1) PassNet-Dataset, over 18K unique computational graphs from 100K real-world models; and (2) PassBench, 200 curated long-tail fusible tasks (2,060 subgraphs in total) evaluated under the Error-aware Speedup Score (ES_t), a metric unifying correctness, stability, and performance, with layered integrity defenses against systematic LLM exploitation. Experiments reveal that PassBench is both highly discriminative and genuinely unsaturated: the best frontier model trails TorchInductor by 37% in aggregate, yet on individual subgraphs LLMs achieve up to 3x speedup over the same compiler, indicating that the bottleneck is consistency, not capability. Fine-tuning a small model on merely ~4K PassNet trajectories yields a 2.67x improvement, approaching frontier-model performance. This demonstrates substantial headroom and validates PassNet as live training infrastructure for advancing LLM-driven compiler optimization. All data, benchmarks, and tooling are publicly available.
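
To make the notion of a graph pass concrete, the sketch below shows what a "structured graph transformation" can look like on a toy intermediate representation. It is illustrative only: the `Node` type, the `fuse_add_relu` pass, and the `evaluate` helper are hypothetical names invented for this example, and they reflect neither the PassNet data format nor the TorchInductor API. The pass rewrites `relu(add(a, b))` into a single fused `add_relu(a, b)` node whenever the intermediate sum has no other consumer, and the evaluator checks that the rewrite preserves the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One operation in a toy graph IR (hypothetical, for illustration)."""
    name: str
    op: str            # one of: "input", "add", "relu", "add_relu"
    args: tuple = ()   # names of the nodes that produce this node's operands


def fuse_add_relu(graph):
    """Rewrite relu(add(a, b)) into add_relu(a, b) when the add has one consumer.

    `graph` is a list of Nodes in topological order; the last node is the output.
    """
    by_name = {n.name: n for n in graph}

    # Count how many times each node is consumed.
    uses = {}
    for n in graph:
        for a in n.args:
            uses[a] = uses.get(a, 0) + 1

    fused_away = set()
    rewritten = []
    for n in graph:
        producer = by_name.get(n.args[0]) if n.op == "relu" and n.args else None
        if producer is not None and producer.op == "add" and uses[producer.name] == 1:
            # The fused node keeps the relu's name so downstream users are unaffected.
            fused_away.add(producer.name)
            rewritten.append(Node(n.name, "add_relu", producer.args))
        else:
            rewritten.append(n)

    # Drop the add nodes that were absorbed into a fused node.
    return [n for n in rewritten if n.name not in fused_away]


def evaluate(graph, inputs):
    """Reference interpreter used to check that a pass preserves semantics."""
    env = dict(inputs)
    for n in graph:
        if n.op == "input":
            continue
        vals = [env[a] for a in n.args]
        if n.op == "add":
            env[n.name] = vals[0] + vals[1]
        elif n.op == "relu":
            env[n.name] = max(vals[0], 0.0)
        elif n.op == "add_relu":
            env[n.name] = max(vals[0] + vals[1], 0.0)
        else:
            raise ValueError(f"unknown op: {n.op}")
    return env[graph[-1].name]


graph = [
    Node("x", "input"),
    Node("y", "input"),
    Node("s", "add", ("x", "y")),
    Node("r", "relu", ("s",)),
]
fused = fuse_add_relu(graph)
```

A pass of this shape is only worth applying when it is both correct and beneficial, which is why the sketch pairs the rewrite with a reference interpreter and a single-consumer guard. The abstract's emphasis on correctness and stability alongside speedup suggests that real passes face the same checks at much larger scale.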
