2606.10890v1 Jun 09, 2026 cs.LG

Optimal Post-Training Quantization Scales and Where to Find Them

Nicholas Fraser
Ian Colbert
Pablo Monteagudo-Lago
Giuseppe Franco
J. Amboage

Post-training quantization (PTQ) compresses large language models by mapping weights to low-bit representations. The scaling factor that defines the quantization grid is typically chosen using simple, data-free heuristics. In this work, we present PiSO (Piecewise Scale Optimization), an algorithm that leverages calibration data to compute the optimal channel-wise weight scales exactly and efficiently under round-to-nearest quantization. PiSO partitions the scale search space into finitely many intervals on which the objective admits a closed-form minimizer. We extend PiSO to group-wise quantization via principled heuristics and propose effective strategies for interleaving scale optimization with error correction. Experiments on Llama and Qwen models across multiple model sizes and target weight bit-widths demonstrate consistent improvements in perplexity and downstream zero-shot accuracy, both standalone and combined with error correction. In particular, we observe increased benefits as the target bit-width narrows and quantization becomes more challenging.
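
The piecewise idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a symmetric signed integer grid with no zero-point, a single output channel with weights `w`, calibration inputs `X`, and the output-error objective ||Xw - s·Xq(s)||², none of which the abstract spells out. The function names `rtn` and `piecewise_scale_search` are hypothetical. Under these assumptions, the rounded codes q(s) change only when some w_i/s crosses a half-integer, so the positive scales split into finitely many intervals, and on each interval the objective is a quadratic in s with a closed-form minimizer. The loop below is deliberately naive and makes no attempt at the efficiency the paper claims.

```python
import numpy as np

def rtn(w, s, qmin, qmax):
    """Round-to-nearest integer codes for scale s, clipped to the integer grid."""
    return np.clip(np.round(w / s), qmin, qmax)

def piecewise_scale_search(w, X, bits=4):
    """Illustrative search over one channel's scale (assumed objective, see lead-in).

    Returns (scale, integer codes, squared output error).
    """
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    target = X @ w
    base_err = float(target @ target)  # error when every code is zero

    # Codes change only where w_i / s crosses a half-integer inside the clip range.
    halves = np.arange(qmin, qmax) + 0.5          # qmin+0.5, ..., qmax-0.5
    ratios = w[:, None] / halves[None, :]
    breaks = np.unique(ratios[ratios > 0])        # sorted positive breakpoints
    if breaks.size == 0:                          # all-zero weights: nothing to fit
        return 1.0, np.zeros_like(w), base_err

    # The last edge stands in for the unbounded interval where all codes are zero.
    edges = np.concatenate(([0.0], breaks, [2.0 * breaks[-1]]))
    best = (float(edges[-1]), np.zeros_like(w), base_err)

    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = 0.5 * (lo + hi)
        q = rtn(w, mid, qmin, qmax)               # codes are constant on (lo, hi)
        Xq = X @ q
        denom = float(Xq @ Xq)
        if denom == 0.0:
            continue                              # same error as the zero solution
        # Closed-form minimizer of ||target - s * Xq||^2, restricted to [lo, hi].
        s = float(np.clip((target @ Xq) / denom, lo, hi))
        if s <= 0.0:
            continue                              # infimum not attained at a valid scale
        err = float(np.sum((target - s * Xq) ** 2))
        if err < best[2]:
            best = (s, q, err)
    return best
```

Because every positive scale falls in one of the enumerated intervals, the returned error should be no worse than that of any grid search over the same objective, up to floating-point tolerance. When the minimizer lands exactly on an interval boundary, the returned codes correspond to one of the two valid tie-breaking choices at that scale.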
