2602.01227v2 Feb 01, 2026 cs.CL

Supervised Fine-Tuning Needs to Unlock the Potential of Token Priority

Wen-song Ye (Citations: 259, h-index: 7)

Zeyu Qin (Citations: 158, h-index: 3)

Zhanming Shen (Citations: 54, h-index: 2)

Jiaqi Hu (Citations: 182, h-index: 5)

Hao Chen (Citations: 1, h-index: 1)

Xiaomeng Hu (Citations: 137, h-index: 4)

Haokai Xu (Citations: 142, h-index: 4)

Gang Chen (Citations: 531, h-index: 10)

Yi R. Fung (Citations: 72, h-index: 3)

Haobo Wang (Citations: 602, h-index: 7)

Original Abstract

The transition from fitting empirical data to achieving true human utility is fundamentally constrained by a granularity mismatch, where fine-grained autoregressive generation is often supervised by coarse or uniform signals. This position paper advocates Token Priority as the essential bridge, formalizing Supervised Fine-Tuning (SFT) not as simple optimization but as a precise distribution reshaping process that aligns raw data with the ideal alignment manifold. We analyze recent breakthroughs through this unified lens, categorizing them into two distinct regimes: Positive Priority for noise filtration and Signed Priority for toxic modes unlearning. We revisit existing progress and limitations, identify key challenges, and suggest directions for future research.
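
To make the granularity mismatch concrete, the following is a minimal sketch of a token-weighted SFT objective. It is an illustration only, not the paper's formulation: the function name `weighted_token_nll`, the normalization by total absolute weight, and the reading of Positive Priority as weights in [0, 1] and Signed Priority as weights in [-1, 1] are all assumptions introduced here.

```python
import math
from typing import Sequence


def weighted_token_nll(
    token_logprobs: Sequence[float],
    priorities: Sequence[float],
) -> float:
    """Per-token weighted negative log-likelihood (hypothetical helper).

    token_logprobs[t] is log p(y_t | y_<t, x) under the model.
    priorities[t] is an assumed per-token weight w_t:
      - w_t = 1 for every t recovers uniform SFT supervision,
      - w_t in [0, 1] down-weights or drops tokens (assumed "positive" regime),
      - w_t < 0 pushes probability away from a token (assumed "signed" regime).
    The result is normalized by sum(|w_t|), which is a choice made for this sketch.
    """
    if len(token_logprobs) != len(priorities):
        raise ValueError("token_logprobs and priorities must have equal length")
    norm = sum(abs(w) for w in priorities)
    if norm == 0.0:
        return 0.0
    total = sum(-w * lp for w, lp in zip(priorities, token_logprobs))
    return total / norm


# Toy sequence of three target tokens with model probabilities 0.5, 0.25, 0.5.
logprobs = [math.log(0.5), math.log(0.25), math.log(0.5)]

uniform_loss = weighted_token_nll(logprobs, [1.0, 1.0, 1.0])    # coarse, uniform signal
positive_loss = weighted_token_nll(logprobs, [1.0, 0.0, 1.0])   # middle token filtered as noise
signed_loss = weighted_token_nll(logprobs, [1.0, -1.0, 1.0])    # middle token treated as undesired
```

Under uniform weights every token contributes equally to the loss. Zeroing a weight removes that token from supervision, and a negative weight reverses the sign of its gradient contribution, which is one plausible reading of the two regimes named in the abstract.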
