2601.18588v2 Jan 26, 2026 cs.AI

Stability as a Liability: Systematic Breakdown of Linguistic Structure in LLMs

Lin Qi, Qing Yang, X. Meng, Ling Luo, J. Hao, Qinyu Wang, Ruiping Yin, Wenbo Wu, Qian Zeng, Renzhi Lu

Abstract

Training stability is typically regarded as a prerequisite for reliable optimization in large language models. In this work, we analyze how stabilizing training dynamics affects the induced generation distribution. We show that under standard maximum likelihood training, stable parameter trajectories lead stationary solutions to approximately minimize the forward KL divergence to the empirical distribution, while implicitly reducing generative entropy. As a consequence, the learned model can concentrate probability mass on a limited subset of empirical modes, exhibiting systematic degeneration despite smooth loss convergence. We empirically validate this effect using a controlled feedback-based training framework that stabilizes internal generation statistics, observing consistent low-entropy outputs and repetitive behavior across architectures and random seeds. These results indicate that optimization stability and generative expressivity are not inherently aligned, and that stability alone is an insufficient indicator of generative quality.
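
The quantities named in the abstract can be made concrete on a toy discrete distribution. The sketch below is an illustration only, not the paper's method or experimental setup: the four-mode empirical distribution `p_emp` and the peaked model `q_peaked` are hypothetical numbers chosen for the example. It checks the standard identity that the expected negative log-likelihood equals the entropy of the empirical distribution plus the forward KL divergence, which is why maximum likelihood training minimizes forward KL, and it shows that a model concentrating mass on a subset of modes has lower entropy than the data.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Expected negative log-likelihood of model q under data p (in nats)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def forward_kl(p, q):
    """Forward KL divergence KL(p || q) (in nats)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical empirical distribution over four modes (illustrative numbers).
p_emp = [0.4, 0.3, 0.2, 0.1]
# Hypothetical model that concentrates mass on the first two modes.
q_peaked = [0.60, 0.38, 0.01, 0.01]

# Maximum likelihood objective and its decomposition H(p) + KL(p || q).
nll = cross_entropy(p_emp, q_peaked)
decomposed = entropy(p_emp) + forward_kl(p_emp, q_peaked)

print(f"NLL of q under p:        {nll:.4f}")
print(f"H(p) + KL(p || q):       {decomposed:.4f}")
print(f"entropy of data p:       {entropy(p_emp):.4f}")
print(f"entropy of peaked model: {entropy(q_peaked):.4f}")
```

Because the entropy of the empirical distribution does not depend on the model, lowering the negative log-likelihood and lowering the forward KL divergence are the same thing in this toy setting. The entropy gap between `p_emp` and `q_peaked` is the kind of low-entropy concentration the abstract describes, though the example says nothing about how stable training dynamics would produce it.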
