2601.05858v1 Jan 09, 2026 cs.CL

CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning

R. Ionescu (Citations: 289, h-index: 11)

Alexandra Dragomir (Citations: 1, h-index: 1)

Florin Brad (Citations: 231, h-index: 9)

Abstract

Large language models (LLMs) have demonstrated competitive performance in zero-shot multilingual machine translation (MT). Some follow-up works further improved MT performance via preference optimization, but they leave a key aspect largely underexplored: the order in which data samples are presented during training. We address this topic by integrating curriculum learning into various state-of-the-art preference optimization algorithms to boost MT performance. We introduce a novel curriculum learning strategy with restarts (CLewR), which reiterates an easy-to-hard curriculum multiple times during training to effectively mitigate the catastrophic forgetting of easy examples. We demonstrate consistent gains across several model families (Gemma2, Qwen2.5, Llama3.1) and preference optimization techniques. We publicly release our code at https://github.com/alexandra-dragomir/CLewR.
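
The ordering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It only shows one plausible reading of "an easy-to-hard curriculum repeated multiple times". The function name curriculum_with_restarts, the difficulty callback, the toy difficulty scores, and the parameters num_restarts and batch_size are all assumptions introduced here for illustration. The abstract does not say how difficulty is measured or how many restarts are used.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_with_restarts(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    num_restarts: int,
    batch_size: int,
) -> Iterator[List[T]]:
    """Yield batches in easy-to-hard order, repeating the full sweep.

    Hypothetical helper: `difficulty` is any user-supplied score where a
    lower value means an easier example. The scoring used in the paper
    is not given in the abstract.
    """
    if num_restarts < 1 or batch_size < 1:
        raise ValueError("num_restarts and batch_size must be >= 1")

    # Sort once from easiest to hardest (Python's sort is stable).
    ordered = sorted(examples, key=difficulty)

    # Each restart replays the same easy-to-hard sweep, so easy examples
    # are revisited after the model has seen the hard ones.
    for _ in range(num_restarts):
        for start in range(0, len(ordered), batch_size):
            yield list(ordered[start:start + batch_size])


# Toy data: (example id, assumed difficulty score).
data = [("hard", 0.9), ("easy", 0.1), ("medium", 0.5), ("easy2", 0.2)]

batches = list(
    curriculum_with_restarts(
        data,
        difficulty=lambda ex: ex[1],
        num_restarts=2,
        batch_size=2,
    )
)
order = [[ex[0] for ex in batch] for batch in batches]
print(order)
```

In a real preference-optimization run, each yielded batch would be passed to the training step of the chosen algorithm. Without restarts the loop would make a single easy-to-hard pass, which is the setting in which, according to the abstract, easy examples risk being forgotten.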

1 Citation

0 Influential

25.5 Altmetric

128.5 Score
