2602.11164v1 Jan 17, 2026 cs.LG

Automated Optimization Modeling via a Localizable Error-Driven Perspective

Tao Zhong (citations: 105, h-index: 5)

Weiting Liu (citations: 123, h-index: 5)

Han Wu (citations: 161, h-index: 7)

Yufei Kuang (citations: 296, h-index: 11)

Xiongwei Han (citations: 414, h-index: 12)

Jianfeng Feng (citations: 298, h-index: 7)

Wenlian Lu (citations: 228, h-index: 6)

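Automated optimization modeling with large language models (LLMs) is emerging as a promising way to support complex human decision-making. Post-training has become an important technique for improving LLM capabilities, but its effectiveness is severely limited by the scarcity and underutilization of high-quality training data. By profiling in detail the error patterns of diverse problem-response pairs drawn from post-training, we identify two fundamental limitations of existing automated optimization modeling approaches: (L1) the sparsity of error-specific problems and (L2) the sparse rewards associated with difficult problems. We show that these limitations can keep domain-specific post-training of LLMs from reaching optimal performance. To address both limitations, we propose MIND (automated optimization modeling via a localizable error-driven perspective), a new error-driven learning framework that customizes the whole model training pipeline from data synthesis to post-training. MIND rests on a key observation about how errors localize in optimization modeling: modeling errors tend to stay confined to specific semantic segments rather than propagating through the entire solution. Thus, unlike holistic reasoning tasks such as mathematical proofs, MIND builds a focused, high-density training corpus and proposes Dynamic Supervised Fine-Tuning Policy Optimization (DFPO) to tackle difficult problems. Experiments on six benchmarks show that MIND consistently outperforms existing state-of-the-art automated optimization modeling approaches.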

Original Abstract

Automated optimization modeling via Large Language Models (LLMs) has emerged as a promising approach to assist complex human decision-making. While post-training has become a pivotal technique to enhance LLMs' capabilities in this domain, its effectiveness is severely constrained by the scarcity and underutilization of high-quality training data. Through a detailed profiling of error patterns across various problem-response pairs drawn from post-training, we identify two fundamental limitations of existing automated optimization modeling approaches: (L1) the sparsity of error-specific problems and (L2) the sparse rewards associated with difficult problems. We demonstrate that these limitations can result in suboptimal performance in domain-specific post-training for LLMs. To tackle these two limitations, we propose a novel error-driven learning framework, automated optimization modeling via a localizable error-driven perspective (MIND), that customizes the whole model training framework from data synthesis to post-training. MIND is based on our key observation of the unique localizable patterns in error propagation of optimization modeling: modeling errors may remain localized to specific semantic segments and do not propagate throughout the entire solution. Thus, in contrast to holistic reasoning tasks such as mathematical proofs, MIND leverages the construction of a focused, high-density training corpus and proposes Dynamic Supervised Fine-Tuning Policy Optimization (DFPO) to tackle difficult problems through localized refinement. Experiments on six benchmarks demonstrate that MIND consistently outperforms all the state-of-the-art automated optimization modeling approaches.
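To make the idea of a localizable modeling error concrete, here is a minimal illustrative sketch. It is not the MIND pipeline or DFPO, whose details the abstract does not give. It assumes a formulation can be split into named semantic segments (objective, individual constraints, bounds) and uses whitespace-insensitive string comparison as a crude stand-in for a real semantic check. The segment names, the toy problem, and the `localize_errors` helper are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical representation: a formulation as named semantic segments.
Formulation = Dict[str, str]


def _normalize(expr: Optional[str]) -> Optional[str]:
    """Strip all whitespace so purely cosmetic differences are ignored."""
    return None if expr is None else "".join(expr.split())


def localize_errors(candidate: Formulation, reference: Formulation) -> List[str]:
    """Return the names of segments where the candidate differs from the reference.

    String equality is only a stand-in; an actual system would need a
    semantic comparison (assumption for illustration).
    """
    names = sorted(set(candidate) | set(reference))
    return [
        name
        for name in names
        if _normalize(candidate.get(name)) != _normalize(reference.get(name))
    ]


# Toy production-planning model (made up for this example).
reference = {
    "objective": "max 3*x + 5*y",
    "constraint_labor": "2*x + 4*y <= 40",
    "constraint_material": "x + y <= 15",
    "bounds": "x, y >= 0",
}

# The candidate flips one inequality; every other segment is correct.
candidate = dict(reference, constraint_labor="2*x + 4*y >= 40")

print(localize_errors(candidate, reference))
```

In this toy case only the labor constraint is flagged, while the objective, the material constraint, and the bounds remain usable. That is the sense in which an error stays confined to one segment instead of invalidating the whole solution.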
