2604.04324v1 Apr 06, 2026 cs.AI

RESCORE: Analyzing Control Systems Research Papers through LLM-Based Simulation Reproduction

RESCORE: LLM-Driven Simulation Recovery in Control Systems Research Papers

P. Krishnamurthy

Citations: 4,262

h-index: 33

F. Khorrami

Citations: 6,433

h-index: 41

Ramesh Karri

Citations: 336

h-index: 9

V. Bhat

Citations: 102

h-index: 4

Shiqing Wei

Citations: 39

h-index: 3

Ali Umut Kaypak

Citations: 59

h-index: 3

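Reconstructing the numerical simulations in control systems research papers is often difficult because parameters are underspecified and implementation details are ambiguous. In this work, we define the notion of "Paper to Simulation Recoverability": the ability of an automated system to generate executable code that faithfully reproduces a paper's results. We build a benchmark of 500 papers presented at the IEEE Conference on Decision and Control (CDC) and propose RESCORE, an LLM-based agentic framework with three components (Analyzer, Coder, and Verifier). RESCORE uses iterative execution feedback and visual comparison to improve reconstruction accuracy. The method successfully reproduces simulations for 40.7% of the benchmark instances, outperforming single-pass generation. Notably, the automated RESCORE pipeline is estimated to be about 10 times faster than manual replication, substantially reducing the time and effort needed to verify published control methods. We will release the benchmark and agents to support progress in automated research replication.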

Original Abstract

Reconstructing numerical simulations from control systems research papers is often hindered by underspecified parameters and ambiguous implementation details. We define the task of Paper-to-Simulation Recoverability, the ability of an automated system to generate executable code that faithfully reproduces a paper's results. We curate a benchmark of 500 papers from the IEEE Conference on Decision and Control (CDC) and propose RESCORE, a three-component LLM agentic framework consisting of an Analyzer, a Coder, and a Verifier. RESCORE uses iterative execution feedback and visual comparison to improve reconstruction fidelity. Our method successfully recovers task-coherent simulations for 40.7% of benchmark instances, outperforming single-pass generation. Notably, the RESCORE automated pipeline achieves an estimated 10x speedup over manual human replication, drastically cutting the time and effort required to verify published control methodologies. We will release our benchmark and agents to foster community progress in automated research replication.
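
The abstract names three roles (Analyzer, Coder, Verifier) and an iterative execution-feedback loop but gives no implementation details. The sketch below is only one plausible reading of that loop, not the authors' code. Every name in it (`recover_simulation`, `Attempt`, the `toy_*` functions, the spec keys) is hypothetical, and the three roles are plain Python stand-ins rather than LLM calls. A numeric check on a toy scalar system x[k+1] = a * x[k] stands in for the visual comparison the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Spec = Dict[str, float]  # hypothetical container for what the analyzer extracts


@dataclass
class Attempt:
    code: str
    accepted: bool
    feedback: str


def recover_simulation(
    paper_text: str,
    analyze: Callable[[str], Spec],
    write_code: Callable[[Spec, str], str],
    verify: Callable[[Spec, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[Attempt]:
    """Analyze once, then alternate coding and verifying until accepted."""
    spec = analyze(paper_text)
    feedback = ""
    attempts: List[Attempt] = []
    for _ in range(max_rounds):
        code = write_code(spec, feedback)
        accepted, feedback = verify(spec, code)
        attempts.append(Attempt(code, accepted, feedback))
        if accepted:
            break
    return attempts


# --- Toy stand-ins (assumptions for illustration, not part of RESCORE) ---

def toy_analyze(paper_text: str) -> Spec:
    # Pretend the paper specifies x[k+1] = a * x[k] and reports the final state.
    return {"a": 0.5, "x0": 1.0, "steps": 3, "reported_final": 0.125}


def toy_write_code(spec: Spec, feedback: str) -> str:
    # The first draft misses the parameter `a`; any feedback triggers the fix.
    a = spec["a"] if feedback else 1.0
    return (
        f"x = {spec['x0']}\n"
        f"for _ in range({int(spec['steps'])}):\n"
        f"    x = {a} * x\n"
    )


def toy_verify(spec: Spec, code: str) -> Tuple[bool, str]:
    # Execute the candidate code and compare its result with the reported value.
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception as exc:
        return False, f"execution error: {exc}"
    error = abs(namespace["x"] - spec["reported_final"])
    if error > 1e-9:
        return False, f"final state off by {error:.3f}"
    return True, "matches reported value"


attempts = recover_simulation("(paper text)", toy_analyze, toy_write_code, toy_verify)
for number, attempt in enumerate(attempts, start=1):
    print(number, attempt.accepted, attempt.feedback)
```

In this toy run the first draft is rejected because its final state disagrees with the reported value, and the second draft is accepted once the feedback is used. Passing the three roles in as callables keeps the loop independent of how each role is actually implemented.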

0 Citations

0 Influential

20.5 Altmetric

102.5 Score
