2604.05663v1 Apr 07, 2026 cs.AI

CuraLight: Debate-Guided Data Curation for LLM-Centered Traffic Signal Control

Shengzhe Xu

Citations: 214

h-index: 6

Qing Guo

Citations: 2

h-index: 1

Xinhang Li

Citations: 138

h-index: 7

Junyu Chen

Citations: 2

h-index: 1

Zheng Guo

Citations: 2

h-index: 1

Lin Zhang

Citations: 2

h-index: 1

Lei Li

Citations: 30

h-index: 2

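Traffic signal control (TSC) is a core component of intelligent transportation systems (ITS) for reducing congestion, emissions, and travel time. Recent approaches based on reinforcement learning (RL) and large language models (LLMs) have improved adaptivity, but they still suffer from limited interpretability, scarce interaction data, and weak generalization across diverse intersections. This paper proposes CuraLight, an LLM-centered framework in which an RL agent assists the fine-tuning of an LLM-based traffic signal controller. The RL agent explores the traffic environment and generates high-quality interaction trajectories, which are converted into prompt-response pairs for imitation fine-tuning. In addition, a multi-LLM ensemble deliberation system evaluates candidate signal timing actions through structured debate and provides preference-aware supervision signals for training. In SUMO experiments on heterogeneous real-world networks from Jinan, Hangzhou, and Yizhuang, CuraLight consistently outperformed state-of-the-art baselines, reducing average travel time by 5.34%, average queue length by 5.14%, and average waiting time by 7.02%. These results show that combining RL-assisted exploration with deliberation-based data curation is effective for scalable and interpretable traffic signal control.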

Original Abstract

Traffic signal control (TSC) is a core component of intelligent transportation systems (ITS), aiming to reduce congestion, emissions, and travel time. Recent approaches based on reinforcement learning (RL) and large language models (LLMs) have improved adaptivity, but still suffer from limited interpretability, insufficient interaction data, and weak generalization to heterogeneous intersections. This paper proposes CuraLight, an LLM-centered framework where an RL agent assists the fine-tuning of an LLM-based traffic signal controller. The RL agent explores traffic environments and generates high-quality interaction trajectories, which are converted into prompt-response pairs for imitation fine-tuning. A multi-LLM ensemble deliberation system further evaluates candidate signal timing actions through structured debate, providing preference-aware supervision signals for training. Experiments conducted in SUMO across heterogeneous real-world networks from Jinan, Hangzhou, and Yizhuang demonstrate that CuraLight consistently outperforms state-of-the-art baselines, reducing average travel time by 5.34 percent, average queue length by 5.14 percent, and average waiting time by 7.02 percent. The results highlight the effectiveness of combining RL-assisted exploration with deliberation-based data curation for scalable and interpretable traffic signal control.
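The data-curation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the trajectory schema, the prompt template, or the debate protocol, so everything here is an assumption: `Step`, `to_prompt_response`, and `to_preference_pair` are hypothetical names, and a simple majority vote stands in for the paper's structured multi-LLM debate.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """One RL decision at an intersection (hypothetical schema)."""
    intersection_id: str
    queue_lengths: dict  # vehicles queued per incoming lane, e.g. {"north": 4}
    chosen_phase: str    # signal phase the RL agent selected


def to_prompt_response(step: Step) -> dict:
    """Render one trajectory step as a prompt-response pair for fine-tuning."""
    lanes = ", ".join(
        f"{lane}={count}" for lane, count in sorted(step.queue_lengths.items())
    )
    prompt = (
        f"Intersection {step.intersection_id}. "
        f"Queue lengths: {lanes}. "
        "Which signal phase should be activated next?"
    )
    return {"prompt": prompt, "response": step.chosen_phase}


def to_preference_pair(prompt: str, votes: list) -> Optional[dict]:
    """Turn judge votes into a chosen/rejected pair by majority vote.

    Assumption: each judge LLM casts one vote for a candidate action.
    Returns None when the top two candidates tie or when only one
    candidate received votes, since no rejected action can be named.
    """
    ranked = Counter(votes).most_common()
    if len(ranked) < 2 or ranked[0][1] == ranked[1][1]:
        return None
    return {"prompt": prompt, "chosen": ranked[0][0], "rejected": ranked[1][0]}


if __name__ == "__main__":
    step = Step("J1", {"north": 4, "east": 1}, "NS_through")
    pair = to_prompt_response(step)
    print(pair)
    print(to_preference_pair(pair["prompt"], ["NS_through", "EW_through", "NS_through"]))
```

The first function yields imitation-style examples, while the second yields the kind of chosen/rejected record that preference-based training methods typically consume. How CuraLight actually weighs debate outcomes is not stated in the abstract.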

0 Citations

0 Influential

3.5 Altmetric

17.5 Score

