2602.07422v1 Feb 07, 2026 cs.CR

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Jiaheng Zhang (Citations: 229, h-index: 6)

Tianyi Wu (Citations: 6, h-index: 1)

Mingzhe Du (Citations: 71, h-index: 4)

Yue Liu (Citations: 11, h-index: 1)

Cheng-Lin Yang (Citations: 21, h-index: 2)

T. Zhuo (Citations: 9, h-index: 2)

See-Kiong Ng (Citations: 37, h-index: 4)

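Large language models (LLMs) are increasingly used in software development, but their tendency to generate insecure code is a major obstacle to real-world deployment. Existing secure code alignment methods often face a trade-off between functionality and security, improving security at the cost of substantial utility degradation. In this work, we propose SecCoderX, an online reinforcement learning framework that generates secure code while preserving functionality. SecCoderX first bridges vulnerability detection and secure code generation by leveraging mature detection resources. Specifically, it (i) synthesizes diverse, reality-grounded vulnerability-inducing coding tasks for online RL rollouts, and (ii) trains a reasoning-based vulnerability reward model that provides scalable and reliable security supervision. These components are unified in an online RL loop that aligns code LLMs to generate secure and functional code. Extensive experiments show that SecCoderX achieves state-of-the-art performance, improving Effective Safety Rate (ESR) by approximately 10% over unaligned models, whereas prior methods often degrade ESR by 14-54%. We release our code, dataset, and model checkpoints at https://github.com/AndrewWTY/SecCoderX.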

Original Abstract

Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to real-world deployment. Existing secure code alignment methods often suffer from a functionality-security paradox, improving security at the cost of substantial utility degradation. We propose SecCoderX, an online reinforcement learning framework for functionality-preserving secure code generation. SecCoderX first bridges vulnerability detection and secure code generation by repurposing mature detection resources in two ways: (i) synthesizing diverse, reality-grounded vulnerability-inducing coding tasks for online RL rollouts, and (ii) training a reasoning-based vulnerability reward model that provides scalable and reliable security supervision. Together, these components are unified in an online RL loop to align code LLMs to generate secure and functional code. Extensive experiments demonstrate that SecCoderX achieves state-of-the-art performance, improving Effective Safety Rate (ESR) by approximately 10% over unaligned models, whereas prior methods often degrade ESR by 14-54%. We release our code, dataset, and model checkpoints at https://github.com/AndrewWTY/SecCoderX.
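The abstract describes rewarding rollouts for being both functional and secure, but does not give the reward formula or the exact definition of ESR. The following is a minimal sketch of how such a combined signal might be scored, under stated assumptions: the names `Rollout`, `combined_reward`, `joint_success_rate`, the `security_weight` parameter, and the two toy checkers are all hypothetical and are not taken from the SecCoderX code. In particular, `joint_success_rate` is only an illustrative stand-in, not the paper's ESR.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rollout:
    """One sampled completion for a coding task (hypothetical structure)."""
    task: str
    code: str


def combined_reward(
    rollout: Rollout,
    is_functional: Callable[[Rollout], bool],
    vulnerability_score: Callable[[Rollout], float],
    security_weight: float = 0.5,
) -> float:
    """Blend a functionality check with a security score.

    Assumption: vulnerability_score returns a value in [0, 1], where 1 means
    the checker considers the code vulnerable. The linear blend is an
    illustrative choice, not the paper's reward.
    """
    functional = 1.0 if is_functional(rollout) else 0.0
    secure = 1.0 - vulnerability_score(rollout)
    return (1.0 - security_weight) * functional + security_weight * secure


def joint_success_rate(
    rollouts: List[Rollout],
    is_functional: Callable[[Rollout], bool],
    vulnerability_score: Callable[[Rollout], float],
) -> float:
    """Fraction of rollouts that are functional AND judged not vulnerable."""
    if not rollouts:
        return 0.0
    ok = sum(
        1
        for r in rollouts
        if is_functional(r) and vulnerability_score(r) < 0.5
    )
    return ok / len(rollouts)


# Toy stand-ins for a unit-test harness and a learned reward model.
def toy_is_functional(rollout: Rollout) -> bool:
    return "return" in rollout.code


def toy_vulnerability_score(rollout: Rollout) -> float:
    return 1.0 if "eval(" in rollout.code else 0.0
```

In a real training loop the toy checkers would be replaced by a test harness and a trained reward model; the sketch only illustrates why a method that improves security while breaking functionality would score poorly on a joint metric.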
