2601.04526v1 Jan 08, 2026 cs.SE

Advancing Language Models for Code-related Tasks

Abstract

Recent advances in language models (LMs) have driven significant progress in various software engineering tasks. However, existing LMs still struggle with complex programming scenarios due to limitations in data quality, model architecture, and reasoning capability. This research systematically addresses these challenges through three complementary directions: (1) improving code data quality with a code difference-guided adversarial augmentation technique (CODA) and a code denoising technique (CodeDenoise); (2) enhancing model architecture via syntax-guided code LMs (LEAM and LEAM++); and (3) advancing model reasoning with a prompting technique (muFiX) and an agent-based technique (Specine). These techniques aim to promote the practical adoption of LMs in software development and further advance intelligent software engineering.
