2603.01048v1 Mar 01, 2026 cs.SE

RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair

Zhongqiang Pan (Citations: 25, h-index: 2)
Wenkang Zhong (Citations: 78, h-index: 4)
Yile Feng (Citations: 5, h-index: 1)
Bin Luo (Citations: 177, h-index: 7)
Vincent Ng (Citations: 578, h-index: 10)
Chuanyi Li (Citations: 24, h-index: 3)

Abstract

Automated program repair (APR) struggles to scale from isolated functions to full repositories, as it demands a global, task-aware understanding to locate necessary changes. Current methods, limited by context and reliant on shallow retrieval or costly agent iterations, falter on complex cross-file issues. To this end, we propose RepoRepair, a novel documentation-enhanced approach for repository-level fault localization and program repair. Our core insight is to leverage LLMs to generate hierarchical code documentation (from functions to files) for code repositories, creating structured semantic abstractions that enable LLMs to comprehend repository-level context and dependencies. Specifically, RepoRepair first employs a text-based LLM (e.g., DeepSeek-V3) to generate file/function-level code documentation for repositories, which serves as auxiliary knowledge to guide fault localization. Subsequently, based on the fault localization results and the issue description, a powerful LLM (e.g., Claude-4) attempts to repair the identified suspicious code snippets. Evaluated on SWE-bench Lite, RepoRepair achieves a 45.7% repair rate at a low cost of $0.44 per fix. On SWE-bench Multimodal, it delivers state-of-the-art performance with a 37.1% repair rate despite a higher cost of $0.56 per fix, demonstrating robust and cost-effective performance across diverse problem domains.
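The three stages named in the abstract (hierarchical documentation, documentation-guided fault localization, and repair) can be sketched roughly as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: the `LLM` callable type, the `FileDoc` structure, the prompt wording, and the function names `document_repo`, `localize`, and `repair` are all hypothetical, and the repository is assumed to be already parsed into a mapping of file path to function sources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interface: a model is any callable mapping a prompt to text.
LLM = Callable[[str], str]

# Assumed input shape: file path -> {function name -> function source}.
Repo = Dict[str, Dict[str, str]]


@dataclass
class FileDoc:
    path: str
    summary: str               # file-level documentation
    functions: Dict[str, str]  # function name -> function-level documentation


def document_repo(repo: Repo, doc_llm: LLM) -> List[FileDoc]:
    """Stage 1: build documentation bottom-up, functions first, then files."""
    docs = []
    for path, funcs in repo.items():
        fn_docs = {
            name: doc_llm(f"Document this function:\n{src}")
            for name, src in funcs.items()
        }
        joined = "\n".join(fn_docs.values())
        summary = doc_llm(f"Summarize file {path} from its function docs:\n{joined}")
        docs.append(FileDoc(path, summary, fn_docs))
    return docs


def localize(issue: str, docs: List[FileDoc], loc_llm: LLM, top_k: int = 3) -> List[str]:
    """Stage 2: use the file-level docs as context to pick suspicious files."""
    listing = "\n".join(f"{d.path}: {d.summary}" for d in docs)
    answer = loc_llm(
        f"Issue:\n{issue}\n\nFiles:\n{listing}\n\nList suspicious paths, one per line."
    )
    known = {d.path for d in docs}
    # Keep only paths that exist in the repository, in the order returned.
    picked = [ln.strip() for ln in answer.splitlines() if ln.strip() in known]
    return picked[:top_k]


def repair(issue: str, suspicious: List[str], repo: Repo, repair_llm: LLM) -> Dict[str, str]:
    """Stage 3: ask a (typically stronger) model for a patch per suspicious file."""
    patches = {}
    for path in suspicious:
        source = "\n\n".join(repo[path].values())
        patches[path] = repair_llm(
            f"Issue:\n{issue}\n\nCode from {path}:\n{source}\n\nReturn a patch."
        )
    return patches
```

Passing the models in as plain callables keeps the sketch independent of any particular provider, which mirrors the abstract's split between a cheaper documentation model and a stronger repair model without assuming either API.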

