2603.17826v1 Mar 18, 2026 cs.SE

FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair

Yilei Jiang (Citations: 215, h-index: 9)
Lewei Lu (Citations: 3, h-index: 1)
Yile Feng (Citations: 5, h-index: 1)
Vincent Ng (Citations: 578, h-index: 10)
Chuanyi Li (Citations: 24, h-index: 3)
Shilin Zhang (Citations: 185, h-index: 4)
Zheng Ma (Citations: 1,219, h-index: 3)
Xiangyu Yue (Citations: 180, h-index: 9)
Ruize Ma (Citations: 54, h-index: 6)
Zhi Wang (Citations: 666, h-index: 5)

Original Abstract

Multimodal Automated Program Repair (MAPR) extends traditional program repair by requiring models to jointly reason over source code, textual issue descriptions, and visual artifacts such as GUI screenshots. While recent LLM-based repair systems have shown promising results, existing approaches face several limitations: rigid workflow pipelines restrict exploration during debugging, visual reasoning is often performed over full-page screenshots without localized grounding, and failed repair attempts are rarely transformed into reusable knowledge. To address these challenges, we propose FailureMem, a multimodal repair framework that integrates three key mechanisms: a hybrid workflow-agent architecture that balances structured localization with flexible reasoning, active perception tools that enable region-level visual grounding, and a Failure Memory Bank that converts past repair attempts into reusable guidance. Experiments on SWE-bench Multimodal demonstrate FailureMem improves the resolved rate over GUIRepair by 3.7%.
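
The abstract does not say how the Failure Memory Bank stores or retrieves past attempts, so the following is only an illustrative sketch of the general idea: failed repair attempts are recorded with a distilled lesson, and lessons from similar past issues are retrieved as guidance for a new issue. The names `FailureRecord`, `FailureMemoryBank`, `add`, and `retrieve`, as well as the token-overlap (Jaccard) similarity, are assumptions made for this example and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailureRecord:
    """One past repair attempt that did not resolve its issue (hypothetical schema)."""
    issue_text: str    # textual issue description of the past task
    failed_patch: str  # short summary of the patch that failed
    lesson: str        # reusable guidance distilled from the failure


@dataclass
class FailureMemoryBank:
    """Toy memory bank; the real system's storage and retrieval are not described in the abstract."""
    records: list[FailureRecord] = field(default_factory=list)

    def add(self, record: FailureRecord) -> None:
        self.records.append(record)

    @staticmethod
    def _tokens(text: str) -> set[str]:
        # Lowercased whitespace tokens; a stand-in for a real embedding model.
        return set(text.lower().split())

    def retrieve(self, issue_text: str, k: int = 1) -> list[str]:
        """Return lessons from the k most similar past issues (Jaccard overlap > 0)."""
        query = self._tokens(issue_text)
        scored = []
        for rec in self.records:
            cand = self._tokens(rec.issue_text)
            union = query | cand
            score = len(query & cand) / len(union) if union else 0.0
            scored.append((score, rec.lesson))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [lesson for score, lesson in scored[:k] if score > 0]


bank = FailureMemoryBank()
bank.add(FailureRecord(
    issue_text="tooltip overlaps chart legend",
    failed_patch="changed z-index only",
    lesson="Check layout offsets before adjusting z-index",
))
bank.add(FailureRecord(
    issue_text="button color wrong in dark mode",
    failed_patch="hardcoded hex value",
    lesson="Use theme variables instead of literal colors",
))
guidance = bank.retrieve("legend overlaps tooltip on hover")
```

In a sketch like this, the retrieved lessons would be added to the repair prompt for the new issue; a practical system would likely replace the token overlap with a learned similarity measure.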
