2604.11716v1 Apr 13, 2026 cs.AI

SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context

Juncheng Liu

Shuquan Lian

Yazhe Chen

Yuhong Chen

Hui Li

Abstract

Prior representative ReAct-style approaches in autonomous Software Engineering (SWE) typically lack the explicit System-2 reasoning required for deep analysis and handling complex edge cases. While recent reasoning models demonstrate the potential of extended Chain-of-Thought (CoT), applying them to the multi-turn SWE task creates a fundamental dilemma: retaining full reasoning history leads to context explosion and "Lost-in-the-Middle" degradation, while discarding it would force the agent to redundantly re-reason at every step. To address these challenges, we propose SWE-AGILE, a novel software agent framework designed to bridge the gap between reasoning depth, efficiency, and context constraints. SWE-AGILE introduces a Dynamic Reasoning Context strategy, maintaining a "sliding window" of detailed reasoning for immediate continuity to prevent redundant re-analyzing, while compressing historical reasoning content into concise Reasoning Digests. Empirically, SWE-AGILE sets a new standard for 7B-8B models on SWE-Bench-Verified using only 2.2k trajectories and 896 tasks. Code is available at https://github.com/KDEGroup/SWE-AGILE.
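
The Dynamic Reasoning Context idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Turn` fields, the `build_context` function, the default window size, and the text layout are all hypothetical, and the sketch assumes a digest is already available for every step (the abstract does not say how digests are produced). It only shows the core rule: recent steps keep their full reasoning, older steps are represented by their digest.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    reasoning: str    # full chain-of-thought produced at this step
    digest: str       # short summary of that reasoning (assumed to be available)
    action: str       # tool call or command issued by the agent
    observation: str  # environment feedback for the action


def build_context(turns: List[Turn], window: int = 2) -> List[str]:
    """Render the history for the next model call.

    The last `window` turns keep their detailed reasoning; every earlier
    turn is rendered with its digest instead. `window` is a hypothetical
    parameter, not a value taken from the paper.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    cutoff = max(len(turns) - window, 0)  # turns before this index are compressed
    rendered = []
    for i, turn in enumerate(turns):
        compressed = i < cutoff
        label = "digest" if compressed else "reasoning"
        thought = turn.digest if compressed else turn.reasoning
        rendered.append(
            f"[step {i}] {label}: {thought}\n"
            f"action: {turn.action}\n"
            f"observation: {turn.observation}"
        )
    return rendered


if __name__ == "__main__":
    history = [
        Turn(f"long reasoning {i}", f"digest {i}", f"act {i}", f"obs {i}")
        for i in range(4)
    ]
    for block in build_context(history, window=2):
        print(block)
        print("---")
```

With four turns and `window=2`, steps 0 and 1 are rendered from their digests and steps 2 and 3 from their full reasoning, so the prompt grows with the digest length for old steps rather than with the full reasoning length.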
