2602.05965v1 Feb 05, 2026 cs.MA

Learning to Share: Selective Memory for Efficient Parallel Agentic Systems

Joseph Fioresi (Citations: 86, h-index: 5)

P. Kulkarni (Citations: 107, h-index: 4)

Ashmal Vayani (Citations: 217, h-index: 7)

Song Wang (Citations: 46, h-index: 3)

Mubarak Shah (Citations: 48, h-index: 4)

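Agentic systems solve complex tasks by having multiple agents iteratively reason, use tools, and exchange intermediate results. Recent approaches run multiple agent teams in parallel to explore diverse reasoning paths, which improves robustness and solution quality. Parallel execution, however, carries a significant computational cost: when different teams independently reason about similar sub-problems or execute similar steps, overlapping computation is performed repeatedly. To address this limitation, the paper proposes Learning to Share (LTS), a learned shared-memory mechanism for parallel agentic frameworks. LTS introduces a global memory bank accessible to all teams and a lightweight controller that decides whether each intermediate agent step should be added to memory. The controller is trained with stepwise reinforcement learning using usage-aware credit assignment, which lets it identify information that is globally useful across parallel executions. In experiments on the AssistantBench and GAIA benchmarks, LTS significantly reduced overall runtime while matching or exceeding the task performance of memory-free parallel baselines. This indicates that learned memory admission control is an effective strategy for improving the efficiency of parallel agentic systems. Project page: https://joefioresi718.github.io/LTS_webpage/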

Original Abstract

Agentic systems solve complex tasks by coordinating multiple agents that iteratively reason, invoke tools, and exchange intermediate results. To improve robustness and solution quality, recent approaches deploy multiple agent teams running in parallel to explore diverse reasoning trajectories. However, parallel execution comes at a significant computational cost: when different teams independently reason about similar sub-problems or execute analogous steps, they repeatedly perform substantial overlapping computation. To address these limitations, in this paper, we propose Learning to Share (LTS), a learned shared-memory mechanism for parallel agentic frameworks that enables selective cross-team information reuse while controlling context growth. LTS introduces a global memory bank accessible to all teams and a lightweight controller that decides whether intermediate agent steps should be added to memory or not. The controller is trained using stepwise reinforcement learning with usage-aware credit assignment, allowing it to identify information that is globally useful across parallel executions. Experiments on the AssistantBench and GAIA benchmarks show that LTS significantly reduces overall runtime while matching or improving task performance compared to memory-free parallel baselines, demonstrating that learned memory admission is an effective strategy for improving the efficiency of parallel agentic systems. Project page: https://joefioresi718.github.io/LTS_webpage/
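
The abstract describes two pieces: a global memory bank readable by every team, and a controller that decides which intermediate steps are admitted. A minimal Python sketch of that shape follows. It is not the authors' implementation: the class and function names, the keyword-overlap retrieval, the rule-based `toy_admit` stand-in for the learned controller, and the reward values in `usage_rewards` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryEntry:
    team_id: int                  # team that produced the step
    text: str                     # content of the intermediate step
    uses_by_other_teams: int = 0  # how often another team retrieved it


@dataclass
class SharedMemoryBank:
    """Global store that every parallel team can read from (hypothetical API)."""
    admit: Callable[[str], bool]  # admission controller: step text -> keep or drop
    entries: List[MemoryEntry] = field(default_factory=list)

    def offer(self, team_id: int, step_text: str) -> bool:
        """Ask the controller whether to store a step and return its decision."""
        if self.admit(step_text):
            self.entries.append(MemoryEntry(team_id, step_text))
            return True
        return False

    def retrieve(self, team_id: int, query: str) -> List[str]:
        """Return other teams' entries that share at least one word with the query."""
        query_words = set(query.lower().split())
        hits = []
        for entry in self.entries:
            is_other_team = entry.team_id != team_id
            if is_other_team and query_words & set(entry.text.lower().split()):
                entry.uses_by_other_teams += 1  # record cross-team usage
                hits.append(entry.text)
        return hits


def toy_admit(step_text: str) -> bool:
    """Rule-based stand-in for the learned controller (assumption):
    admit only steps that report a concrete result."""
    return step_text.startswith("RESULT:")


def usage_rewards(bank: SharedMemoryBank) -> List[float]:
    """Illustrative usage-aware signal (assumed values): reward admitted
    entries that another team reused, lightly penalize the rest."""
    return [1.0 if e.uses_by_other_teams > 0 else -0.1 for e in bank.entries]


bank = SharedMemoryBank(admit=toy_admit)
bank.offer(0, "RESULT: gym opens at 6am")            # admitted
bank.offer(0, "thinking about what to do next")      # rejected
bank.offer(1, "RESULT: ticket price is 12 dollars")  # admitted
hits = bank.retrieve(1, "when does the gym open")    # team 1 reuses team 0's step
rewards = usage_rewards(bank)
```

In a learned version, `toy_admit` would be replaced by a trained policy, and a per-entry signal like the one returned by `usage_rewards` is one plausible way to credit admission decisions by how much other teams later reuse them.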

1 Citation

0 Influential

3.5 Altmetric

18.5 Score
