2604.04247v1 Apr 05, 2026 cs.AI


Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Alvin Cheung

Citations: 198

h-index: 6

Joseph E. Gonzalez

Citations: 4,124

h-index: 19

Qiuyang Mang

Citations: 83

h-index: 5

Ion Stoica

Citations: 1,770

h-index: 8

Eric Yang

Citations: 23

h-index: 3

K. Olukotun

Citations: 19,006

h-index: 69

Xiaokun Chen

Citations: 100

h-index: 3

James Zou

Citations: 227

h-index: 6

Hanchen Li

Citations: 174

h-index: 4

Runyuan He

Citations: 60

h-index: 3

Qizheng Zhang

Stanford University

Citations: 1,233

h-index: 12

Changxiu Ji

Citations: 0

h-index: 0

Lakshya A. Agrawal

Citations: 475

h-index: 4

Weiting Liao

Citations: 23

h-index: 2


Original Abstract

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without parameter changes. For example, existing methods (like ACE or GEPA) can learn system prompts to improve accuracy based on previous agent runs. However, these methods primarily focus on single-agent or low-parallelism settings. This fundamentally limits their ability to efficiently learn from a large set of collected agentic traces. It would be efficient and beneficial to run prompt learning in parallel to accommodate the growing trend of learning from many agentic traces or parallel agent executions. Yet without a principled strategy for scaling, current methods suffer from quality degradation with high parallelism. To improve both the efficiency and quality of prompt learning, we propose Combee, a novel framework to scale parallel prompt learning for self-improving agents. Combee speeds up learning and enables running many agents in parallel while learning from their aggregate traces without quality degradation. To achieve this, Combee leverages parallel scans and employs an augmented shuffle mechanism; Combee also introduces a dynamic batch size controller to balance quality and delay. Evaluations on AppWorld, Terminal-Bench, Formula, and FiNER demonstrate that Combee achieves up to 17x speedup over previous methods with comparable or better accuracy and equivalent cost.
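The abstract names parallel scans as one of Combee's building blocks but does not say which operator is scanned or how the results feed into prompt updates. As general background only, the sketch below shows a textbook Hillis-Steele inclusive scan over an arbitrary associative operator. The function name `inclusive_scan` and the `combine` parameter are illustrative assumptions, not taken from the paper, and the code runs sequentially; it only marks where the per-round work is independent.

```python
from operator import add
from typing import Callable, List, TypeVar

T = TypeVar("T")


def inclusive_scan(items: List[T], combine: Callable[[T, T], T]) -> List[T]:
    """Return prefixes under `combine`: result[i] = items[0] * ... * items[i].

    `combine` must be associative. This is the generic Hillis-Steele scheme,
    NOT Combee's actual procedure, which the abstract does not describe.
    """
    result = list(items)
    step = 1
    while step < len(result):
        # Each position reads only values from the previous round, so all
        # combines within one round are independent and could run in parallel.
        previous = result
        result = [
            combine(previous[i - step], previous[i]) if i >= step else previous[i]
            for i in range(len(previous))
        ]
        step *= 2  # about log2(n) rounds in total
    return result


# Hypothetical usage: integers under addition, then ordered list concatenation
# as a stand-in for accumulating per-trace notes (an assumption for illustration).
running_sums = inclusive_scan([1, 2, 3, 4, 5], add)
merged_notes = inclusive_scan([["a"], ["b"], ["c"]], add)
```

With n inputs the scan finishes in about log2(n) rounds instead of n sequential steps. That is the general reason scans are attractive when many independent results must be folded together in order.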

0 Citations

0 Influential

30 Altmetric

150.0 Score
