2603.18897v1 Mar 19, 2026 cs.DC

Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution

Rui Ma (Citations: 208, h-index: 5)

Yifan Sui (Citations: 44, h-index: 3)

Han Zhao (Citations: 210, h-index: 8)

Zhiyuan He (Citations: 299, h-index: 7)

Hao Wang (Citations: 18, h-index: 3)

Jianxun Li (Citations: 36, h-index: 3)

Yuqing Yang (Citations: 38, h-index: 3)

Kai Chen (Citations: 5, h-index: 1)

Kaiqiang Xu (Citations: 28, h-index: 2)

Abstract

LLM-powered agents are emerging as a dominant paradigm for autonomous task solving. Unlike standard inference workloads, agents operate in a strictly serial "LLM-tool" loop, where the LLM must wait for external tool execution at every step. This execution model introduces severe latency bottlenecks. To address this problem, we propose PASTE, a Pattern-Aware Speculative Tool Execution method designed to hide tool latency through speculation. PASTE is based on the insight that although agent requests are semantically diverse, they exhibit stable application-level control flows (recurring tool-call sequences) and predictable data dependencies (parameter passing between tools). By exploiting these properties, PASTE improves agent serving performance through speculative tool execution. Experimental results against state-of-the-art baselines show that PASTE reduces average task completion time by 48.5% and improves tool execution throughput by 1.8x.
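
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe PASTE's predictor or scheduler, so everything here is an assumption made for illustration. `NextToolPredictor` is a hypothetical first-order frequency model over past tool-call sequences (standing in for "recurring tool-call sequences"), and `run_step` is a hypothetical helper that launches the predicted tool on the previous tool's output (standing in for "parameter passing between tools") while the LLM is still deciding. The sketch also assumes speculated tools are free of side effects, so a wrong guess can simply be discarded.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


class NextToolPredictor:
    """Hypothetical first-order model: counts which tool follows which."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def observe(self, trace):
        # Record every adjacent (previous tool, next tool) pair in a past trace.
        for prev, nxt in zip(trace, trace[1:]):
            self._counts[prev][nxt] += 1

    def predict(self, prev):
        # Return the most frequent follower of `prev`, or None if unseen.
        followers = self._counts.get(prev)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def run_step(predictor, prev_tool, prev_output, tools, llm_decide, pool):
    """Run one agent step, overlapping a speculative tool call with the LLM.

    Assumptions (not from the paper): each tool takes the previous tool's
    output as its only argument, and speculated tools have no side effects.
    Returns (chosen tool name, tool result, whether speculation was used).
    """
    guess = predictor.predict(prev_tool)
    future = None
    if guess is not None:
        # Start the predicted tool before the LLM has made its decision.
        future = pool.submit(tools[guess], prev_output)

    # The LLM "thinks" while the speculative call runs in the background.
    chosen = llm_decide(prev_output)

    if future is not None and chosen == guess:
        # Hit: the tool latency was hidden behind the LLM's decision time.
        return chosen, future.result(), True

    if future is not None:
        # Miss: best-effort cancel; a call that already started keeps running
        # and its result is simply ignored.
        future.cancel()
    return chosen, tools[chosen](prev_output), False
```

On a correct guess the tool runs concurrently with the LLM call instead of after it, which is where the latency saving would come from; on a wrong guess the step falls back to the ordinary serial path and the speculative work is wasted.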

Citations: 8 (Influential: 2), Altmetric: 4, Score: 32.0
