2606.17016v1 Jun 15, 2026 cs.CL

TokenPilot: Cache-Efficient Context Management for LLM Agents

Xuehai Wang

Citations: 7

h-index: 2

Ning Zhang

Citations: 239

h-index: 6

Yunzhi Yao

Zhejiang University;Shandong University

Citations: 3,270

h-index: 22

Xinle Deng

Citations: 141

h-index: 6

Buqiang Xu

Citations: 15

h-index: 2

Jizhan Fang

Citations: 124

h-index: 4

Chiyu Wu

Citations: 3,270

h-index: 4

Z. Xue

Citations: 17

h-index: 2

Dian Chen

Citations: 25

h-index: 4

C. Fu

Citations: 9

h-index: 1

Caiying Huang

Citations: 0

h-index: 0

Chenyu Jiang

Citations: 10

h-index: 1

Yijun Chen

Citations: 37

h-index: 2

Jingbo Shang

Citations: 32

h-index: 2

Gong Yu

Citations: 0

h-index: 0

As LLM agents are deployed in long-horizon sessions, accumulated context drives up inference costs. Existing approaches use text pruning or dynamic memory eviction to shrink the token footprint; however, their unconstrained sequence mutations alter the prompt layout, introducing prefix mismatches that invalidate the prompt cache. This reveals a critical trade-off between text sparsity and prompt cache continuity. To address it, we present TokenPilot, a dual-granularity context management framework. Globally, Ingestion-Aware Compaction acts as a framework harness that stabilizes prompt prefixes and eliminates open-world environmental noise at the ingestion gate. Locally, Lifecycle-Aware Eviction monitors the residual utility of context segments and enforces a conservative batch-turn schedule, offloading segments only once their task relevance expires. Experiments on PinchBench and Claw-Eval in both isolated and continuous modes show that TokenPilot reduces costs by 61% and 56% in isolated mode and by 61% and 87% in continuous mode, while maintaining performance competitive with prior systems. TokenPilot has been integrated into LightMem2 at https://github.com/zjunlp/LightMem2.
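
The trade-off between text sparsity and prompt cache continuity described in the abstract can be illustrated with a toy simulation. The sketch below is not TokenPilot's implementation: the function names (`shared_prefix_len`, `simulate`), the segment-level view of the prompt, and the sliding-window eviction rule are all assumptions made for illustration. It compares evicting old segments on every turn against evicting only on every `batch`-th turn, and counts how many leading segments of each prompt match the previous prompt and could therefore, in principle, be served from a prefix cache.

```python
from typing import List, Tuple


def shared_prefix_len(prev: List[str], curr: List[str]) -> int:
    """Count the leading segments that are identical in both prompts."""
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


def simulate(turns: int, keep_last: int, batch: int) -> Tuple[int, int]:
    """Return (reusable_segments, total_segments) summed over all turns.

    Hypothetical policy: one segment is appended per turn, and on every
    `batch`-th turn all but the newest `keep_last` non-system segments
    are evicted. batch=1 models eager, per-turn pruning.
    """
    context = ["sys"]          # fixed system segment at position 0
    prev: List[str] = []       # prompt sent on the previous turn
    reused = total = 0
    for t in range(1, turns + 1):
        context.append(f"turn-{t}")
        if t % batch == 0:
            # Eviction rewrites the layout after the system segment,
            # so everything behind it stops matching the old prefix.
            context = context[:1] + context[1:][-keep_last:]
        reused += shared_prefix_len(prev, context)
        total += len(context)
        prev = list(context)
    return reused, total


eager = simulate(turns=12, keep_last=3, batch=1)
batched = simulate(turns=12, keep_last=3, batch=4)
print("eager  :", eager)    # (14, 45)
print("batched:", batched)  # (38, 57)
```

In this toy setting, eager pruning sends fewer segments overall (45 versus 57) but can reuse only 14 of them, leaving 31 uncached. The batched schedule leaves 19 uncached, because the prompt only grows by appending between evictions. Real prompt caches operate on tokens rather than segments and have their own granularity and expiry rules, so the numbers indicate only the direction of the effect.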
