2606.17016v1 Jun 15, 2026 cs.CL

TokenPilot: Cache-Efficient Context Management for LLM Agents

Xuehai Wang
Ning Zhang
Yunzhi Yao
Zhejiang University; Shandong University
Xinle Deng
Buqiang Xu
Jizhan Fang
Chiyu Wu
Z. Xue
Dian Chen
C. Fu
Caiying Huang
Chenyu Jiang
Yijun Chen
Jingbo Shang
Gong Yu

As LLM agents are deployed in long-horizon sessions, context accumulation drives up inference costs. Existing approaches utilize text pruning or dynamic memory eviction to minimize token footprints; however, their unconstrained sequence mutations alter layouts, introducing prefix mismatches and cache invalidation. This reveals a critical trade-off between text sparsity and prompt cache continuity. To address this, we present TokenPilot, a dual-granularity context management framework. Globally, Ingestion-Aware Compaction acts as a framework harness to stabilize prompt prefixes and eliminate open-world environmental noise at the ingestion gate. Locally, Lifecycle-Aware Eviction monitors the ongoing residual utility of context segments, enforcing a conservative batch-turn schedule to offload content segments only when task relevance expires. Experiments on PinchBench and Claw-Eval under both isolated and continuous modes demonstrate that TokenPilot reduces costs by 61% and 56% in isolated mode, and 61% and 87% in continuous mode, while maintaining competitive performance compared to prior systems. TokenPilot has been integrated into LightMem2 at https://github.com/zjunlp/LightMem2.
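The trade-off between text sparsity and prompt cache continuity can be illustrated with a toy simulation. The sketch below is not TokenPilot's algorithm: it assumes a prompt is a list of opaque segments, uses the length of the shared leading run between consecutive prompts as a rough proxy for prefix-cache reuse, and evicts only the oldest non-system segment on scheduled turns. The names `shared_prefix_len`, `simulate`, and `evict_every` are hypothetical and chosen for illustration only.

```python
from typing import List


def shared_prefix_len(prev: List[str], curr: List[str]) -> int:
    """Count leading segments that are identical in both prompts.

    Toy proxy for how much of the previous prompt a prefix cache could reuse.
    """
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


def simulate(turns: int, evict_every: int) -> int:
    """Return the total number of reusable prefix segments over all turns.

    Assumption: on every `evict_every`-th turn the oldest non-system segment
    is dropped, which shifts everything after it and breaks the prefix match.
    """
    context = ["system"]
    reused = 0
    for t in range(1, turns + 1):
        prev = list(context)
        # Evict only on scheduled turns, and never the system segment.
        if t % evict_every == 0 and len(context) > 1:
            del context[1]
        context.append(f"turn-{t}")
        reused += shared_prefix_len(prev, context)
    return reused


if __name__ == "__main__":
    print("evict every turn:  ", simulate(turns=6, evict_every=1))
    print("evict every 3 turns:", simulate(turns=6, evict_every=3))
```

Under these assumptions, evicting on every turn keeps the context small but leaves only the system segment reusable each turn, while the less frequent schedule retains more context and reuses a longer prefix between evictions. A real system would also have to weigh the cost of the extra retained tokens, which this sketch ignores.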
