2601.08605v1 Jan 13, 2026 cs.CL

ExpSeek: Self-Triggered Experience Seeking for Web Agents

Xinghua Zhang (Citations: 411, h-index: 8)
Haiyang Yu (Citations: 695, h-index: 13)
Shuaiyi Nie (Citations: 75, h-index: 5)
Bingli Wu (Citations: 176, h-index: 4)
Juwei Yue (Citations: 213, h-index: 5)
Tingwen Liu (Citations: 35, h-index: 3)
Yongbin Li (Citations: 2,864, h-index: 19)
Wenyuan Zhang (Citations: 267, h-index: 10)

Abstract

Experience intervention in web agents is emerging as a promising technical paradigm: it enhances agent interaction capabilities by supplying valuable insights drawn from accumulated experience. However, existing methods predominantly inject experience passively, as global context before task execution, and struggle to adapt to the dynamically changing contextual observations that arise during agent-environment interaction. We propose ExpSeek, which shifts experience use toward step-level proactive seeking by (1) estimating step-level entropy thresholds from the model's intrinsic signals to determine intervention timing, and (2) designing experience content tailored to each step. Experiments on Qwen3-8B and 32B models across four challenging web agent benchmarks show that ExpSeek achieves absolute improvements of 9.3% and 7.5%, respectively. The experiments validate the feasibility and advantages of entropy as a self-triggering signal and show that even a small 4B-scale experience model can significantly boost the performance of larger agent models.
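The idea of using entropy as a self-triggering signal can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how step-level entropy is aggregated or how thresholds are estimated, so the mean-over-tokens aggregation, the function names, and the fixed `THRESHOLD` value below are all assumptions made for illustration.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def step_entropy(token_dists: Sequence[Sequence[float]]) -> float:
    """Mean token entropy over one agent step (assumed aggregation)."""
    if not token_dists:
        raise ValueError("a step needs at least one token distribution")
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)


def should_seek_experience(
    token_dists: Sequence[Sequence[float]], threshold: float
) -> bool:
    """Trigger experience seeking when the step looks uncertain."""
    return step_entropy(token_dists) > threshold


# Toy distributions over a 4-token vocabulary (hypothetical data).
confident_step = [[0.97, 0.01, 0.01, 0.01], [0.90, 0.05, 0.03, 0.02]]
uncertain_step = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

THRESHOLD = 0.8  # hypothetical value; the paper estimates thresholds per step

for name, step in [("confident", confident_step), ("uncertain", uncertain_step)]:
    print(name, round(step_entropy(step), 3), should_seek_experience(step, THRESHOLD))
```

In this toy setup the low-entropy step proceeds without intervention, while the high-entropy step would request experience before the agent acts. In a real agent the distributions would come from the model's own output probabilities at each step.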

20 Citations

0 Influential

9.5 Altmetric

67.5 Score

