2604.17821v2 Apr 20, 2026 cs.AI

WebUncertainty: Dual-Level Uncertainty Driven Planning and Reasoning For Autonomous Web Agent

Kuien Liu (Citations: 4; h-index: 1)
Lingfeng Zhang (Citations: 4; h-index: 1)
Yongan Sun (Citations: 35; h-index: 2)
Jinpeng Hu, Hefei University of Technology (Citations: 797; h-index: 15)
Hui Ma (Citations: 22; h-index: 3)
Zenglin Shi (Citations: 13; h-index: 1)
Meng Wang (Citations: 12; h-index: 3)
Yang Ying (Citations: 1; h-index: 1)

Abstract

Recent advancements in large language models (LLMs) have empowered autonomous web agents to execute natural language instructions directly on real-world webpages. However, existing agents often struggle with complex tasks involving dynamic interactions and long-horizon execution due to rigid planning strategies and hallucination-prone reasoning. To address these limitations, we propose WebUncertainty, a novel autonomous agent framework designed to tackle dual-level uncertainty in planning and reasoning. Specifically, we design a Task Uncertainty-Driven Adaptive Planning Mechanism that adaptively selects planning modes to navigate unknown environments. Furthermore, we introduce an Action Uncertainty-Driven Monte Carlo tree search (MCTS) Reasoning Mechanism. This mechanism incorporates the Confidence-induced Action Uncertainty (ConActU) strategy to quantify both aleatoric uncertainty (AU) and epistemic uncertainty (EU), thereby optimizing the search process and guiding robust decision-making. Experimental results on the WebArena and WebVoyager benchmarks demonstrate that WebUncertainty achieves superior performance compared to state-of-the-art baselines.
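
The abstract does not give the ConActU formulas, so the following is only an illustrative sketch of one common way to separate aleatoric from epistemic uncertainty over candidate actions, not the authors' method. It assumes the agent is sampled several times and each sample yields a confidence distribution over the same candidate actions. Total uncertainty is taken as the entropy of the averaged distribution, aleatoric uncertainty as the average per-sample entropy, and epistemic uncertainty as their difference. The names `decompose_uncertainty` and `exploration_score`, and the way the epistemic term scales a UCT-style bonus, are hypothetical.

```python
import math
from typing import Sequence, Tuple


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def decompose_uncertainty(
    samples: Sequence[Sequence[float]],
) -> Tuple[float, float, float]:
    """Split uncertainty over candidate actions into (total, aleatoric, epistemic).

    `samples` holds K confidence distributions over the same N candidate
    actions, e.g. from K independent LLM samples (an assumption of this sketch).
    """
    k = len(samples)
    n = len(samples[0])
    mean = [sum(s[i] for s in samples) / k for i in range(n)]
    total = entropy(mean)                              # entropy of the average
    aleatoric = sum(entropy(s) for s in samples) / k   # average of the entropies
    epistemic = max(0.0, total - aleatoric)            # disagreement between samples
    return total, aleatoric, epistemic


def exploration_score(
    q_value: float,
    parent_visits: int,
    child_visits: int,
    epistemic: float,
    c: float = 1.0,
) -> float:
    """UCT-style node score whose exploration bonus grows with epistemic uncertainty.

    Hypothetical weighting, chosen only to show how an uncertainty estimate
    could steer tree search toward poorly understood actions.
    """
    bonus = c * (1.0 + epistemic) * math.sqrt(
        math.log(parent_visits + 1) / (child_visits + 1)
    )
    return q_value + bonus
```

In this sketch, samples that agree on a flat distribution produce high aleatoric and zero epistemic uncertainty, while samples that are individually confident but disagree with each other produce the opposite, which is the case where extra search effort is most likely to help.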

Citations: 1; Influential: 0; Altmetric: 7.5; Score: 38.5
