2604.15037v2 Apr 16, 2026 cs.AI

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

Yuhao Wang

Citations: 199

h-index: 7

Yu Wang

Citations: 276

h-index: 9

Kewei Xu

Citations: 11

h-index: 2

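Recent LLM agent technology is gradually shifting from reactive, text-based interaction toward proactive, multimodal interaction. However, existing benchmarks focus mainly on reactive responses and overlook the complexities of proactive intervention and monitoring. To close this gap, we introduce ProVoice-Bench, the first evaluation framework designed specifically to assess the proactivity of voice agents. ProVoice-Bench comprises four novel tasks. Using a multi-stage data synthesis pipeline, we curated 1,182 high-quality samples for rigorous testing. Our evaluation of state-of-the-art multimodal LLMs reveals a significant performance gap, particularly in over-triggering and reasoning capabilities. These results highlight the limitations of current models and offer a roadmap for developing more natural, context-aware proactive agents.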

Original Abstract

Recent advancements in LLM agents are gradually shifting from reactive, text-based paradigms toward proactive, multimodal interaction. However, existing benchmarks primarily focus on reactive responses, overlooking the complexities of proactive intervention and monitoring. To bridge this gap, we introduce ProVoice-Bench, the first evaluation framework specifically designed for proactive voice agents, featuring four novel tasks. By leveraging a multi-stage data synthesis pipeline, we curate 1,182 high-quality samples for rigorous testing. Our evaluation of state-of-the-art Multimodal LLMs reveals a significant performance gap, particularly regarding over-triggering and reasoning capabilities. These findings highlight the limitations of current models and offer a roadmap for developing more natural, context-aware proactive agents.

0 Citations

0 Influential

4.5 Altmetric

22.5 Score
