2601.18113v2 Jan 26, 2026 cs.CR

MalURLBench: A Benchmark Evaluating Agents' Vulnerabilities When Processing Web URLs

Dezhang Kong (Citations: 126, h-index: 6)
Meng Han (Citations: 227, h-index: 9)
Qichen Liu (Citations: 35, h-index: 2)
Zhuxi Wu (Citations: 52, h-index: 2)
Shiqi Liu (Citations: 76, h-index: 5)
Zhicheng Tan (Citations: 0, h-index: 0)
Kuichen Lu (Citations: 0, h-index: 0)
Minghao Li (Citations: 109, h-index: 2)
Shengyu Chu (Citations: 5, h-index: 1)
Zhenhua Xu (Citations: 1, h-index: 1)
Xuan Liu (Citations: 60, h-index: 1)

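LLM-based web agents are increasingly widely used, valued for their usefulness in everyday life and work. However, these agents exhibit serious vulnerabilities when processing malicious URLs: accepting a disguised malicious URL leads the user to unsafe web pages, which can cause severe harm to service providers and users. Despite this risk, no benchmark currently targets this new threat. To close this gap, we propose MalURLBench, the first benchmark for evaluating the vulnerability of LLMs to malicious URLs. MalURLBench consists of 61,845 attack instances covering 10 real-world scenarios and 7 categories of real malicious websites. Experiments with 12 popular LLMs show that existing models struggle to detect elaborately disguised malicious URLs. We also identify and analyze the key factors that affect attack success rates, and propose URLGuard, a lightweight defense module. We believe this work will serve as an important foundation for improving the security of web agents. Our code is available at https://github.com/JiangYingEr/MalURLBench.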

Original Abstract

LLM-based web agents have become increasingly popular for their utility in daily life and work. However, they exhibit critical vulnerabilities when processing malicious URLs: accepting a disguised malicious URL enables subsequent access to unsafe webpages, which can cause severe damage to service providers and users. Despite this risk, no benchmark currently targets this emerging threat. To address this gap, we propose MalURLBench, the first benchmark for evaluating LLMs' vulnerabilities to malicious URLs. MalURLBench contains 61,845 attack instances spanning 10 real-world scenarios and 7 categories of real malicious websites. Experiments with 12 popular LLMs reveal that existing models struggle to detect elaborately disguised malicious URLs. We further identify and analyze key factors that impact attack success rates and propose URLGuard, a lightweight defense module. We believe this work will provide a foundational resource for advancing the security of web agents. Our code is available at https://github.com/JiangYingEr/MalURLBench.
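
To make the idea of a "disguised" URL concrete, here is a minimal Python sketch using only the standard library. It is not taken from MalURLBench or URLGuard: the two disguise patterns (a trusted name placed in the userinfo part, and a trusted name used as a subdomain prefix), the helper names, the placeholder hosts, and the definition of attack success rate as the fraction of accepted attack instances are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def effective_host(url):
    """Return the lowercased host a client would actually connect to."""
    return (urlsplit(url).hostname or "").lower()


def looks_disguised(url, trusted_domain):
    """True if the URL text mentions trusted_domain but the real host is elsewhere.

    Hypothetical heuristic for illustration only; it is not the paper's method.
    """
    host = effective_host(url)
    on_trusted = host == trusted_domain or host.endswith("." + trusted_domain)
    return trusted_domain in url.lower() and not on_trusted


def attack_success_rate(accepted):
    """Fraction of attack instances that were accepted (assumed definition)."""
    return sum(accepted) / len(accepted) if accepted else 0.0


# Placeholder URLs; "attacker.test" and "example.com" are made-up hosts.
samples = [
    "https://example.com@attacker.test/login",   # trusted name in userinfo
    "https://example.com.attacker.test/reset",   # trusted name as subdomain prefix
    "https://www.example.com/account",           # genuinely on the trusted domain
]
flags = [looks_disguised(u, "example.com") for u in samples]
```

In this sketch, `flags` marks the first two URLs as disguised and the third as genuine. A string-level check like this only covers the two assumed patterns; the abstract reports that LLMs struggle with elaborately disguised URLs, which a fixed heuristic would not capture either.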
