2605.29801v1 May 28, 2026 cs.AI

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Yuejin Xie
Zhonghao Yang
Qingyu Liu
Lei Yuan
Qihao Lin
Jialing Tao
Quanshi Zhang
Lei Zhu
Chaochao Lu
Xingjun Ma
Ranjie Duan
Chen Qian
Xia Hu
Zhiheng Xi
Dongrui Liu
Wenjie Wang
Tianhang Zheng
Shuai Shao
Xianglong Liu
Xi Lin
Yu Li
Bo Zhang
Qinghua Mao
Ruiyang Qin
Yanxu Zhu
Xiangnan He
Hui Xue
Peng Wang
Wanying Qu
Tianyi Zhou
Haoyu Luo
Qihan Ren
Junxiao Yang
Linfeng Zhang
Wen Shen
Qiaosheng Zhang
Y. Teng
Rui Mei
Software Security Research Group (SSRG), Peking University
Yong Liu
X.Z. Zuo
Chaoyi Shen
Jing Shao
Guan-Lin Chen
Yiming Wang
Ling Tang
Kun Wang
Man Li
Junhua Liu
Min Huang
Zhijie Zheng

Modern open-world agents such as OpenClaw exhibit powerful cross-environment execution capabilities, yet they introduce a broad range of new sources of safety risk. Meanwhile, advanced frontier AI models drastically lower the barrier to attack, rendering current agent alignment frameworks inadequate for real-world deployment. To tackle these emerging threats, we propose a lightweight and scalable agent safety alignment framework. Specifically, we update the agent safety taxonomy to accommodate emergent risks from Codex and OpenClaw execution scenarios. We further build a taxonomy-guided data engine with influence-function purification to train lightweight AgentDoG 1.5 variants (0.8B, 2B, 4B, and 8B parameters) using only around 1k samples, achieving performance comparable to leading closed-source models (e.g., GPT-5.4). Based on AgentDoG 1.5, we construct a highly efficient agentic safety SFT and RL training environment, which reduces deployment overhead in Docker-level environments by two orders of magnitude. Finally, we deploy AgentDoG 1.5 as a training-free online guardrail for real-time safety moderation. Extensive experimental results indicate that AgentDoG 1.5 achieves state-of-the-art performance in diverse and complex interactive agentic scenarios. All models and datasets are openly released.
