2601.07122v1 Jan 12, 2026 cs.CR

Enhancing Cloud Network Resilience via a Robust LLM-Empowered Multi-Agent Reinforcement Learning Framework

Xinye Cao (Citations: 141, h-index: 4)

Guoshun Nan (Citations: 4, h-index: 1)

Yixiao Peng (Citations: 2, h-index: 1)

Hao Hu (Citations: 3, h-index: 1)

Feiyang Li (Citations: 3, h-index: 1)

Yingchang Jiang (Citations: 3, h-index: 1)

Jipeng Tang (Citations: 2, h-index: 1)

Yuling Liu (Citations: 261, h-index: 8)

Original Abstract

While virtualization and resource pooling give cloud networks structural flexibility and elastic scalability, they inevitably expand the attack surface and challenge cyber resilience. Reinforcement Learning (RL)-based defense strategies have been developed to optimize resource deployment and isolation policies under adversarial conditions, aiming to enhance system resilience by maintaining and restoring network availability. However, existing approaches lack robustness because they require retraining to adapt to dynamic changes in network structure, node scale, attack strategies, and attack intensity. Furthermore, the lack of Human-in-the-Loop (HITL) support limits interpretability and flexibility. To address these limitations, we propose CyberOps-Bots, a hierarchical multi-agent reinforcement learning framework empowered by Large Language Models (LLMs). Inspired by MITRE ATT&CK's Tactics-Techniques model, CyberOps-Bots features a two-layer architecture: (1) an upper-level LLM agent with four modules (ReAct planning, IPDRR-based perception, long- and short-term memory, and action/tool integration) handles global awareness, human intent recognition, and tactical planning; (2) lower-level RL agents, developed via heterogeneous separated pre-training, execute atomic defense actions within localized network regions. This synergy preserves LLM adaptability and interpretability while ensuring reliable RL execution. Experiments on real cloud datasets show that, compared to state-of-the-art algorithms, CyberOps-Bots maintains 68.5% higher network availability and achieves a 34.7% jumpstart performance gain when scenarios shift without retraining. To our knowledge, this is the first study to establish a robust LLM-RL framework with HITL support for cloud defense. We will release our framework to the community, facilitating the advancement of robust and autonomous defense in cloud networks.
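The two-layer split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the upper-level LLM agent is replaced by a rule-based stub, the lower-level RL agents by fixed functions, and every name here (`RegionObservation`, `stub_planner`, `POLICIES`, `defend_step`, the tactic labels, the action strings) is hypothetical. The tactic labels loosely borrow three of the function names commonly associated with IPDRR (Protect, Respond, Recover), which is an assumption about the acronym rather than something the abstract spells out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical tactic labels; the abstract does not enumerate the tactics.
TACTICS = ("protect", "respond", "recover")


@dataclass
class RegionObservation:
    """Hypothetical summary of one localized network region."""
    region: str
    compromised_nodes: int
    offline_nodes: int


def stub_planner(obs: RegionObservation, operator_hint: str = "") -> str:
    """Stand-in for the upper-level LLM agent: picks one tactic per region.

    A valid operator hint overrides the rule-based choice, mimicking
    human-in-the-loop intent at the tactical level.
    """
    if operator_hint in TACTICS:
        return operator_hint
    if obs.compromised_nodes > 0:
        return "respond"
    if obs.offline_nodes > 0:
        return "recover"
    return "protect"


# Stand-ins for pre-trained lower-level RL policies: each maps a regional
# observation to one atomic defense action (here just a descriptive string).
POLICIES: Dict[str, Callable[[RegionObservation], str]] = {
    "protect": lambda obs: f"harden:{obs.region}",
    "respond": lambda obs: f"isolate:{obs.region}",
    "recover": lambda obs: f"restore:{obs.region}",
}


def defend_step(observations: List[RegionObservation],
                operator_hint: str = "") -> List[str]:
    """One decision step: plan a tactic per region, then dispatch it."""
    actions = []
    for obs in observations:
        tactic = stub_planner(obs, operator_hint)
        actions.append(POLICIES[tactic](obs))
    return actions


if __name__ == "__main__":
    regions = [
        RegionObservation("web", compromised_nodes=2, offline_nodes=0),
        RegionObservation("db", compromised_nodes=0, offline_nodes=1),
        RegionObservation("cache", compromised_nodes=0, offline_nodes=0),
    ]
    print(defend_step(regions))
    print(defend_step(regions, operator_hint="recover"))
```

The point of the sketch is the separation of concerns: the planner only chooses among a small set of tactics, so it could in principle be swapped for an LLM call without touching the per-region policies, and the policies could be retrained or replaced without changing the planner.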
