2604.14687v1 Apr 16, 2026 cs.AI

M2-PALE: A Framework for Explaining Multi-Agent MCTS-Minimax Hybrids via Process Mining and LLMs

Yiyu Qian

Liyuan Zhao

Tim Miller

Abstract

Monte-Carlo Tree Search (MCTS) is a fundamental sampling-based search algorithm widely used for online planning in sequential decision-making domains. Despite its success in driving recent advances in artificial intelligence, understanding the behavior of MCTS agents remains a challenge for both developers and users. This difficulty stems from the complex search trees produced by simulating numerous future states and from the intricate relationships among those states. A known weakness of standard MCTS is its reliance on highly selective tree construction, which can cause it to miss crucial moves and leaves it vulnerable to tactical traps. To address this, we incorporate shallow, full-width Minimax search into the rollout phase of multi-agent MCTS to enhance strategic depth. Furthermore, to demystify the resulting decision-making logic, we introduce M2-PALE (MCTS-Minimax Process-Aided Linguistic Explanations). This framework employs process mining techniques, specifically the Alpha Miner, iDHM, and Inductive Miner algorithms, to extract underlying behavioral workflows from agent execution traces. These process models are then synthesized by LLMs to generate human-readable causal and distal explanations. We demonstrate the efficacy of our approach in a small-scale checkers environment, establishing a scalable foundation for interpreting hybrid agents in increasingly complex strategic domains.
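
The idea of placing a shallow, full-width Minimax search inside the rollout phase can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the toy game (single-pile Nim instead of checkers), the depth of 2, the +1/0/-1 scoring, and every function name below are assumptions chosen only for illustration. The sketch covers the rollout policy alone and omits the selection, expansion, and backpropagation phases of MCTS.

```python
import random

# Hypothetical toy game (NOT from the paper): single-pile Nim.
# A state is (stones, player_to_move); a move removes 1-3 stones;
# the player who takes the last stone wins.

def legal_moves(state):
    stones, _ = state
    return [k for k in (1, 2, 3) if k <= stones]

def apply_move(state, k):
    stones, player = state
    return (stones - k, 1 - player)

def winner(state):
    stones, player = state
    # With no stones left, the player who just moved (not `player`) has won.
    return 1 - player if stones == 0 else None

def minimax(state, depth, root_player):
    """Shallow full-width minimax from root_player's point of view.

    Returns +1 for a proven win, -1 for a proven loss, and 0 when the
    depth limit is reached before the game is decided.
    """
    w = winner(state)
    if w is not None:
        return 1 if w == root_player else -1
    if depth == 0:
        return 0
    values = [minimax(apply_move(state, m), depth - 1, root_player)
              for m in legal_moves(state)]
    # Maximize on root_player's turn, minimize on the opponent's turn.
    return max(values) if state[1] == root_player else min(values)

def informed_rollout_move(state, depth, rng):
    """Choose a rollout move by shallow minimax; break ties at random."""
    player = state[1]
    scored = [(minimax(apply_move(state, m), depth - 1, player), m)
              for m in legal_moves(state)]
    best = max(value for value, _ in scored)
    return rng.choice([m for value, m in scored if value == best])

def rollout(state, depth=2, rng=random):
    """Play to the end of the game using the minimax-informed policy."""
    while winner(state) is None:
        state = apply_move(state, informed_rollout_move(state, depth, rng))
    return winner(state)
```

With five stones and player 0 to move, a uniformly random rollout may remove two or three stones and hand the opponent an immediate win. The depth-2 search rules those moves out, so `informed_rollout_move((5, 0), 2, random)` selects the removal of one stone. This is the kind of tactical trap the abstract says selective tree construction can miss.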
