2603.14688v1 Mar 16, 2026 cs.LG

AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems

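As multi-agent AI systems are deployed in more real-world settings, from automated customer support to DevOps remediation, failures become harder to diagnose because of cascading effects, hidden dependencies, and long execution traces. This paper presents AgentTrace, a lightweight causal tracing framework for post-hoc failure diagnosis in deployed multi-agent workflows. AgentTrace reconstructs a causal graph from execution logs, traces backward from the point where the error manifests, and ranks candidate root causes using interpretable structural and positional signals. It requires no LLM inference at debugging time. On a benchmark of multi-agent failure scenarios that reflect common deployment patterns, AgentTrace localizes root causes with high accuracy and sub-second latency, significantly outperforming both heuristic and LLM-based baselines. The results suggest that causal tracing offers a practical foundation for improving the reliability and trustworthiness of agentic systems in real-world use.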

Original Abstract

As multi-agent AI systems are increasingly deployed in real-world settings - from automated customer support to DevOps remediation - failures become harder to diagnose due to cascading effects, hidden dependencies, and long execution traces. We present AgentTrace, a lightweight causal tracing framework for post-hoc failure diagnosis in deployed multi-agent workflows. AgentTrace reconstructs causal graphs from execution logs, traces backward from error manifestations, and ranks candidate root causes using interpretable structural and positional signals - without requiring LLM inference at debugging time. Across a diverse benchmark of multi-agent failure scenarios designed to reflect common deployment patterns, AgentTrace localizes root causes with high accuracy and sub-second latency, significantly outperforming both heuristic and LLM-based baselines. Our results suggest that causal tracing provides a practical foundation for improving the reliability and trustworthiness of agentic systems in the wild.
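
The three steps named in the abstract (graph reconstruction from logs, backward tracing from the error, and ranking by structural and positional signals) can be illustrated with a minimal sketch. This is not the paper's implementation: the log schema, the choice of fan-out as the structural signal, earliness in the trace as the positional signal, and the equal weights are all assumptions made for illustration only.

```python
from collections import defaultdict, deque

# Hypothetical log schema (an assumption, not the paper's format): each event
# records its id, the producing agent, a step index, the ids of the events it
# consumed, and whether it surfaced an error.
LOG = [
    {"id": "e1", "agent": "planner",   "step": 0, "inputs": [],     "error": False},
    {"id": "e2", "agent": "retriever", "step": 1, "inputs": ["e1"], "error": False},
    {"id": "e3", "agent": "billing",   "step": 2, "inputs": ["e1"], "error": False},
    {"id": "e4", "agent": "writer",    "step": 3, "inputs": ["e2"], "error": False},
    {"id": "e5", "agent": "responder", "step": 4, "inputs": ["e4"], "error": True},
]


def build_graph(log):
    """Reconstruct parent and child adjacency from the 'inputs' field."""
    parents = {ev["id"]: list(ev["inputs"]) for ev in log}
    children = defaultdict(list)
    for ev in log:
        for parent_id in ev["inputs"]:
            children[parent_id].append(ev["id"])
    return parents, children


def trace_back(parents, error_id):
    """Breadth-first walk over parent edges; returns {ancestor: hops from error}."""
    dist = {error_id: 0}
    queue = deque([error_id])
    while queue:
        node = queue.popleft()
        for parent_id in parents.get(node, []):
            if parent_id not in dist:
                dist[parent_id] = dist[node] + 1
                queue.append(parent_id)
    del dist[error_id]  # the error event itself is a symptom, not a candidate
    return dist


def rank_candidates(log, error_id, w_struct=0.5, w_pos=0.5):
    """Score each ancestor of the error event; weights are illustrative guesses."""
    parents, children = build_graph(log)
    ancestors = trace_back(parents, error_id)
    if not ancestors:
        return []
    step = {ev["id"]: ev["step"] for ev in log}
    max_fanout = max(len(children[n]) for n in ancestors)
    error_step = max(step[error_id], 1)  # avoid division by zero
    scores = {}
    for node in ancestors:
        structural = len(children[node]) / max_fanout  # more dependents, higher
        positional = 1 - step[node] / error_step       # earlier in trace, higher
        scores[node] = w_struct * structural + w_pos * positional
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_candidates(LOG, "e5")
for event_id, score in ranked:
    print(event_id, round(score, 3))
```

In this toy trace, only events on a causal path to the failing `responder` step are scored, so the unrelated `billing` event `e3` never appears as a candidate. The whole procedure is plain graph traversal and arithmetic, which is consistent with the abstract's point that no LLM inference is needed at debugging time.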
