2601.16596v1 Jan 23, 2026 cs.CL

Attention-MoA: Enhancing Mixture-of-Agents via Inter-Agent Semantic Attention and Deep Residual Synthesis

Ke Zeng

Ji-Rong Wen

Yang Wei

Xiongxi Yu

Chang Xiao

Original Abstract

As the development of Large Language Models (LLMs) shifts from parameter scaling to inference-time collaboration, the Mixture-of-Agents (MoA) framework has emerged as a general paradigm to harness collective intelligence by layering diverse models. While recent MoA variants have introduced dynamic routing and residual connections to improve efficiency, these methods often fail to facilitate deep semantic interaction between agents, limiting the system's ability to actively correct hallucinations and refine logic. In this paper, we introduce Attention-MoA, a novel MoA-based framework that redefines collaboration through Inter-agent Semantic Attention. Complemented by an Inter-layer Residual Module with Adaptive Early Stopping Mechanism, our architecture mitigates information degradation in deep layers while improving computational efficiency. Extensive evaluations across AlpacaEval 2.0, MT-Bench, and FLASK demonstrate that Attention-MoA significantly outperforms state-of-the-art baselines, achieving a 91.15% Length-Controlled Win Rate on AlpacaEval 2.0 and dominating in 10 out of 12 capabilities on FLASK. Notably, Attention-MoA enables an ensemble of small open-source models to outperform massive proprietary models like Claude-4.5-Sonnet and GPT-4.1, achieving an MT-Bench score of 8.83 and an AlpacaEval 2.0 LC Win Rate of 77.36%.
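The abstract names three structural ideas: agents stacked in layers, information carried across layers, and stopping early once further layers stop helping. The following is a minimal sketch of that control flow only, not the authors' implementation. It does not model Inter-agent Semantic Attention at all, and every name in it (`run_layered_moa`, `aggregate`, `has_converged`, `max_layers`, the toy agents) is a hypothetical placeholder rather than something taken from the paper. Agents are plain functions so the sketch runs without any model backend.

```python
from typing import Callable, List, Tuple

# Hypothetical agent signature: (prompt, earlier responses) -> new response.
Agent = Callable[[str, List[str]], str]


def run_layered_moa(
    prompt: str,
    agents: List[Agent],
    aggregate: Callable[[str, List[str]], str],
    has_converged: Callable[[str, str], bool],
    max_layers: int = 4,
) -> Tuple[str, int]:
    """Run up to max_layers rounds; return (answer, number of layers used)."""
    history: List[List[str]] = []  # responses from every earlier layer
    answer = ""
    for layer_index in range(max_layers):
        # Residual-style carry-over (an assumption of this sketch): each layer
        # sees the responses of all earlier layers, not just the previous one.
        context = [resp for layer in history for resp in layer]
        responses = [agent(prompt, context) for agent in agents]
        history.append(responses)
        previous, answer = answer, aggregate(prompt, responses)
        # Early stop: quit once the aggregated answer stops changing.
        if layer_index > 0 and has_converged(previous, answer):
            return answer, layer_index + 1
    return answer, max_layers


# Toy stand-ins so the sketch is runnable; real agents would call LLMs.
def echo_agent(prompt: str, context: List[str]) -> str:
    return prompt.upper()


def pick_longest(prompt: str, responses: List[str]) -> str:
    return max(responses, key=len)


answer, layers_used = run_layered_moa(
    "hello", [echo_agent, echo_agent], pick_longest, lambda a, b: a == b
)
```

With these toy agents the aggregated answer is identical in the first two layers, so the loop stops after the second layer instead of running all four. In a real system the convergence check would presumably be a semantic comparison rather than string equality; that choice is left open here.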
