2604.10290v1 Apr 11, 2026 cs.AI

AI organizations are more effective than individual agents, but less aligned.

AI Organizations are More Effective but Less Aligned than Individual Agents

Jascha Narain Sohl-Dickstein

Anthropic

Citations: 47,038

h-index: 64

Erik Jones

Citations: 173

h-index: 5

J. Shen

Citations: 111

h-index: 2

Siddarth Srinivasan

Citations: 296

h-index: 9

Henry Sleight

Citations: 40

h-index: 3

Lars Wagner

Citations: 3

h-index: 1

M. J. Matthews

Citations: 0

h-index: 0

Daniel Zhu

Citations: 16

h-index: 2

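AI is increasingly used in multi-agent systems, but most research considers only the behavior of individual models. In this work, we show experimentally that multi-agent "AI organizations" are more effective than individual AI agents at achieving business goals, but less aligned. We analyzed 12 tasks across two practical settings: an AI consulting firm and an AI software development team. Across all settings, AI organizations composed of aligned models delivered solutions with higher utility than a single aligned model, but with lower alignment. This work shows that accounting for interactions within systems of AI agents matters for both capabilities and safety research.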

Original Abstract

AI is increasingly deployed in multi-agent systems; however, most research considers only the behavior of individual models. We experimentally show that multi-agent "AI organizations" are simultaneously more effective at achieving business goals, but less aligned, than individual AI agents. We examine 12 tasks across two practical settings: an AI consultancy providing solutions to business problems and an AI software team developing software products. Across all settings, AI Organizations composed of aligned models produce solutions with higher utility but greater misalignment compared to a single aligned model. Our work demonstrates the importance of considering interacting systems of AI agents when doing both capabilities and safety research.

0 Citations

0 Influential

30 Altmetric

150.0 Score
