2603.04750v1 Mar 05, 2026 cs.AI

HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

The Viet Bui (Citations: 22, h-index: 3)

Wenjun Li (Citations: 8, h-index: 2)

Yong Liu (Citations: 2,444, h-index: 9)

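Sequential LLM agents struggle with long-horizon planning under hard constraints such as budgets and diversity requirements. As planning proceeds and the context grows, these agents tend to drift from global constraints. This paper proposes HiMAP-Travel, a hierarchical multi-agent framework that separates planning into strategic coordination and parallel day-level execution. A Coordinator allocates resources across days, and Day Executors plan independently and in parallel. Three core mechanisms make this possible. First, a transactional monitor enforces budget and uniqueness constraints across the parallel agents. Second, a bargaining protocol lets agents reject infeasible sub-goals and trigger re-planning. Third, a single policy trained with GRPO drives every agent through role conditioning. On the TravelPlanner dataset, HiMAP-Travel with Qwen3-8B achieves a Final Pass Rate (FPR) of 52.78% on validation and 52.65% on test. In a controlled comparison using the same model, training procedure, and tools, it outperforms the sequential DeepTravel baseline by +8.67 pp. It also outperforms ATLAS by +17.65 pp and MTP by +10.0 pp. On the multi-turn scenarios of FlexTravelBench, it achieves 44.34% (2-turn) and 37.42% (3-turn) FPR while reducing latency by 2.5x through parallelization.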

Original Abstract

Sequential LLM agents fail on long-horizon planning with hard constraints like budgets and diversity requirements. As planning progresses and context grows, these agents drift from global constraints. We propose HiMAP-Travel, a hierarchical multi-agent framework that splits planning into strategic coordination and parallel day-level execution. A Coordinator allocates resources across days, while Day Executors plan independently in parallel. Three key mechanisms enable this: a transactional monitor enforcing budget and uniqueness constraints across parallel agents, a bargaining protocol allowing agents to reject infeasible sub-goals and trigger re-planning, and a single policy trained with GRPO that powers all agents through role conditioning. On TravelPlanner, HiMAP-Travel with Qwen3-8B achieves 52.78% validation and 52.65% test Final Pass Rate (FPR). In a controlled comparison with identical model, training, and tools, it outperforms the sequential DeepTravel baseline by +8.67 pp. It also surpasses ATLAS by +17.65 pp and MTP by +10.0 pp. On FlexTravelBench multi-turn scenarios, it achieves 44.34% (2-turn) and 37.42% (3-turn) FPR while reducing latency by 2.5x through parallelization.
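
The abstract does not say how the transactional monitor is implemented. As a rough illustration only, the Python sketch below shows one way a shared monitor could enforce a global budget and venue uniqueness while day-level planners run in parallel. Every name here (`TransactionalMonitor`, `try_commit`, `plan_day`, `plan_trip`) is hypothetical, the executors are toy functions rather than LLM agents, and a lock-guarded ledger is an assumed stand-in for whatever mechanism the paper actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TransactionalMonitor:
    """Hypothetical shared ledger; each commit is atomic and all-or-nothing."""

    def __init__(self, total_budget: int) -> None:
        self._lock = threading.Lock()
        self.remaining = total_budget
        self.used_venues: set[str] = set()

    def try_commit(self, venue: str, cost: int) -> bool:
        """Reserve a venue and its cost, or change nothing and return False."""
        with self._lock:
            if venue in self.used_venues or cost > self.remaining:
                return False
            self.used_venues.add(venue)
            self.remaining -= cost
            return True


def plan_day(monitor, day, candidates):
    """Toy Day Executor: commit the first candidate the monitor accepts."""
    for venue, cost in candidates:
        if monitor.try_commit(venue, cost):
            return day, venue
    # No candidate fits: report the sub-goal as infeasible so that a
    # coordinator could re-plan (the re-planning step is not modeled here).
    return day, None


def plan_trip(total_budget, candidates_by_day):
    """Run one toy executor per day in parallel against a shared monitor."""
    monitor = TransactionalMonitor(total_budget)
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(plan_day, monitor, day, candidates)
            for day, candidates in candidates_by_day.items()
        ]
        plan = dict(future.result() for future in futures)
    return plan, monitor.remaining
```

Because the check and the update happen under a single lock, two executors can never both claim the same venue or jointly overspend, whichever thread runs first. A day that returns `None` corresponds loosely to an agent rejecting an infeasible sub-goal, which in the paper's framing would trigger re-planning.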
