2603.24936v1 Mar 26, 2026 cs.CV

TIGFlow-GRPO: Trajectory Forecasting via Interaction-Aware Flow Matching and Reward-Driven Optimization

Wenhuan Lu

Xuepeng Jing

Haoliang Meng

Zhizhi Yu

Jianguo Wei

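Human trajectory forecasting is essential for intelligent multimedia systems that operate in visually complex environments such as autonomous driving and crowd surveillance. Conditional Flow Matching (CFM) has proven highly capable of modeling trajectory distributions from spatio-temporal observations, but existing approaches focus mainly on supervised learning, which can leave social norms and scene constraints insufficiently reflected in the generated trajectories. To address this, we propose TIGFlow-GRPO, a two-stage generative framework that links behavioral rules with flow-based trajectory generation. In the first stage, we build a CFM-based predictor equipped with a Trajectory-Interaction-Graph (TIG) module to model fine-grained visual-spatial interactions and strengthen context encoding. This stage captures agent-agent and agent-scene relations more effectively, providing more useful conditional features for the subsequent alignment. In the second stage, we perform Flow-GRPO post-training: the deterministic flow rollout is reformulated as stochastic ODE-to-SDE sampling to enable trajectory exploration, and a composite reward combines view-aware social compliance with map-aware physical feasibility. By evaluating the trajectories explored through SDE rollouts, GRPO progressively steers multimodal predictions toward behaviorally plausible futures. Experiments on the ETH/UCY and SDD datasets show that TIGFlow-GRPO improves prediction accuracy and long-horizon stability while producing trajectories that are more socially compliant and physically feasible. These results suggest that the proposed framework offers an effective way to connect flow-based trajectory modeling with behavior-aware alignment in dynamic multimedia environments.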
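
The ODE-to-SDE reformulation mentioned in the abstract can be illustrated with a standard identity. The abstract does not give the paper's exact equations, so the block below is only a generic sketch: the symbols v_\theta (learned velocity field), c (context features), p_t (marginal density at time t), \sigma_t (noise scale), and w_t (Wiener process) are assumed notation, not taken from the paper. Adding a score-based drift correction makes the stochastic rollout share the marginals of the deterministic one, because the extra drift term cancels the diffusion term in the Fokker-Planck equation.

```latex
% Deterministic flow rollout (ODE), conditioned on context c
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v_\theta(x_t, t, c)

% Stochastic rollout (SDE) with the same marginals p_t for any noise scale \sigma_t \ge 0
\mathrm{d}x_t = \Big[\, v_\theta(x_t, t, c)
    + \tfrac{\sigma_t^{2}}{2}\, \nabla_{x} \log p_t(x_t \mid c) \,\Big]\, \mathrm{d}t
    + \sigma_t\, \mathrm{d}w_t
```

Setting \sigma_t = 0 recovers the deterministic rollout, while \sigma_t > 0 injects the randomness that lets a policy-gradient method such as GRPO compare different sampled trajectories for the same observed scene.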

Original Abstract

Human trajectory forecasting is important for intelligent multimedia systems operating in visually complex environments, such as autonomous driving and crowd surveillance. Although Conditional Flow Matching (CFM) has shown strong ability in modeling trajectory distributions from spatio-temporal observations, existing approaches still focus primarily on supervised fitting, which may leave social norms and scene constraints insufficiently reflected in generated trajectories. To address this issue, we propose TIGFlow-GRPO, a two-stage generative framework that aligns flow-based trajectory generation with behavioral rules. In the first stage, we build a CFM-based predictor with a Trajectory-Interaction-Graph (TIG) module to model fine-grained visual-spatial interactions and strengthen context encoding. This stage captures both agent-agent and agent-scene relations more effectively, providing more informative conditional features for subsequent alignment. In the second stage, we perform Flow-GRPO post-training, where deterministic flow rollout is reformulated as stochastic ODE-to-SDE sampling to enable trajectory exploration, and a composite reward combines view-aware social compliance with map-aware physical feasibility. By evaluating trajectories explored through SDE rollout, GRPO progressively steers multimodal predictions toward behaviorally plausible futures. Experiments on the ETH/UCY and SDD datasets show that TIGFlow-GRPO improves forecasting accuracy and long-horizon stability while generating trajectories that are more socially compliant and physically feasible. These results suggest that the proposed framework provides an effective way to connect flow-based trajectory modeling with behavior-aware alignment in dynamic multimedia environments.
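
To make the reward-driven stage more concrete, here is a minimal Python sketch of how a composite reward and GRPO-style group-relative advantages might be computed for a group of sampled trajectories. It is not the authors' implementation: the function names, the distance-based social term, the axis-aligned obstacle boxes, and the equal weights are all hypothetical stand-ins for the view-aware and map-aware rewards the abstract describes. Only the advantage normalization (reward minus group mean, divided by group standard deviation) follows the usual GRPO recipe.

```python
import math
import statistics


def social_reward(traj, neighbor_trajs, safe_dist=0.5):
    """Hypothetical social term: 1.0 when the closest same-step gap to any
    neighbor is at least safe_dist, shrinking linearly toward 0 below that."""
    gaps = [math.dist(p, q) for other in neighbor_trajs for p, q in zip(traj, other)]
    if not gaps:
        return 1.0
    return min(1.0, min(gaps) / safe_dist)


def map_reward(traj, obstacles):
    """Hypothetical map term: fraction of waypoints lying outside every
    axis-aligned obstacle box given as (xmin, ymin, xmax, ymax)."""
    def blocked(p):
        return any(x0 <= p[0] <= x1 and y0 <= p[1] <= y1
                   for x0, y0, x1, y1 in obstacles)
    return sum(not blocked(p) for p in traj) / len(traj)


def composite_reward(traj, neighbor_trajs, obstacles, w_social=0.5, w_map=0.5):
    """Weighted sum of the two terms; the weights are illustrative assumptions."""
    return (w_social * social_reward(traj, neighbor_trajs)
            + w_map * map_reward(traj, obstacles))


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: standardize each reward within its sample group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Toy scene: one neighbor walking along x = 1, one obstacle box ahead of the agent.
neighbor_trajs = [[(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]]
obstacles = [(-0.5, 1.5, 0.5, 2.5)]

# A group of three candidate futures for the same agent (e.g. from SDE rollouts).
group = [
    [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],    # ends inside the obstacle
    [(0.0, 0.0), (0.9, 1.0), (0.9, 2.0)],    # crowds the neighbor
    [(0.0, 0.0), (-1.0, 1.0), (-1.0, 2.0)],  # avoids both
]

rewards = [composite_reward(t, neighbor_trajs, obstacles) for t in group]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

In this toy group, the third trajectory receives the largest advantage because it neither enters the obstacle nor crowds the neighbor, so a policy update weighted by these advantages would shift probability mass toward that kind of future.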
