2604.26733v1 Apr 29, 2026 cs.AI

FutureWorld: A Live Environment for Training Predictive Agents with Real-World Outcome Rewards

Yanzhi Zhang (Citations: 41, h-index: 3)
Haoxiang Guan (Citations: 43, h-index: 3)
Jiyan He (Citations: 45, h-index: 3)
Shuxin Zheng (Citations: 68, h-index: 4)
Zhixin Han (Citations: 51, h-index: 3)
Zhuang Yu (Citations: 41, h-index: 2)
Chuyang Wei (Citations: 5, h-index: 1)
Mao Gao (Citations: 12, h-index: 1)
Xiawei Yue (Citations: 386, h-index: 2)
Jian Li (Citations: 67, h-index: 4)
Yitong Duan (Citations: 35, h-index: 3)
Mengting Hu (Citations: 107, h-index: 4)
Ke Chen (Citations: 14, h-index: 3)
Yu Shi (Citations: 682, h-index: 7)

Original Abstract

Live future prediction is the task of predicting real-world events before they unfold. The task is increasingly studied with agent systems based on large language models, and it matters for building agents that can continually learn from the real world. Just as interactive environments have often driven progress in agents, advancing live future prediction naturally motivates treating it as a learning environment. Prior work has explored future prediction from several angles but has generally not framed it as a unified learning environment. The task is appealing for learning because it can supply a large number of prediction questions grounded in diverse real-world events while preventing answer leakage. To leverage these advantages, we present FutureWorld, a live agentic reinforcement learning environment that closes the training loop between prediction, outcome realization, and parameter updates. In our environment, we take three open-source base models and train them over consecutive days. The results show that training is effective. Furthermore, we build a daily benchmark on top of the environment and evaluate several frontier agents on it to establish performance baselines for current agent systems.
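The loop the abstract describes (predict, wait for the outcome, then update) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the abstract does not specify the question schema, the reward function, or the update rule, so the `Question` and `Prediction` types, the Brier-style reward, and the `collect_rewards` helper are hypothetical and not FutureWorld's actual interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Question:
    """A binary question that resolves on a future date (hypothetical schema)."""
    text: str
    resolve_on: date
    outcome: Optional[int] = None  # stays None until the real event has happened


@dataclass
class Prediction:
    question: Question
    prob_yes: float  # the agent's probability that the outcome is 1


def brier_reward(prob_yes: float, outcome: int) -> float:
    """Assumed reward: 1 minus the squared error, so higher is better."""
    return 1.0 - (prob_yes - outcome) ** 2


def collect_rewards(pending: list[Prediction], today: date):
    """Score predictions whose questions have resolved; keep the rest pending."""
    scored, still_pending = [], []
    for pred in pending:
        q = pred.question
        if q.outcome is not None and q.resolve_on <= today:
            scored.append((pred, brier_reward(pred.prob_yes, q.outcome)))
        else:
            still_pending.append(pred)
    # The (prediction, reward) pairs in `scored` would feed a policy update,
    # which is omitted here because the abstract does not describe it.
    return scored, still_pending
```

Because the outcome does not exist when the prediction is made, the reward cannot leak into the agent's context; it only becomes available once `outcome` is filled in after the resolution date.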

0 Citations

0 Influential

3.5 Altmetric

17.5 Score
