2605.01789v1 May 03, 2026 cs.AI

DataEvolver: Let Your Data Build and Improve Itself via Goal-Driven Loop Agents

Kongming Liang

Qisong Zhang

Wenzhuo Wu

Zhuangzhuang Jia

Yunhao Yang

Huayu Zhang

Xianghao Zang

Zhixiang He

Zhongjiang He

Zhanyu Ma

School of Artificial Intelligence, Beijing University of Posts and Telecommunications

Institute for Artificial Intelligence, China Telecom

Abstract

Constructing controllable visual data is a major bottleneck for image editing and multimodal understanding. Useful supervision is rarely produced by a single rendering pass; instead it emerges through iterative generation, inspection, correction, filtering, and export. We present DataEvolver, a closed-loop visual data engine that organizes this process around explicit goals, persistent artifacts, bounded corrective actions, and acceptance decisions. DataEvolver supports multiple artifact types, including RGB images, masks, depth maps, normal maps, meshes, poses, trajectories, and review traces. In the current release, the system operates through two coupled loops: generation-time self-correction within each sample and validation-time self-expansion across dataset rounds. We validate the framework on an image-level object-rotation setting. With a fixed Qwen-Edit LoRA probe, our final Ours+DualGate model outperforms both the unadapted base model and a public multi-angle LoRA on SpatialEdit and a held-out evaluation set. Ablations show a consistent improvement path from scene-aware generation to feedback-driven correction and dual-gated validation. Beyond the released rotation data, our main contribution is a reusable framework for building visual datasets through explicit goal tracking, review, correction, and acceptance loops.
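The two coupled loops described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `Sample`, `self_correct`, `expand_dataset`, the scalar `goal` (standing in for something like a target rotation angle), the tolerance, the step cap, and the `validate` gate are all hypothetical names and values chosen for illustration. The sketch only shows the control flow of goal tracking, bounded correction, review traces, and acceptance decisions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Sample:
    """Hypothetical artifact: a goal, the current result, and a review trace."""
    goal: float                      # assumed target, e.g. a rotation angle in degrees
    value: float                     # what the generator currently produced
    trace: List[str] = field(default_factory=list)


def self_correct(sample: Sample, tolerance: float = 1.0,
                 max_steps: int = 5, step_cap: float = 10.0) -> bool:
    """Generation-time loop: inspect the sample, apply a bounded fix, repeat.

    Returns True if the sample is accepted within the step budget.
    """
    for step in range(max_steps):
        error = sample.goal - sample.value
        if abs(error) <= tolerance:
            sample.trace.append(f"step {step}: accept (error={error:.1f})")
            return True
        # Bounded corrective action: never move more than step_cap at once.
        delta = max(-step_cap, min(step_cap, error))
        sample.value += delta
        sample.trace.append(f"step {step}: correct by {delta:+.1f}")
    sample.trace.append("reject: correction budget exhausted")
    return False


def expand_dataset(goal_rounds: List[List[float]],
                   generate: Callable[[float], float],
                   validate: Callable[[Sample], bool] = lambda s: True) -> List[Sample]:
    """Validation-time loop: grow the dataset round by round.

    A sample is kept only if it passes self-correction and the
    (assumed, user-supplied) validation gate.
    """
    dataset: List[Sample] = []
    for goals in goal_rounds:
        for goal in goals:
            sample = Sample(goal=goal, value=generate(goal))
            if self_correct(sample) and validate(sample):
                dataset.append(sample)
    return dataset
```

For example, calling `expand_dataset([[0.0, 90.0], [180.0]], generate=lambda g: g - 25.0)` simulates a generator that is always 25 units off target. Each sample is pulled back within tolerance in a few bounded steps, and the per-sample `trace` records every review and correction decision.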
