2603.09072v1 Mar 10, 2026 cs.HC

A Text-Native Interface for Generative Video Authoring

Dingzeyu Li

Citations: 297

h-index: 5

Mira Dontcheva

Citations: 560

h-index: 4

Xingyu Bruce Liu

UCLA

Citations: 417

h-index: 8

Abstract

Everyone can write their stories in freeform text format -- it's something we all learn in school. Yet storytelling via video requires one to learn specialized and complicated tools. In this paper, we introduce Doki, a text-native interface for generative video authoring, aligning video creation with the natural process of text writing. In Doki, writing text is the primary interaction: within a single document, users define assets, structure scenes, create shots, refine edits, and add audio. We articulate the design principles of this text-first approach and demonstrate Doki's capabilities through a series of examples. To evaluate its real-world use, we conducted a week-long deployment study with participants of varying expertise in video authoring. This work contributes a fundamental shift in generative video interfaces, demonstrating a powerful and accessible new way to craft visual stories.

0 Citations

0 Influential

4 Altmetric

20.0 Score
