2604.03004v1 Apr 03, 2026 cs.CL

R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning

Chenliang Li

Citations: 76

h-index: 5

Bo Zhang

Citations: 75

h-index: 4

Wanlong Liu

Citations: 354

h-index: 5

Shaopeng Lai

Citations: 152

h-index: 5

Yuning Wu

Citations: 18

h-index: 2

Xuanyu Lei

Tsinghua University

Citations: 1,387

h-index: 6

Ming Yan

Citations: 43

h-index: 2

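Deep reasoning with long chains of thought has driven remarkable progress for large language models in verifiable domains such as mathematics, but its effect on open-ended tasks such as writing has not yet been explored. In this paper, a systematic investigation shows that existing mainstream reasoning models achieve only limited gains on open-ended writing tasks. Further analysis shows that these models exhibit too few deep reflection and revision patterns in open-ended writing, and as a result their gains are markedly smaller than on mathematical reasoning tasks. To address this limitation, we propose R2-Write, an automated framework that uses iterative writer-judge interaction to generate high-quality thinking trajectories with explicit reflection and revision patterns. To prevent redundant reflection, we design a reward mechanism that supervises the quality of reflection during reinforcement learning, improving both performance and token efficiency. Extensive experiments on a range of creative writing and deep-research benchmarks show substantial gains, demonstrating that explicitly incorporating reflection and revision patterns improves deep reasoning for open-ended writing tasks.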

Original Abstract

While deep reasoning with long chain-of-thought has dramatically improved large language models in verifiable domains like mathematics, its effectiveness for open-ended tasks such as writing remains unexplored. In this paper, we conduct a systematic investigation revealing that existing mainstream reasoning models achieve limited gains on open-ended writing tasks. Our further analysis shows that these models lack deep reflection and revision patterns in open-ended writing, resulting in substantially smaller improvements compared to mathematical reasoning tasks. To address this limitation, we introduce R2-Write: an automated framework that synthesizes high-quality thinking trajectories enriched with explicit reflection and revision patterns through iterative writer-judge interaction. To prevent redundant reflections, we design a process reward mechanism that supervises reflection quality during reinforcement learning, improving both performance and token efficiency. Extensive experiments across multiple creative writing and deep-research benchmarks demonstrate significant improvements, validating that explicitly incorporating reflection and revision patterns unlocks deep reasoning capabilities for open-ended writing tasks.
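The iterative writer-judge interaction described in the abstract can be pictured with a minimal sketch. This is not the paper's implementation: the names `synthesize_trajectory`, `Trajectory`, `toy_writer`, and `toy_judge`, the score threshold, and the round limit are all hypothetical, and the toy writer and judge stand in for what would presumably be model calls. The sketch only shows how a draft, the judge's critiques (recorded as reflections), and the rewritten drafts (recorded as revisions) could be collected into one trajectory.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# All names below are illustrative assumptions, not taken from the paper.
Writer = Callable[[str, Optional[str], Optional[str]], str]
Judge = Callable[[str, str], Tuple[float, str]]


@dataclass
class Trajectory:
    # Each step is (kind, text), where kind is "draft", "reflection", or "revision".
    steps: List[Tuple[str, str]] = field(default_factory=list)
    final_draft: str = ""


def synthesize_trajectory(prompt: str, writer: Writer, judge: Judge,
                          max_rounds: int = 3, pass_score: float = 0.8) -> Trajectory:
    """Alternate judge critiques and writer revisions until the judge is satisfied."""
    traj = Trajectory()
    draft = writer(prompt, None, None)  # initial draft, no critique yet
    traj.steps.append(("draft", draft))
    for _ in range(max_rounds):
        score, critique = judge(prompt, draft)
        if score >= pass_score:
            break  # good enough: stop instead of adding a redundant reflection
        traj.steps.append(("reflection", critique))
        draft = writer(prompt, draft, critique)  # revise using the critique
        traj.steps.append(("revision", draft))
    traj.final_draft = draft
    return traj


def toy_writer(prompt: str, draft: Optional[str], critique: Optional[str]) -> str:
    # Stand-in for a writer model: appends a sentence on every revision.
    if draft is None:
        return f"Draft about {prompt}."
    return draft + " Added a concrete example."


def toy_judge(prompt: str, draft: str) -> Tuple[float, str]:
    # Stand-in for a judge model: rewards each mention of "example".
    score = min(1.0, 0.5 + 0.2 * draft.count("example"))
    return score, "Needs a concrete example."


if __name__ == "__main__":
    result = synthesize_trajectory("rain", toy_writer, toy_judge)
    for kind, text in result.steps:
        print(f"[{kind}] {text}")
```

Stopping as soon as the judge's score clears the threshold is one simple way to avoid piling up reflections that no longer change the draft; the abstract's process reward addresses redundant reflection during reinforcement learning, which this sketch does not attempt to model.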

2 Citations

0 Influential

3 Altmetric

17.0 Score
