2604.10135v1 Apr 11, 2026 cs.CL

Think in Sentences: Explicit Sentence Boundaries Enhance Language Model's Capabilities

Zhicheng Liu

Citations: 65

h-index: 4

Yongyuan Li

Citations: 24

h-index: 1

Yang Xu

Citations: 31

h-index: 1

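Researchers have tried various ways to improve the performance of large language models (LLMs), in particular by inserting dummy tokens into the context. Existing studies, however, focus only on the dummy tokens themselves and do not exploit the sentence-level structure inherent in natural language. This is an important oversight, because LLMs acquire their linguistic ability through exposure to human-written text, which is inherently organized into sentences. Recognizing this gap, we propose inserting delimiters at sentence boundaries in LLM inputs. The method not only integrates dummy tokens into the context but also helps the LLM process text sentence by sentence during reasoning. We evaluate two concrete methods, (1) in-context learning and (2) supervised fine-tuning, on models ranging from 7B parameters to the 600B DeepSeek-V3. The experiments show consistent improvements across a variety of tasks, with notable gains of up to 7.7% on GSM8k and 12.5% on DROP. We also find that the internal representations of the fine-tuned LLMs show sentence awareness. This work presents a simple yet effective technique for improving LLM performance and points to a promising direction for cognitively inspired LLM enhancement.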

Original Abstract

Researchers have explored different ways to improve the capabilities of large language models (LLMs) by inserting dummy tokens into the context. However, existing work focuses solely on the dummy tokens themselves and fails to leverage the inherent sentence-level structure of natural language. This is a critical oversight, as LLMs acquire linguistic capabilities through exposure to human-generated text, which is inherently structured at the sentence level. Motivated by this gap, we propose an approach that inserts delimiters at sentence boundaries in LLM inputs, which not only integrates dummy tokens into the context but also encourages LLMs to process text sentence by sentence during reasoning. We experiment with two concrete methods, (1) in-context learning and (2) supervised fine-tuning, on models ranging from 7B parameters to the 600B DeepSeek-V3. Our results demonstrate consistent improvements across various tasks, with notable gains of up to 7.7% on GSM8k and 12.5% on DROP. Furthermore, the internal representations of the fine-tuned LLMs show evidence of sentence awareness. Our work establishes a simple yet effective technique for enhancing LLM capabilities and offers a promising direction for cognitively inspired LLM enhancement.
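
The core idea of inserting a delimiter at each sentence boundary of an LLM input can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the delimiter token or the sentence splitter, so the `<sent>` string, the punctuation-based boundary rule, and the function names below are all assumptions made for illustration.

```python
import re

# Hypothetical delimiter string; the abstract does not say which token the authors use.
DELIMITER = " <sent> "

# Naive boundary rule (an assumption): sentence-ending punctuation followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Split text into sentences with a simple punctuation heuristic."""
    return [s for s in _BOUNDARY.split(text.strip()) if s]


def insert_delimiters(text: str, delimiter: str = DELIMITER) -> str:
    """Rejoin the sentences of `text`, placing `delimiter` at each boundary."""
    return delimiter.join(split_sentences(text))


if __name__ == "__main__":
    prompt = "Tom has 3 apples. He buys 2 more. How many apples does he have?"
    # The delimited prompt would then be passed to the model in place of the raw one.
    print(insert_delimiters(prompt))
```

The regex heuristic will mis-split abbreviations such as "Dr. Smith", so a real pipeline would likely swap in a proper sentence segmenter. The delimited text could be used either as the prompt format for in-context learning or as the input format of supervised fine-tuning data, the two settings named in the abstract.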
