2601.21708v1 Jan 29, 2026 cs.AI

FBS: Modeling Native Parallel Reading inside a Transformer

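Large language models (LLMs) perform well across a wide range of tasks, yet inference is still dominated by strict token-by-token autoregression. Most existing acceleration methods merely patch this pipeline and miss core ingredients of human reading: content-adaptive foresight, compute allocation that accounts for chunk structure, and train-test consistency for previewing and skimming. This paper proposes the Fovea-Block-Skip Transformer (FBS), which injects a causal, trainable loop into the Transformer through the Parafovea-Attention Window (PAW), Chunk-Head (CH), and Skip-Gate (SG). Experiments on diverse benchmarks show that FBS improves the quality-efficiency trade-off without increasing the parameter count, and ablation studies show that the three modules are complementary.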

Original Abstract

Large language models (LLMs) excel across many tasks, yet inference is still dominated by strictly token-by-token autoregression. Existing acceleration methods largely patch this pipeline and miss core human-reading ingredients: content-adaptive foresight, chunk-structure-aware compute allocation, and train-test consistency for preview/skimming. We propose the Fovea-Block-Skip Transformer (FBS), which injects a causal, trainable loop into Transformers via Parafovea-Attention Window (PAW), Chunk-Head (CH), and Skip-Gate (SG). Across diverse benchmarks, FBS improves the quality-efficiency trade-off without increasing parameters, and ablations show the three modules are complementary.

8 Citations

0 Influential

1.5 Altmetric

15.5 Score
