2505.19147 May 25, 2025 cs.AI

Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Xuyang Liu

Citations: 260

h-index: 9

Zichen Wen

Citations: 945

h-index: 16

Shaobo Wang

Citations: 486

h-index: 11

Junjie Chen

Anhui Polytechnic University

Citations: 309

h-index: 7

Zhishan Tao

Citations: 37

h-index: 1

Yubo Wang

Citations: 80

h-index: 5

Xiangqi Jin

Citations: 121

h-index: 5

Chang Zou

Citations: 236

h-index: 3

Yiyu Wang

Citations: 132

h-index: 6

Chenfei Liao

Citations: 214

h-index: 8

Xu Zheng

Citations: 432

h-index: 13

Weijia Li

Citations: 877

h-index: 17

Xuming Hu

Citations: 237

h-index: 8

Conghui He

Citations: 408

h-index: 11

Linfeng Zhang

Citations: 645

h-index: 15

Honggang Chen

Citations: 1,983

h-index: 24

Original Abstract

The advancement of large language models (LLMs) and multi-modal LLMs (MLLMs) has historically relied on scaling model parameters. However, as hardware limits constrain further model growth, the primary computational bottleneck has shifted to the quadratic cost of self-attention over the increasingly long sequences produced by ultra-long text contexts, high-resolution images, and extended videos. In this position paper, we argue that the focus of research for efficient artificial intelligence (AI) is shifting from model-centric compression to data-centric compression. We position data-centric compression as the emerging paradigm, which improves AI efficiency by directly compressing the volume of data processed during model training or inference. To formalize this shift, we establish a unified framework for existing efficiency strategies and demonstrate why it constitutes a crucial paradigm change for long-context AI. We then systematically review the landscape of data-centric compression methods, analyzing their benefits across diverse scenarios. Finally, we outline key challenges and promising future research directions. Our work aims to provide a novel perspective on AI efficiency, synthesize existing efforts, and catalyze innovation to address the challenges posed by ever-increasing context lengths.

37 Citations

0 Influential

12 Altmetric

97.0 Score

AI Analysis

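Summary

This paper argues that AI efficiency research should shift from "model-centric compression", which focuses on reducing parameters, to "data-centric compression", which shortens the input sequence. As the context length that LLMs and MLLMs must process has grown exponentially, the quadratic cost of attention ($O(n^2)$ in the sequence length $n$), rather than model size, has become the main bottleneck. The authors present a unified framework that covers token pruning and token merging, and argue that this approach can substantially improve the training and inference efficiency of next-generation AI systems that handle long contexts.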

Key Innovations

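Proposes shifting the efficiency-optimization paradigm from model-centric to data-centric compression
Formulates a unified efficiency framework spanning architecture design, model compression, and data compression
Systematically categorizes data-centric compression methods, including token pruning and token merging
Covers KV cache optimization and memory-reduction strategies for long-context processing
Reports the empirical finding that random token dropping can be more effective than elaborate importance scoring, and points out the position bias of existing attention-based scores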
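The three token-reduction styles named in the list above (score-based pruning, random dropping, and merging) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of plain Python lists as token embeddings, and the externally supplied `scores` are all assumptions made for illustration. The merging step averages fixed adjacent pairs, which is a deliberate simplification; published token-merging methods typically choose which tokens to merge by similarity.

```python
import random
from typing import List, Sequence

Token = List[float]  # hypothetical stand-in for one token embedding


def prune_top_k(tokens: Sequence[Token], scores: Sequence[float], keep: int) -> List[Token]:
    """Score-based pruning: keep the `keep` highest-scoring tokens in original order."""
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    return [tokens[i] for i in sorted(ranked)]


def prune_random(tokens: Sequence[Token], keep: int, seed: int = 0) -> List[Token]:
    """Random dropping: keep `keep` uniformly sampled tokens in original order."""
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(len(tokens)), keep))
    return [tokens[i] for i in kept]


def merge_adjacent(tokens: Sequence[Token]) -> List[Token]:
    """Merging (simplified): average each consecutive pair; an odd trailing token is kept."""
    merged = []
    for i in range(0, len(tokens) - 1, 2):
        merged.append([(x + y) / 2 for x, y in zip(tokens[i], tokens[i + 1])])
    if len(tokens) % 2 == 1:
        merged.append(list(tokens[-1]))
    return merged


# Toy input: five 2-dimensional tokens with made-up importance scores.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0], [3.0, 3.0]]
scores = [0.1, 0.9, 0.5, 0.7, 0.2]

pruned = prune_top_k(tokens, scores, keep=3)
dropped = prune_random(tokens, keep=3)
merged = merge_adjacent(tokens)
```

All three functions shorten the sequence from five tokens to three. Pruning and dropping discard information outright, while merging folds it into the surviving tokens. Random dropping needs no scores at all, which is what makes it a cheap baseline for comparison against importance-based selection.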

Learning & Inference Impact

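During training, reducing the absolute amount of data to process (the number of tokens) lowers compute cost, and removing redundant or low-information tokens raises the quality of the training data, which can improve the model's generalization. During inference, shortening the input sequence reduces the attention cost of Transformer models quadratically and KV cache memory usage linearly, which lowers latency on long contexts. In addition, many data-centric compression techniques can be applied at inference time without retraining (plug-and-play), so they are broadly compatible with existing models.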
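The quadratic and linear scaling described above can be checked with a rough cost model. The sketch below assumes standard multi-head attention with no grouped-query sharing, counts one multiply-add as two FLOPs, ignores softmax and the projection layers, and stores the cache in 16-bit values. The function names and the example configuration (32 layers, hidden size 4096, 32,768 tokens) are hypothetical and not taken from the paper.

```python
def attention_flops(n_tokens: int, d_model: int, n_layers: int) -> int:
    """Approximate FLOPs of QK^T plus the weighted sum over V: 4 * n^2 * d per layer."""
    return 4 * n_tokens * n_tokens * d_model * n_layers


def kv_cache_bytes(n_tokens: int, d_model: int, n_layers: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache K and V: two d_model vectors per token per layer."""
    return 2 * n_tokens * d_model * n_layers * bytes_per_value


# Hypothetical configuration, chosen only to make the scaling concrete.
D_MODEL, N_LAYERS = 4096, 32
full, halved = 32768, 16384  # keep 50% of the tokens

flops_ratio = attention_flops(halved, D_MODEL, N_LAYERS) / attention_flops(full, D_MODEL, N_LAYERS)
cache_ratio = kv_cache_bytes(halved, D_MODEL, N_LAYERS) / kv_cache_bytes(full, D_MODEL, N_LAYERS)
full_cache_gib = kv_cache_bytes(full, D_MODEL, N_LAYERS) / 2**30

print(f"attention FLOPs ratio: {flops_ratio}")  # 0.25
print(f"KV cache ratio: {cache_ratio}")         # 0.5
print(f"full KV cache: {full_cache_gib} GiB")   # 16.0
```

Under these assumptions, keeping half of the tokens cuts attention FLOPs to a quarter and the KV cache to half. The real end-to-end speedup is smaller because the projection and feed-forward layers scale only linearly with sequence length.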

Technical Difficulty

Intermediate

Estimated implementation complexity based on methodology.
