2602.15861v1 Jan 26, 2026 cs.CL

CAST: Achieving Stable LLM-based Text Analysis for Data Analytics

Yujia Liu, Zihao Li, Wei He, Rui Ding, Shi Han, Dongmei Zhang

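Text analysis of tabular data involves two core operations: \textit{summarization}, which extracts the main themes from the whole corpus, and \textit{tagging}, which labels each row. A key limitation of applying large language models (LLMs) to these tasks is that they fail to meet the high level of output stability that data analytics requires. To address this, we introduce \textbf{CAST} (\textbf{C}onsistency via \textbf{A}lgorithmic Prompting and \textbf{S}table \textbf{T}hinking), a framework that improves output stability by constraining the model's latent reasoning path. CAST (i) uses Algorithmic Prompting to provide a procedural scaffold for valid reasoning steps and (ii) uses Thinking-before-Speaking to enforce explicit intermediate steps before final generation. To measure progress, we introduce \textbf{CAST-S} and \textbf{CAST-T}, stability metrics for bulleted summarization and tagging, and validate how well they align with human judgments. In experiments on publicly available benchmarks with multiple LLM backbones, CAST consistently achieved the highest stability among all baselines, improving the stability score by up to 16.2\% while maintaining or improving output quality.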

Original Abstract

Text analysis of tabular data relies on two core operations: \emph{summarization} for corpus-level theme extraction and \emph{tagging} for row-level labeling. A critical limitation of employing large language models (LLMs) for these tasks is their inability to meet the high standards of output stability demanded by data analytics. To address this challenge, we introduce \textbf{CAST} (\textbf{C}onsistency via \textbf{A}lgorithmic Prompting and \textbf{S}table \textbf{T}hinking), a framework that enhances output stability by constraining the model's latent reasoning path. CAST combines (i) Algorithmic Prompting to impose a procedural scaffold over valid reasoning transitions and (ii) Thinking-before-Speaking to enforce explicit intermediate commitments before final generation. To measure progress, we introduce \textbf{CAST-S} and \textbf{CAST-T}, stability metrics for bulleted summarization and tagging, and validate their alignment with human judgments. Experiments across publicly available benchmarks on multiple LLM backbones show that CAST consistently achieves the best stability among all baselines, improving Stability Score by up to 16.2\%, while maintaining or improving output quality.
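To make the notion of output stability for tagging concrete, here is a minimal sketch that scores how often repeated runs of the same tagging prompt give a row the same label. It is an illustration only and is not the paper's CAST-T metric, which the abstract does not define. The function name `pairwise_tag_agreement` and the sample tags are assumptions made up for this example.

```python
from itertools import combinations


def pairwise_tag_agreement(runs):
    """Mean fraction of rows given identical tags across every pair of runs.

    `runs` is a list of equal-length lists; runs[i][j] is the tag that run i
    assigned to row j. Illustrative only: this is NOT the paper's CAST-T.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure stability")
    n_rows = len(runs[0])
    if n_rows == 0 or any(len(r) != n_rows for r in runs):
        raise ValueError("all runs must label the same non-empty set of rows")

    scores = []
    for a, b in combinations(runs, 2):
        # Count rows where both runs produced exactly the same tag.
        matches = sum(x == y for x, y in zip(a, b))
        scores.append(matches / n_rows)
    return sum(scores) / len(scores)


# Three hypothetical runs of the same tagging prompt over four rows.
runs = [
    ["billing", "login", "billing", "other"],
    ["billing", "login", "shipping", "other"],
    ["billing", "login", "billing", "other"],
]
score = pairwise_tag_agreement(runs)
print(f"pairwise agreement: {score:.3f}")
```

A score of 1.0 means every run labeled every row identically. Exact-match agreement is a deliberately crude choice: it treats near-synonymous tags as disagreements, which is one reason a purpose-built metric validated against human judgments may be preferable.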
