2601.15751v1 Jan 22, 2026 cs.AI

Tabular Incremental Inference

Xinda Chen

Citations: 1

h-index: 1

Zhen Xing

Fudan University

Citations: 38

h-index: 3

Hanyu Zhang

Citations: 3

h-index: 1

Weimin Tan

Citations: 297

h-index: 8

Bo Yan

Citations: 31

h-index: 3

Abstract

Tabular data is a fundamental form of data structure. The evolution of table analysis tools reflects humanity's continuous progress in data acquisition, management, and processing. Table columns change dynamically as a result of technological advances, changing needs, data integration, and similar factors. However, the standard process of training AI models on tables with fixed columns and then performing inference is not suited to tables whose columns change dynamically. New methods are therefore needed to handle such tables efficiently in an unsupervised manner. In this paper, we introduce a new task, Tabular Incremental Inference (TabII), which aims to enable trained models to incorporate new columns during the inference stage, making AI models more practical in scenarios where tables change dynamically. Furthermore, we demonstrate that this new task can be framed as an optimization problem based on information bottleneck theory, which indicates that the key to an ideal tabular incremental inference approach lies in minimizing the mutual information between the tabular data and the representation while maximizing the mutual information between the representation and the task labels. Under this guidance, we design a TabII method that uses Large Language Model placeholders and a Pretrained TabAdapter to provide external knowledge, and Incremental Sample Condensation blocks to condense the task-relevant information provided by incremental column attributes. Experimental results across eight public datasets show that TabII effectively utilizes incremental attributes, achieving state-of-the-art performance.
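
For orientation, the trade-off described in the abstract matches the general shape of the standard information bottleneck Lagrangian. The expression below is that textbook form only, not the paper's own formulation, which may differ. The symbols are assumptions introduced here: X denotes the tabular input, Z the learned representation, Y the task label, and \beta > 0 a trade-off weight.

\[
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
\]

Lowering I(X; Z) compresses the representation, while raising I(Z; Y) preserves what is predictive of the label; \beta sets the balance between the two.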

