2601.15751v2 Jan 22, 2026 cs.AI

Tabular Incremental Inference

Xinda Chen

Zhen Xing

Fudan University

Hanyu Zhang

Weimin Tan

Bo Yan

Abstract

Tabular data is a fundamental data structure. The evolution of table analysis tools reflects humanity's continuous progress in data acquisition, management, and processing. Table columns change dynamically as a result of technological advances, changing needs, data integration, and similar factors. However, the standard process of training AI models on tables with fixed columns and then performing inference is not suited to dynamically changing tables. New methods are therefore needed to handle such tables efficiently in an unsupervised manner. In this paper, we introduce a new task, Tabular Incremental Inference (TabII), which aims to enable trained models to incorporate new columns during the inference stage, making AI models more practical in scenarios where tables change dynamically. Furthermore, we show that this task can be framed as an optimization problem based on information bottleneck theory, under which the key to an ideal tabular incremental inference approach is to minimize the mutual information between the tabular data and the representation while maximizing the mutual information between the representation and the task labels. Guided by this, we design a TabII method that uses Large Language Model placeholders and a Pretrained TabAdapter to provide external knowledge, and Incremental Sample Condensation blocks to condense the task-relevant information provided by the incremental column attributes. Experimental results on eight public datasets show that TabII effectively utilizes incremental attributes, achieving state-of-the-art performance.
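
The information bottleneck framing described in the abstract is commonly written as a single trade-off objective. The following is a hedged sketch of that standard form, not the paper's own formulation: the symbols X (tabular input), Z (learned representation), Y (task label), and the trade-off weight β are assumptions introduced here for illustration, and the paper's notation and exact objective may differ.

```latex
% Standard information bottleneck Lagrangian (illustrative notation):
%   X = tabular input, Z = representation, Y = task label,
%   I(.;.) = mutual information, beta = assumed trade-off weight.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

Minimizing the first term compresses the representation with respect to the table, while the second term rewards retaining information about the label, which matches the two goals stated in the abstract.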
