2602.11700v1 Feb 12, 2026 cs.LG

TabSieve: Explicit In-Table Evidence Selection for Tabular Prediction

Lijun Li

Ziqi Miao

Yongyao Wang

Lu Yang

Haonan Jia

Wenting Yan

Chen Qian

Original Abstract

Tabular prediction can benefit from in-table rows as few-shot evidence, yet existing tabular models typically perform instance-wise inference and LLM-based prompting is often brittle. Models do not consistently leverage relevant rows, and noisy context can degrade performance. To address this challenge, we propose TabSieve, a select-then-predict framework that makes evidence usage explicit and auditable. Given a table and a query row, TabSieve first selects a small set of informative rows as evidence and then predicts the missing target conditioned on the selected evidence. To enable this capability, we construct TabSieve-SFT-40K by synthesizing high-quality reasoning trajectories from 331 real tables using a strong teacher model with strict filtering. Furthermore, we introduce TAB-GRPO, a reinforcement learning recipe that jointly optimizes evidence selection and prediction correctness with separate rewards, and stabilizes mixed regression and classification training via dynamic task-advantage balancing. Experiments on a held-out benchmark of 75 classification and 52 regression tables show that TabSieve consistently improves performance across shot budgets, with average gains of 2.92% on classification and 4.45% on regression over the second-best baseline. Further analysis indicates that TabSieve concentrates more attention on the selected evidence, which improves robustness to noisy context.
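
The select-then-predict flow described in the abstract can be illustrated with a toy sketch. This is not the authors' method: TabSieve uses a trained LLM for both steps, whereas the sketch below substitutes a nearest-neighbour selector and a majority-vote or mean predictor purely to show the two-stage interface. The function names (`select_evidence`, `predict`, `select_then_predict`), the `(features, target)` row layout, and the parameter `k` are all hypothetical.

```python
from collections import Counter
from math import dist
from statistics import mean


def select_evidence(rows, query, k=3):
    """Stage 1 (stand-in): pick the k rows whose features are closest to the query.

    ASSUMPTION: Euclidean nearest neighbours replace the learned selector.
    `rows` is a list of (features, target) pairs with numeric feature tuples.
    """
    ranked = sorted(range(len(rows)), key=lambda i: dist(rows[i][0], query))
    return ranked[:k]


def predict(rows, evidence_idx, task):
    """Stage 2 (stand-in): predict the target from the selected evidence only.

    ASSUMPTION: majority vote for classification, mean for regression.
    """
    targets = [rows[i][1] for i in evidence_idx]
    if task == "classification":
        return Counter(targets).most_common(1)[0][0]
    return mean(targets)


def select_then_predict(rows, query, task, k=3):
    """Return both the chosen evidence indices and the prediction,
    so the evidence behind each prediction stays inspectable."""
    evidence_idx = select_evidence(rows, query, k)
    return evidence_idx, predict(rows, evidence_idx, task)
```

Returning the evidence indices alongside the prediction mirrors the abstract's point that evidence usage should be explicit and auditable: a caller can check which rows a given prediction was conditioned on.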
