2602.14089v1 Feb 15, 2026 cs.DB

TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models

Rui Mao

Citations: 129

h-index: 4

Meihui Zhang

Citations: 20

h-index: 3

Zhi-Quan Luo

Citations: 114

h-index: 2

Zhaojing Luo

Citations: 512

h-index: 13

Abstract

Large language models (LLMs) have emerged as powerful tools for natural language table reasoning, where existing methods fall into two main categories. Prompt-based approaches rely on language-only inference or one-pass program generation without step-level verification. Agent-based approaches use tools in a closed loop, but verification is often local and backtracking is limited, allowing errors to propagate and increasing cost. Moreover, they rely on chain- or beam-style trajectories that are typically combinatorially redundant, leading to high token costs. In this paper, we propose TabTracer, an agentic framework that coordinates multi-step tool calls over intermediate table states, with explicit state tracking for verification and rollback. First, it enforces step-level verification with typed operations and lightweight numeric and format checks to provide reliable rewards and suppress hallucinations. Second, execution-feedback Monte Carlo Tree Search maintains a search tree of candidate table states and uses backpropagated reflection scores to guide UCB1 selection and rollback via versioned snapshots. Third, it reduces redundancy with budget-aware pruning, deduplication, and state hashing with a monotonicity gate to cut token cost. Comprehensive evaluation on the TabFact, WikiTQ, and CRT datasets shows that TabTracer outperforms state-of-the-art baselines by up to 6.7% in accuracy while reducing token consumption by 59-84%.
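The abstract names three mechanisms that a small example may make more concrete: UCB1 selection over a tree of candidate table states, backpropagation of scores, and deduplication by state hashing. The sketch below is illustrative only and is not the authors' implementation. The names `Node`, `state_hash`, `add_child`, `select_child`, and `backpropagate`, the tuple-of-rows table representation, the exploration constant, and the numeric scores are all assumptions made for this example; the paper's reflection scoring, typed operations, pruning, and monotonicity gate are not modeled.

```python
import hashlib
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One candidate table state in the search tree (hypothetical structure)."""
    rows: tuple                       # immutable snapshot of the table at this step
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_score: float = 0.0          # sum of backpropagated scores


def state_hash(rows: tuple) -> str:
    """Hash a table snapshot so identical states can be deduplicated."""
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 value: mean score plus an exploration bonus."""
    if child.visits == 0:
        return math.inf               # unvisited children are tried first
    mean = child.total_score / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(node: Node) -> Node:
    """Pick the child with the highest UCB1 value."""
    return max(node.children, key=lambda ch: ucb1(ch, node.visits))


def add_child(node: Node, rows: tuple, seen: set) -> Optional[Node]:
    """Attach a new table state unless an identical one was already explored."""
    key = state_hash(rows)
    if key in seen:
        return None                   # duplicate state: skip it
    seen.add(key)
    child = Node(rows=rows, parent=node)
    node.children.append(child)
    return child


def backpropagate(node: Optional[Node], score: float) -> None:
    """Add a score to the node and to every ancestor up to the root."""
    while node is not None:
        node.visits += 1
        node.total_score += score
        node = node.parent


# Toy usage: two distinct filtered tables and one duplicate.
header = ("city", "pop")
root = Node(rows=(header, ("A", 10), ("B", 20)))
seen = {state_hash(root.rows)}
keep_b = add_child(root, (header, ("B", 20)), seen)
keep_a = add_child(root, (header, ("A", 10)), seen)
duplicate = add_child(root, (header, ("B", 20)), seen)   # returns None
backpropagate(keep_b, 0.9)            # made-up score for illustration
backpropagate(keep_a, 0.2)            # made-up score for illustration
best = select_child(root)             # keep_b: equal visits, higher mean score
```

Because each node in this sketch stores its own immutable snapshot, rolling back amounts to continuing the search from `node.parent`, which loosely mirrors the versioned snapshots mentioned in the abstract.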

1 Citation

0 Influential

6.5 Altmetric

33.5 Score
