2601.12842v1 Jan 19, 2026 cs.AI

SCULPT: Constraint-Guided Pruned MCTS that Carves Efficient Paths for Mathematical Reasoning

Qitong Fang (Citations: 0, h-index: 0)

Haotian Li (Citations: 70, h-index: 3)

Xu Wang (Citations: 54, h-index: 4)

Abstract

Automated agent workflows can enhance the problem-solving ability of large language models (LLMs), but common search strategies rely on stochastic exploration and often traverse implausible branches. This occurs because current pipelines sample candidate steps from generic prompts or learned policies with weak domain priors, yielding near-random walks over operators, units, and formats. To promote ordered exploration, this paper introduces SCULPT, a constraint-guided approach for Monte Carlo Tree Search (MCTS) that integrates domain-aware scoring into selection, expansion, simulation, and backpropagation. SCULPT scores and prunes actions using a combination of symbolic checks (dimensional consistency, type compatibility, magnitude sanity, depth control, and diversity) and structural pattern guidance, thereby steering the search toward plausible reasoning paths. Under matched LLM configurations, SCULPT yields stable improvements on multiple datasets; additional results with GPT-5.2 assess executor transferability and performance on frontier reasoning models. Overall, domain-aware constraints can improve accuracy while maintaining efficiency and reasoning stability.
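To make the idea of constraint-guided scoring and pruning concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring functions, so the `Candidate` fields, the individual checks, the multiplicative combination, the pruning threshold, and the prior-weighted UCT formula are all assumptions chosen for illustration. Only three of the five named checks (dimensional consistency, magnitude sanity, depth control) are sketched.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate reasoning step of the form `left op right`."""
    op: str          # one of "+", "-", "*", "/"
    left_unit: str   # unit of the left operand, e.g. "m"
    right_unit: str  # unit of the right operand, e.g. "s"
    value: float     # numeric result of applying the step
    depth: int       # depth of the step in the search tree


def dimensional_score(c: Candidate) -> float:
    # Assumed rule: adding or subtracting mismatched units is implausible.
    if c.op in ("+", "-") and c.left_unit != c.right_unit:
        return 0.0
    return 1.0


def magnitude_score(c: Candidate, limit: float = 1e9) -> float:
    # Assumed rule: non-finite or very large intermediate values are rejected.
    if not math.isfinite(c.value) or abs(c.value) > limit:
        return 0.0
    return 1.0


def depth_score(c: Candidate, max_depth: int = 8) -> float:
    # Assumed rule: score decays linearly with depth and is zero past max_depth.
    return max(0.0, 1.0 - c.depth / max_depth)


def constraint_score(c: Candidate) -> float:
    # Product of checks, so any hard violation drives the score to zero.
    return dimensional_score(c) * magnitude_score(c) * depth_score(c)


def prune(candidates: list[Candidate], threshold: float = 0.1) -> list[Candidate]:
    # Keep only candidates whose combined score reaches the threshold.
    return [c for c in candidates if constraint_score(c) >= threshold]


def guided_uct(mean_value: float, visits: int, parent_visits: int,
               prior: float, c_explore: float = 1.4) -> float:
    # UCT-style selection value whose exploration bonus is scaled by the
    # constraint score (`prior`); the +1 terms avoid division by zero.
    bonus = math.sqrt(math.log(parent_visits + 1) / (visits + 1))
    return mean_value + c_explore * prior * bonus
```

In this sketch, `prune` would run during expansion to discard implausible steps, and `guided_uct` would be used during selection with `prior=constraint_score(child)`, so that low-scoring branches receive less exploration. A weighted sum could replace the product if soft penalties are preferred over hard vetoes.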
