2604.10827v1 Apr 12, 2026 cs.AI

Your Model Diversity, Not Method, Determines Reasoning Strategy

A. Gerogiannis (Citations: 29, h-index: 3)

Chia-Hsuan Lee (Citations: 0, h-index: 0)

Shi-Xiong Zhang (Citations: 108, h-index: 4)

Sambit Sahu (Citations: 72, h-index: 4)

Anirban Das (Citations: 109, h-index: 4)

Supriyo Chakraborty (Citations: 7, h-index: 2)

Moulik Choraria (Citations: 78, h-index: 4)

B. Kapusuzoglu (Citations: 290, h-index: 6)

Kartik Balasubramaniam (Citations: 0, h-index: 0)

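Scaling LLM reasoning efficiently requires allocating budget between exploring solution approaches (breadth) and refining promising solutions (depth). Most methods try to balance the two, yet it is unclear why a particular balance works, and validating on a single model hides the role of the model itself. This work argues that **the optimal strategy depends on the model's diversity profile, that is, the spread of probability mass across solution approaches, and that this profile must be characterized before any exploration strategy is adopted.** The authors analyze reasoning uncertainty through a theoretical framework and derive the conditions under which tree-style depth refinement outperforms parallel sampling. Validation on the Qwen-3 4B and Olmo-3 7B model families shows that lightweight signals suffice for depth-based refinement on low-diversity aligned models but offer limited benefit for high-diversity base models. The authors hypothesize that high-diversity models need stronger compensation for their lower exploration coverage.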

Original Abstract

Compute scaling for LLM reasoning requires allocating budget between exploring solution approaches (*breadth*) and refining promising solutions (*depth*). Most methods implicitly trade off one for the other, yet why a given trade-off works remains unclear, and validation on a single model obscures the role of the model itself. We argue that **the optimal strategy depends on the model's diversity profile, the spread of probability mass across solution approaches, and that this must be characterized before any exploration strategy is adopted.** We formalize this through a theoretical framework decomposing reasoning uncertainty and derive conditions under which tree-style depth refinement outperforms parallel sampling. We validate it on Qwen-3 4B and Olmo-3 7B families, showing that lightweight signals suffice for depth-based refinement on low-diversity aligned models while yielding limited utility for high-diversity base models, which we hypothesize require stronger compensation for lower exploration coverage.
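
The abstract describes a diversity profile as the spread of probability mass across solution approaches, but this page does not give the paper's actual measure. As a rough illustration only, the sketch below assumes each sampled solution has already been tagged with an approach label (the labels, the function names, and the use of Shannon entropy are all assumptions made here, not the authors' method) and reports how evenly the samples are spread over the observed approaches.

```python
import math
from collections import Counter


def approach_entropy(labels):
    """Shannon entropy (in nats) of the empirical distribution over approach labels.

    Assumption: `labels` holds one hypothetical approach tag per sampled solution.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def normalized_diversity(labels):
    """Entropy divided by its maximum for the observed number of approaches.

    Returns 0.0 when at most one approach is observed, and a value close to 1.0
    when samples are spread evenly over the observed approaches.
    """
    k = len(set(labels))
    if k <= 1:
        return 0.0
    return approach_entropy(labels) / math.log(k)


# Hypothetical samples: one model concentrates on a single approach,
# the other spreads evenly over four.
low_diversity_samples = ["algebra"] * 7 + ["casework"]
high_diversity_samples = ["algebra", "casework", "induction", "geometry"] * 2

print(normalized_diversity(low_diversity_samples))
print(normalized_diversity(high_diversity_samples))
```

Under this proxy, a model whose samples concentrate on one approach scores low and would, following the abstract's argument, be the kind of model for which depth-based refinement is expected to help; how approaches are actually identified and compared is left to the paper.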
