2604.14847v1 Apr 16, 2026 cs.AI

TrigReason: Trigger-Based Collaboration between Small and Large Reasoning Models

Cam-Tu Nguyen (Citations: 121, h-index: 6)

Z. Li (Citations: 2,245, h-index: 24)

Hai Zhao (Citations: 266, h-index: 8)

Yi Zhao (Citations: 12, h-index: 2)

Yajuan Peng (Citations: 12, h-index: 2)

Xiaoliang Wang (Citations: 7, h-index: 2)

Xiaoming Fu (Citations: 6, h-index: 2)

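Large Reasoning Models (LRMs) perform well on complex tasks through extended chains of thought, but their autoregressive reasoning incurs high inference latency. Recent work explores using Small Reasoning Models (SRMs) to speed up inference. In this paper, we systematically analyze the capability limits of SRMs and identify three common reasoning risks: (1) path divergence, where the SRM lacks the strategic ability to form an initial plan, so its reasoning drifts from the most probable path; (2) cognitive overload, where the SRM fails to solve particularly difficult steps; and (3) recovery inability, where the SRM lacks robust self-reflection and error-correction mechanisms and therefore struggles to recover once something goes wrong. To address these problems, we propose TrigReason, a trigger-based framework that coordinates reasoning through selective intervention instead of continuous checking. TrigReason delegates most reasoning to the SRM and activates the LRM only during initial strategic planning (strategic priming trigger), when overconfidence is detected (cognitive offload trigger), or when reasoning falls into an unproductive loop (intervention request trigger). On the AIME24, AIME25, and GPQA-D datasets, TrigReason matches the accuracy of the full LRM and of SpecReason while delegating 1.70x to 4.79x more reasoning steps to the SRM. In an edge-cloud setting, TrigReason also reduces latency by 43.9% and API cost by 73.3%. Our code is available at [https://github.com/QQQ-yi/TrigReason](https://github.com/QQQ-yi/TrigReason)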

Original Abstract

Large Reasoning Models (LRMs) achieve strong performance on complex tasks through extended chains of thought but suffer from high inference latency due to autoregressive reasoning. Recent work explores using Small Reasoning Models (SRMs) to accelerate LRM inference. In this paper, we systematically characterize the capability boundaries of SRMs and identify three common types of reasoning risks: (1) path divergence, where SRMs lack the strategic ability to construct an initial plan, causing reasoning to deviate from the most probable path; (2) cognitive overload, where SRMs fail to solve particularly difficult steps; and (3) recovery inability, where SRMs lack robust self-reflection and error correction mechanisms. To address these challenges, we propose TrigReason, a trigger-based collaborative reasoning framework that replaces continuous polling with selective intervention. TrigReason delegates most reasoning to the SRM and activates LRM intervention only when necessary: during initial strategic planning (strategic priming trigger), upon detecting extraordinary overconfidence (cognitive offload trigger), or when reasoning falls into unproductive loops (intervention request trigger). The evaluation results on AIME24, AIME25, and GPQA-D indicate that TrigReason matches the accuracy of full LRMs and SpecReason, while offloading 1.70x to 4.79x more reasoning steps to SRMs. Under edge-cloud conditions, TrigReason reduces latency by 43.9% and API cost by 73.3%. Our code is available at [https://github.com/QQQ-yi/TrigReason](https://github.com/QQQ-yi/TrigReason)
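
The three triggers named in the abstract can be pictured as a simple control loop. The Python sketch below is illustrative only and is not taken from the TrigReason repository: the abstract does not say how each trigger is detected, so the per-step confidence score, the `overconfidence` threshold, the verbatim-repeat loop check, and every name (`collaborate`, `TriggerConfig`, `toy_srm`, `toy_lrm`) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interfaces: each model maps the steps so far to a next step.
SrmStep = Callable[[List[str]], Tuple[str, float]]  # (step text, confidence in [0, 1])
LrmStep = Callable[[List[str]], str]


@dataclass
class TriggerConfig:
    priming_steps: int = 1        # assumed: the LRM writes the first step(s)
    overconfidence: float = 0.99  # assumed threshold for the cognitive offload trigger
    loop_window: int = 3          # assumed: number of recent steps checked for repeats
    max_steps: int = 20
    stop_token: str = "[DONE]"


@dataclass
class Trace:
    steps: List[str] = field(default_factory=list)
    authors: List[str] = field(default_factory=list)  # "SRM" or "LRM" for each step


def collaborate(srm: SrmStep, lrm: LrmStep, cfg: TriggerConfig) -> Trace:
    """Let the SRM write steps; hand a step to the LRM only when a trigger fires."""
    trace = Trace()
    while len(trace.steps) < cfg.max_steps:
        if len(trace.steps) < cfg.priming_steps:
            # Strategic priming trigger: the LRM produces the initial plan.
            step, author = lrm(trace.steps), "LRM"
        else:
            step, confidence = srm(trace.steps)
            author = "SRM"
            # Cognitive offload trigger (assumed signal: very high confidence).
            overconfident = confidence >= cfg.overconfidence
            # Intervention request trigger (assumed signal: verbatim repeated step).
            looping = step in trace.steps[-cfg.loop_window:]
            if overconfident or looping:
                step, author = lrm(trace.steps), "LRM"
        trace.steps.append(step)
        trace.authors.append(author)
        if step.endswith(cfg.stop_token):
            break
    return trace


def toy_srm(steps: List[str]) -> Tuple[str, float]:
    """Stand-in SRM that is overconfident at step 2 and repeats itself at step 4."""
    n = len(steps)
    if n == 2:
        return "srm step 2", 0.999
    if n == 4:
        return steps[-1], 0.5
    if n == 5:
        return "answer [DONE]", 0.6
    return f"srm step {n}", 0.6


def toy_lrm(steps: List[str]) -> str:
    """Stand-in LRM that simply labels the step it was asked to write."""
    return f"lrm step {len(steps)}"


trace = collaborate(toy_srm, toy_lrm, TriggerConfig())
offloaded = trace.authors.count("SRM") / len(trace.authors)
print(trace.authors, f"SRM share: {offloaded:.2f}")
```

In this toy run the LRM writes only the steps where a trigger fires, and the fraction of steps kept on the SRM is the kind of quantity the abstract's "offloading" comparison refers to. A real system would replace the stand-in models with actual SRM and LRM calls and would need the paper's own trigger definitions.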
