2604.13521v1 Apr 15, 2026 cs.LG

C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions

Yusuke Iwasawa (Citations: 10,187; h-index: 21)
Yutaka Matsuo (Citations: 1,364; h-index: 11)
Masanori Koyama (Citations: 29; h-index: 2)
Kohei Hayashi (Citations: 68; h-index: 5)
Kenji Kubo (Citations: 11; h-index: 2)
Shunsuke Kamiya (Citations: 18; h-index: 1)

Abstract

Neural network models with latent recurrent processing, where identical layers are recursively applied to the latent state, have gained attention as promising models for performing reasoning tasks. A strength of such models is that they enable test-time scaling, where the models can enhance their performance in the test phase without additional training. Models such as the Hierarchical Reasoning Model (HRM) and Artificial Kuramoto Oscillatory Neurons (AKOrN) can facilitate deeper reasoning by increasing the number of recurrent steps, thereby enabling the completion of challenging tasks, including Sudoku, Maze solving, and AGI benchmarks. In this work, we introduce confidence-based voting (C-voting), a test-time scaling strategy designed for recurrent models with multiple latent candidate trajectories. Initializing the latent state with multiple candidates using random variables, C-voting selects the one maximizing the average of top-1 probabilities of the predictions, reflecting the model's confidence. Additionally, it yields 4.9% higher accuracy on Sudoku-hard than the energy-based voting strategy, which is specific to models with explicit energy functions. An essential advantage of C-voting is its applicability: it can be applied to recurrent models without requiring an explicit energy function. Finally, we introduce a simple attention-based recurrent model with randomized initial values named ItrSA++, and demonstrate that when combined with C-voting, it outperforms HRM on Sudoku-extreme (95.2% vs. 55.0%) and Maze (78.6% vs. 74.5%) tasks.
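The selection rule described in the abstract can be sketched as follows: run the recurrent model from several randomly initialized latent states, score each run by the average top-1 probability of its predictions, and keep the highest-scoring run. This is a minimal sketch, not the authors' implementation. The abstract specifies only the selection criterion, so the function names (`c_voting`, `confidence`, `toy_model`), the use of softmax over per-position logits, and averaging over output positions are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Assumed output format: one row of class logits per output position
# (for example, one row per Sudoku cell).
Logits = List[List[float]]


def softmax(row: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits: Logits) -> float:
    """Average, over output positions, of the top-1 softmax probability."""
    return sum(max(softmax(row)) for row in logits) / len(logits)


def c_voting(
    run_model: Callable[[random.Random], Logits],
    num_candidates: int,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Keep the candidate run with the highest average top-1 probability.

    `run_model` is a hypothetical stand-in for a recurrent model: it should
    draw a random initial latent state from the given RNG, apply its
    recurrent steps, and return the final logits.
    """
    rng = random.Random(seed)
    best_logits, best_conf = None, -1.0
    for _ in range(num_candidates):
        logits = run_model(rng)
        conf = confidence(logits)
        if conf > best_conf:
            best_logits, best_conf = logits, conf
    # Decode the winning candidate by taking the argmax class per position.
    prediction = [max(range(len(row)), key=row.__getitem__) for row in best_logits]
    return prediction, best_conf


def toy_model(rng: random.Random) -> Logits:
    """Toy stand-in (not a real model): random logits of random sharpness."""
    scale = rng.uniform(0.1, 5.0)
    return [[scale * rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]


if __name__ == "__main__":
    pred, conf = c_voting(toy_model, num_candidates=8)
    print(pred, round(conf, 3))
```

Because the score is computed from the model's own output probabilities, nothing in this sketch needs an energy function, which matches the applicability claim in the abstract.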
