2602.12164v1 Feb 12, 2026 cs.AI

Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

Songtao Huang (Citations: 238, h-index: 4)

Lei Bai (Citations: 8, h-index: 2)

Bin Wang (Citations: 9, h-index: 2)

Shiyang Feng (Citations: 139, h-index: 6)

Xiaohan He (Citations: 60, h-index: 4)

Bo Zhang (Citations: 14, h-index: 2)

Abstract

Large language models (LLMs) have demonstrated exceptional reasoning capabilities, and co-evolving paradigms have shown promising results in domains such as code and math. However, in scientific reasoning tasks, these models remain fragile due to unreliable solution evaluation and limited diversity in verification strategies. In this work, we propose Sci-CoE, a two-stage scientific co-evolving framework that enables a model to self-evolve as both Solver and Verifier through a transition from sparse supervision to unsupervised learning. In the first stage, the model uses a small set of annotated data to establish fundamental correctness judgment anchors for the Verifier. In the second stage, we introduce a geometric reward mechanism that jointly considers consensus, reliability, and diversity, driving large-scale self-iteration on unlabeled data. Experiments on several general scientific benchmarks demonstrate that Sci-CoE enhances complex reasoning capabilities and exhibits strong scalability, facilitating the construction of more robust and diverse evaluation systems. Code is available at https://github.com/InternScience/Sci-CoE.
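
The abstract does not say how consensus, reliability, and diversity are combined into a single reward. As a purely illustrative sketch, and not the authors' method, one simple way to make a reward depend jointly on three scores is a geometric mean, which collapses toward zero whenever any one score does. The function name `combined_reward`, the assumption that each score lies in [0, 1], and the `eps` floor are all hypothetical choices made for this example.

```python
import math


def combined_reward(consensus: float, reliability: float, diversity: float,
                    eps: float = 1e-8) -> float:
    """Geometric mean of three scores in [0, 1].

    Illustrative assumption only: the abstract does not define the actual
    Sci-CoE reward. `eps` is a hypothetical floor that avoids log(0).
    """
    scores = (consensus, reliability, diversity)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score {s} is outside [0, 1]")
    # Average in log space, then map back: exp(mean(log(s_i))).
    log_mean = sum(math.log(max(s, eps)) for s in scores) / len(scores)
    return math.exp(log_mean)


if __name__ == "__main__":
    # Balanced scores give a reward equal to their common value.
    print(combined_reward(0.5, 0.5, 0.5))
    # One near-zero score dominates, unlike an arithmetic mean.
    print(combined_reward(0.9, 0.9, 0.0))
```

The point of the multiplicative form in this toy version is that a candidate cannot score well by maximizing one criterion while ignoring another; whether Sci-CoE uses anything like it would need to be checked against the paper or the linked repository.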

