2606.10587v1 Jun 09, 2026 cs.LG

Towards Diverse Scientific Hypothesis Search with Large Language Models

Kazem Meidani
P. Shojaee
Kunyang Sun
Chandan K. Reddy
Haorui Wang
José Miguel Hernández-Lobato
T. Head-Gordon
Jiajun He
University of Cambridge
Chao Zhang
Yuanqi Du

Large language models (LLMs) are increasingly used to accelerate scientific discovery, most recently in advanced tasks such as generating valid scientific hypotheses. Yet in many discovery settings, the goal is not to identify a single best hypothesis: validation can be noisy and expensive, so scientists benefit from a set of high-quality alternative hypotheses that hedges against downstream uncertainty. Nevertheless, commonly used evolutionary search recipes tend to prioritize optimization over exploration in hypothesis generation, and the resulting selection pressure during search leads to diversity collapse. Motivated by these limitations, we formulate hypothesis search as a sampling problem, where the objective is to efficiently produce diverse, high-quality hypotheses under a fixed validation budget. Building on this perspective, we propose an evolutionary framework inspired by the classical parallel tempering algorithm that searches hypotheses at multiple temperature levels and enables principled information exchange across temperatures to improve exploration without disrupting convergence. Across domains including molecular discovery, equation discovery, and algorithm discovery, our approach consistently improves both hypothesis quality and diversity under the same validation budget and produces candidates that remain robust under more expensive downstream computational validation.
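For readers unfamiliar with the classical algorithm the abstract refers to, the following is a minimal sketch of textbook parallel tempering on a toy problem. It is not the paper's framework: the double-well `energy` function, the temperature ladder, the step size, and all function names are illustrative assumptions, and the "hypotheses" here are just real numbers. Each temperature level runs its own Metropolis chain, and adjacent levels occasionally exchange states using the standard replica-exchange acceptance rule, which is one way information can move across temperatures.

```python
import math
import random


def energy(x):
    # Toy double-well energy with minima at x = -1 and x = +1
    # (an assumption for illustration, not from the paper).
    return (x * x - 1.0) ** 2


def swap_accept_prob(e_i, e_j, t_i, t_j):
    # Standard replica-exchange rule:
    # min(1, exp((1/t_i - 1/t_j) * (e_i - e_j))).
    log_ratio = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def parallel_tempering(temps, steps, seed=0, step_size=0.5):
    # temps is assumed to be sorted from coldest to hottest.
    rng = random.Random(seed)
    states = [rng.uniform(-2.0, 2.0) for _ in temps]
    samples = []  # states visited by the coldest replica
    for _ in range(steps):
        # Local Metropolis move within each temperature level.
        for k, t in enumerate(temps):
            proposal = states[k] + rng.gauss(0.0, step_size)
            delta = energy(proposal) - energy(states[k])
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                states[k] = proposal
        # Propose one swap between a random pair of adjacent levels.
        k = rng.randrange(len(temps) - 1)
        p = swap_accept_prob(
            energy(states[k]), energy(states[k + 1]), temps[k], temps[k + 1]
        )
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])
    return samples


if __name__ == "__main__":
    cold_samples = parallel_tempering(temps=[0.05, 0.2, 0.8, 3.0], steps=2000)
    left = sum(1 for x in cold_samples if x < 0)
    print(f"cold-chain samples in left well: {left}, right well: {len(cold_samples) - left}")
```

The hot chains cross the barrier between the two wells easily, and accepted swaps hand those states down to the cold chain, which would otherwise tend to stay in a single well. In the setting the abstract describes, the local move would presumably be an LLM-driven hypothesis proposal and the energy a validation score, but that mapping is an assumption here rather than something the abstract specifies.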
