2606.13477v1 Jun 11, 2026 cs.LG

SupraBench: A Benchmark for Supramolecular Chemistry

Yijun Ma
Zehong Wang
Weixiang Sun
Yanfang Ye
Chuxu Zhang
Tianyi Ma
Ziming Li
Connor R. Schmidt
Matthew Webber

Supramolecular chemistry, which includes the study of non-covalent host-guest assemblies, has advanced a range of applications. However, designing host-guest systems remains time-consuming, requiring days of dry-lab verification per candidate pair. Although LLMs have emerged as a fast alternative with strong performance on molecular binding tasks, no existing benchmark systematically evaluates LLMs on host-guest reasoning across fundamental supramolecular chemistry tasks such as binding affinity prediction. To this end, we collaborate with domain experts to release SupraBench, the first supramolecular benchmark for evaluating LLMs on chemistry reasoning. Specifically, we design four fundamental tasks (binding affinity prediction, top-binder selection, solvent identification, and host-guest description), plus an auxiliary vision-based task for molecular identification. We also release SupraPMC, a curated 16M-token corpus of supramolecular chemistry articles distilled from Europe PMC, to support adaptation to the supramolecular domain. We benchmark a broad range of open and proprietary LLMs and find that they leave substantial headroom across all tasks. Domain-adaptive pretraining on SupraPMC transfers cleanly to in-distribution regression but trades off against strict letter-format output. Moreover, the difficulty profile differs sharply across task families, revealing distinct failure modes that indicate specific gaps in current supramolecular chemistry reasoning. Our source code and benchmark datasets are available at https://github.com/Tianyi-Billy-Ma/SupraBench.
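
To make the contrast between regression quality and strict letter-format output concrete, the following is a minimal sketch of how the two kinds of task might be scored. It is not taken from the SupraBench code: the option set A-D, the rule that a reply must consist of exactly one letter, the use of mean absolute error for binding affinity, and all function names are assumptions made for illustration.

```python
import re
from statistics import mean
from typing import Optional, Sequence

# Assumed strict format: the whole reply must be one option letter (A-D).
STRICT_LETTER = re.compile(r"[A-D]")


def parse_strict_letter(reply: str) -> Optional[str]:
    """Return the letter only if the stripped reply is exactly one of A-D."""
    text = reply.strip()
    return text if STRICT_LETTER.fullmatch(text) else None


def letter_accuracy(replies: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of replies equal to the gold letter; malformed replies count as wrong."""
    return mean(
        1.0 if parse_strict_letter(r) == g else 0.0
        for r, g in zip(replies, gold)
    )


def mean_absolute_error(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Average absolute gap between predicted and measured affinities (assumed metric)."""
    return mean(abs(p - m) for p, m in zip(predicted, measured))
```

Under a rule like this, a reply such as "The answer is B" is scored as incorrect even when B is the right choice, which is one way a model could improve on regression while losing points on letter-format tasks.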
