Wanli Ouyang
Publications
From Static Spectra to Operando Infrared Dynamics: Physics-Informed Flow Modeling and a Benchmark
The Solid Electrolyte Interphase (SEI) is critical to the performance of lithium-ion batteries, yet its analysis via operando infrared (IR) spectroscopy remains experimentally complex and expensive, limiting its accessibility for standard research facilities. To overcome this bottleneck, we formulate a novel task, Operando IR Prediction, which aims to forecast the time-resolved evolution of spectral "fingerprints" from a single static spectrum. To facilitate this, we introduce OpIRSpec-7K, the first large-scale operando dataset, comprising 7,118 high-quality samples across 10 distinct battery systems, alongside OpIRBench, a comprehensive evaluation benchmark with carefully designed protocols. To address the limitations of standard spectral, video, and sequence models in capturing voltage-driven chemical dynamics and complex compositions, we propose Aligned Bi-stream Chemical Constraint (ABCC), an end-to-end physics-aware framework. It reformulates MeanFlow and introduces a novel Chemical Flow to explicitly model reaction trajectories, employs a two-stream disentanglement mechanism to separate solvent and SEI signals, and enforces physics- and spectrum-based constraints such as mass conservation and peak shifts. ABCC significantly outperforms state-of-the-art static, sequential, and generative baselines; it even generalizes to unseen systems and enables interpretable downstream recovery of SEI formation pathways, supporting AI-driven electrochemical discovery.
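The mass-conservation constraint mentioned in the abstract can be illustrated with a minimal sketch: penalize drift in the total integrated absorbance across predicted spectral frames. The function name and the area-based formulation are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def mass_conservation_penalty(pred_spectra, wavenumbers):
    """Illustrative mass-conservation surrogate (not ABCC's actual loss).

    pred_spectra: (T, N) array of T predicted spectra over N wavenumber bins.
    wavenumbers:  (N,) array of wavenumber positions.
    Returns the mean squared deviation of each frame's integrated area
    from the initial frame's area.
    """
    # Trapezoidal integration of each spectrum over the wavenumber axis.
    widths = np.diff(wavenumbers)                              # (N-1,)
    mids = 0.5 * (pred_spectra[:, 1:] + pred_spectra[:, :-1])  # (T, N-1)
    areas = (mids * widths).sum(axis=1)                        # (T,)
    # Penalize any drift of total "mass" relative to the first frame.
    return float(np.mean((areas - areas[0]) ** 2))
```

In practice such a term would be one component of a larger training objective, weighted against the reconstruction and flow losses.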
Equivariant Evidential Deep Learning for Interatomic Potentials
Uncertainty quantification (UQ) is critical for assessing the reliability of machine learning interatomic potentials (MLIPs) in molecular dynamics (MD) simulations, identifying extrapolation regimes, and enabling uncertainty-aware workflows such as active learning for training-dataset construction. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. Evidential deep learning (EDL) provides a theoretically grounded single-model alternative that estimates both aleatoric and epistemic uncertainty in a single forward pass. However, extending evidential formulations from scalar targets to vector-valued quantities such as atomic forces introduces substantial challenges, particularly in maintaining statistical self-consistency under rotational transformations. To address this, we propose Equivariant Evidential Deep Learning for Interatomic Potentials (e²IP), a backbone-agnostic framework that jointly models atomic forces and their uncertainty by representing the uncertainty as a full 3×3 symmetric positive-definite covariance tensor that transforms equivariantly under rotations. Experiments on diverse molecular benchmarks show that e²IP provides a stronger accuracy-efficiency-reliability balance than the non-equivariant evidential baseline and the widely used ensemble method, and achieves better data efficiency through its fully equivariant architecture while retaining single-model inference efficiency.
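The equivariance requirement stated above, that forces and their covariance transform consistently under rotation, can be sketched numerically. The function name and the toy setup below are illustrative, not the paper's implementation; a vector transforms as f' = Rf, and a rank-2 covariance tensor as Σ' = RΣRᵀ.

```python
import numpy as np

def rotate_force_and_covariance(R, f, Sigma):
    """Transform a force vector and its 3x3 covariance under rotation R.

    f' = R f; Sigma' = R Sigma R^T. An equivariant evidential head must
    satisfy this rule so predicted uncertainty stays self-consistent when
    the input structure is rotated.
    """
    return R @ f, R @ Sigma @ R.T

# Toy check: the transform preserves symmetry and positive-definiteness.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + np.eye(3)           # symmetric positive definite
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])  # rotation about z
f = np.array([1.0, -2.0, 0.5])
f_rot, Sigma_rot = rotate_force_and_covariance(R, f, Sigma)
```

A non-equivariant head that predicts three independent variances cannot satisfy this rule in general, which is the self-consistency gap the abstract refers to.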
ChemLLM: A Chemical Large Language Model
Large language models (LLMs) have made impressive progress in chemistry applications; however, the community still lacks an LLM specifically designed for chemistry. The main challenges are two-fold: first, most chemical data and scientific knowledge are stored in structured databases, which limits the model's ability to sustain coherent dialogue when used directly; second, there is no objective and fair benchmark that encompasses most chemistry tasks. Here, we introduce ChemLLM, a comprehensive framework featuring the first LLM dedicated to chemistry. It also includes ChemData, a dataset specifically designed for instruction tuning, and ChemBench, a robust benchmark covering nine essential chemistry tasks. ChemLLM is adept at performing various tasks across chemical disciplines with fluid dialogue interaction. Notably, ChemLLM achieves results comparable to GPT-4 on core chemical tasks and demonstrates competitive performance with LLMs of similar size in general scenarios. ChemLLM paves a new path for exploration in chemical studies, and our method of incorporating structured chemical knowledge into dialogue systems sets a new standard for developing LLMs across scientific fields. Code, datasets, and model weights are publicly available at https://hf.co/AI4Chem