Journal of Chemical Information and Modeling Vol. 65 No. 18

Fast, Comprehensive, and User Customizable Macromolecule Interface Analysis with FACE2FACE

Patrizio Di Micco*, Mario Incarnato, Gianmarco Pascarella, Allegra Via, and Veronica Morea*

Journal of Chemical Information and Modeling 2025, 65, 18, 9371-9377 (Application Note)

Publication Date (Web):July 30, 2025

Precedent Finder: Locating Pareto-Optimal Reactions

Christoph A. Bauer*, Thierry Kogej, Samuel Genheden, and Per-Ola Norrby

Journal of Chemical Information and Modeling 2025, 65, 18, 9378-9382 (Application Note)

Publication Date (Web):September 10, 2025

In Search of Beautiful Molecules: A Perspective on Generative Modeling for Drug Design

Remco L. van den Broek, Shivam Patel, Gerard J. P. van Westen, Willem Jespers*, and Woody Sherman*

Journal of Chemical Information and Modeling 2025, 65, 18, 9383-9397 (Perspective)

Publication Date (Web):September 2, 2025

ABSTRACT

Generative modeling with artificial intelligence (GenAI) offers an emerging approach to discovering novel, efficacious, and safe drugs by enabling the systematic exploration of chemical space and the design of molecules that are synthesizable while also having desirable drug properties. However, despite rapid progress in other industries, GenAI has yet to demonstrate clear and consistent value in prospective drug discovery applications. In this Perspective, we argue that the ultimate goal of generative chemistry is not just to generate “new” or “interesting” molecules, but to generate “beautiful” molecules: those that are therapeutically aligned with the program objectives and bring value beyond traditional approaches. We focus on five essential considerations for the successful application of GenAI for drug discovery (GADD): 1) chemical synthesizability (accounting for time/cost constraints); 2) favorable ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties; 3) desirable target-specific binding to modulate the biological mechanism of interest; 4) the construction of appropriate multiparameter optimization (MPO) functions to drive the GenAI toward the project objectives; and 5) human feedback from experienced drug hunters. Defining the beauty of a molecule in a drug discovery program is not always obvious: it is context-dependent and changes as data emerge and priorities shift, which makes expert human input indispensable. While MPO frameworks using complex desirability functions or Pareto optimization can help operationalize multifaceted project objectives, they cannot yet fully capture the nuanced judgment of experienced drug hunters. Reinforcement learning from human feedback (RLHF) offers a path to guide the GenAI toward therapeutically aligned molecules, just as RLHF played a pivotal role in training large language models (LLMs) like ChatGPT, especially in aligning the model’s behavior with human expectations. While not responsible for the model’s base knowledge, RLHF is essential in shaping how the model responds. In addition to RLHF, future progress in GADD will depend on better property prediction models and explainable systems that provide insights to expert drug hunters. “Beauty is in the eyes of the beholder”; for drug discovery, beauty is judged by experienced drug hunters and clinical success.
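
The two MPO strategies named in the abstract above, desirability functions and Pareto optimization, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the candidate names, the three objective scores (imagined as synthesizability, ADMET, and binding, each scaled to [0, 1] with higher being better), and the weighted geometric mean used as the desirability function. It is not the authors' method.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Names of candidates that no other candidate dominates (higher is better)."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(other, s) for other in scores.values())
    )


def desirability(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted geometric mean: one score of zero drives the total to zero."""
    total = sum(weights)
    result = 1.0
    for s, w in zip(scores, weights):
        result *= s ** (w / total)
    return result


# Hypothetical candidates: (synthesizability, ADMET, binding), all made up.
candidates = {
    "mol_A": (0.9, 0.4, 0.7),
    "mol_B": (0.6, 0.8, 0.6),
    "mol_C": (0.5, 0.3, 0.5),  # worse than mol_A on every objective
    "mol_D": (0.9, 0.4, 0.6),  # ties mol_A twice, loses once
}

front = pareto_front(candidates)
ranked = sorted(candidates, key=lambda n: desirability(candidates[n], (1, 1, 1)), reverse=True)
```

The Pareto front keeps every non-dominated trade-off, while the desirability function collapses the objectives into one ranking whose order depends on the chosen weights. That dependence is one place where expert judgment enters.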

Practically Significant Method Comparison Protocols for Machine Learning in Small Molecule Drug Discovery

Jeremy R. Ash, Cas Wognum*, Raquel Rodríguez-Pérez, Matteo Aldeghi, Alan C. Cheng, Djork-Arné Clevert, Ola Engkvist, Cheng Fang, Daniel J. Price, Jacqueline M. Hughes-Oliver, and W. Patrick Walters

Journal of Chemical Information and Modeling 2025, 65, 18, 9398-9411 (Perspective)

Publication Date (Web):September 11, 2025

3D Spatial Learning for Adsorption Energy Prediction in Multi-Temporal Solution Systems: The MTSS Data Set and a GCN-Based Network

Lanqi Li, Rui Luo, Xiaolu Chen, Huapeng Wei, Wenming Zhang, Qiang Lu, Weiming Dong, Jianmei Lu*, Bing Zhang*, and Fan Tang*

Journal of Chemical Information and Modeling 2025, 65, 18, 9412-9424 (Machine Learning and Deep Learning)

Publication Date (Web):September 3, 2025

PhyCysID: Plant Cystatin Protein Prediction by an Artificial Intelligence Approach

Sadaf Aqil, Isabel C. Cadavid, Nureyev F. Rodrigues, Natalia Balbinott, Geancarlo Zanatta, and Rogerio Margis*

Journal of Chemical Information and Modeling 2025, 65, 18, 9425-9434 (Machine Learning and Deep Learning)

Publication Date (Web):September 4, 2025

Leveraging Language Model, Crystal Structure Prediction and First-Principles Calculation for Material Design

Lei Zhang*, Ben Ni, Kaiyang Xu, Yiru Huang, Qingfang Li, and Lifeng Liu*

Journal of Chemical Information and Modeling 2025, 65, 18, 9435-9442 (Machine Learning and Deep Learning)

Publication Date (Web):September 10, 2025

KGG: Knowledge-Guided Graph Self-Supervised Learning to Enhance Molecular Property Predictions

Van-Thinh To, Phuoc-Chung Van Nguyen, Gia-Bao Truong, Tuyet-Minh Phan, Tieu-Long Phan*, Rolf Fagerberg, Peter F. Stadler, and Tuyen Ngoc Truong*

Journal of Chemical Information and Modeling 2025, 65, 18, 9443-9458 (Machine Learning and Deep Learning)

Publication Date (Web):September 7, 2025

Evolutionary Constraints Guide AlphaFold2 in Predicting Alternative Conformations and Inform Rational Mutation Design

Valerio Piomponi*, Alberto Cazzaniga, and Francesca Cuturello*

Journal of Chemical Information and Modeling 2025, 65, 18, 9459-9468 (Machine Learning and Deep Learning)

Publication Date (Web):September 3, 2025

Physicochemical Property Models for Poly- and Perfluorinated Alkyl Substances and Other Chemical Classes

Todd M. Martin*, Landon R. Batts, Nathaniel Charest, Charles N. Lowe, Gabriel Sinclair, and Antony J. Williams

Journal of Chemical Information and Modeling 2025, 65, 18, 9469-9482 (Machine Learning and Deep Learning)

Publication Date (Web):September 8, 2025

Transferable Neural Network Potentials and Condensed Phase Properties

Anna Katharina Picha, Marcus Wieder, and Stefan Boresch*

Journal of Chemical Information and Modeling 2025, 65, 18, 9483-9496 (Machine Learning and Deep Learning)

Publication Date (Web):September 11, 2025

Predicting HOMO–LUMO Gaps Using Hartree–Fock Calculated Data and Machine Learning Models

Md Mehedi Hasan, Omid Tarkhaneh, Sharene D. Bungay, Raymond A. Poirier, and Shahidul M. Islam*

Journal of Chemical Information and Modeling 2025, 65, 18, 9497-9515 (Machine Learning and Deep Learning)

Publication Date (Web):September 10, 2025

ABSTRACT

The calculation of the highest occupied molecular orbital–lowest unoccupied molecular orbital (HOMO–LUMO) gap for chemical molecules is computationally intensive using quantum mechanics (QM) methods, while experimental determination is often costly and time-consuming. Machine learning (ML) offers a cost-effective and rapid alternative, enabling efficient predictions of HOMO–LUMO gap values across large data sets without the need for extensive QM computations or experiments. ML models facilitate the screening of diverse molecules, providing valuable insights into complex chemical spaces and integrating seamlessly into high-throughput workflows to prioritize candidates for experimental validation. In this study, we leveraged a data set of HOMO–LUMO gap values for small molecules obtained through Hartree–Fock (HF) calculations and developed ML models to predict HOMO–LUMO energy gaps for organic molecules. Molecular descriptors generated from Simplified Molecular Input Line Entry System (SMILES) representations using RDKit were used as input features to train various regression-based ML models. The data set included 46,717 small molecules with carbon chain number ranging from 1 to 8. Among the tested models, the LightGBM regressor, Bidirectional LSTM, CatBoost regressor, and Multilayer Perceptron (MLP) achieved mean absolute error (MAE) values below 0.25 eV. Further improvement was achieved by creating a weighted ensemble model combining the LightGBM regressor, Bidirectional LSTM, and MLP, which reached an MAE of 0.1660 eV. This ensemble model outperformed the others across various data sets, with the LightGBM regressor showing better performance for predicting the HOMO–LUMO gap of saturated linear molecules. SHAP analysis identified 20 molecular descriptors critical for accurate predictions. Additionally, the models were empirically adapted to estimate experimental HOMO–LUMO gap values for both small and large molecules (up to carbon number 50), demonstrating their versatility and practical applicability.
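
The weighted ensemble and the MAE metric mentioned in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: the three prediction lists stand in for the outputs of already-trained models, and the gap values and weights are invented for the example.

```python
from typing import List, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_ensemble(predictions: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Per-sample weighted average of several models' predictions."""
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Made-up reference gaps (eV) and made-up predictions from three hypothetical models.
y_true = [10.0, 12.0, 8.0]
model_a = [10.4, 11.6, 8.4]
model_b = [9.6, 12.4, 7.6]
model_c = [10.2, 12.2, 8.2]
weights = (0.4, 0.4, 0.2)  # hypothetical; in practice tuned on a validation set

ensemble = weighted_ensemble([model_a, model_b, model_c], weights)
individual_maes = [mean_absolute_error(y_true, m) for m in (model_a, model_b, model_c)]
ensemble_mae = mean_absolute_error(y_true, ensemble)
```

In this toy case the first two models err in opposite directions, so averaging cancels most of the error. An ensemble helps only to the extent that its members make partly uncorrelated mistakes.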

Can Reasoning Power Significantly Improve the Knowledge of Large Language Models for Chemistry?─Based on Conversations with LLMs

Dong-Xu Cui, Shi-Yu Long, Yi-Xuan Tang, Yue Zhao, and Qiao Li*

Journal of Chemical Information and Modeling 2025, 65, 18, 9516-9527 (Chemical Information)

Publication Date (Web):August 25, 2025

Conformational Dynamics of hAgo2 Silencing: Decoding Functional Divergence across Human Argonaute Paralogs

Antonella Paladino*, Andrea Catte, Jorge Franco, Elisabetta Moroni, and Silvia Rinaldi*

Journal of Chemical Information and Modeling 2025, 65, 18, 9528-9540 (Computational Chemistry)

Publication Date (Web):June 5, 2025

Data-Driven Generation of Conformational Ensembles and Ternary Complexes for PROTAC and Other Chimera Systems

Fabio Montisci, Laura Friggeri, Kepa K. Burusco-Goni, Patrick McCabe, Bojana Popovic, and Jason C. Cole*

Journal of Chemical Information and Modeling 2025, 65, 18, 9541-9556 (Computational Chemistry)

Publication Date (Web):September 5, 2025

Decoding BCL6 Inhibitors: Computational Insights into the Impact of Water Networks on Potency

Daniella E. Hares, Andrea Scarpino, Michael S. Bodnarchuk*, and Swen Hoelder*

Journal of Chemical Information and Modeling 2025, 65, 18, 9557-9565 (Computational Chemistry)

Publication Date (Web):August 28, 2025

From AI-Driven Sequence Generation to Molecular Simulation: A Comprehensive Framework for Antimicrobial Peptide Discovery

Chunsuo Tian, Yuelei Hao, Haohao Fu*, Xueguang Shao*, and Wensheng Cai*

Journal of Chemical Information and Modeling 2025, 65, 18, 9566-9575 (Computational Chemistry)

Publication Date (Web):August 29, 2025

Chemical Space Exploration with Artificial “Mindless” Molecules

Thomas Gasevic, Marcel Müller, Jonathan Schöps, Stephanie Lanius, Jan Hermann, Stefan Grimme, and Andreas Hansen*

Journal of Chemical Information and Modeling 2025, 65, 18, 9576-9587 (Computational Chemistry)

Publication Date (Web):September 2, 2025

Exhaustive DFTB Parameterization and Its Implementation for the Exploration of Ag Nanostructures + H₂O Complexes

Paria Fallahi and Hossein Farrokhpour*

Journal of Chemical Information and Modeling 2025, 65, 18, 9588-9609 (Computational Chemistry)

Publication Date (Web):August 28, 2025

Characterization of Protein–Ligand Chalcogen Bonds: Insights from Database Survey and Quantum Mechanics Calculations

Wenhao Cai, Ziyue Li, Wangchen Zhou, Hongli Chen, Yungen Xu*, Qihua Zhu*, and Yi Zou*

Journal of Chemical Information and Modeling 2025, 65, 18, 9610-9622 (Computational Chemistry)

Publication Date (Web):August 29, 2025

A Machine Learning Model for the Proteome-Wide Prediction of Lipid-Interacting Proteins

Jonathan Chiu-Chun Chou, Poulami Chatterjee, Cassandra M. Decosto, and Laura M. K. Dassama*

Journal of Chemical Information and Modeling 2025, 65, 18, 9623-9638 (Computational Biochemistry)

Publication Date (Web):September 4, 2025

E76K Mutation Promotes SHP2 Activation by Rewiring Allosteric Networks That Drives Conformational Transitions

Derui Zhao, Mengting Liu, Hui Duan, Junyao Zhu, Liquan Yang*, and Peng Sang*

Journal of Chemical Information and Modeling 2025, 65, 18, 9639-9653 (Computational Biochemistry)

Publication Date (Web):September 8, 2025

Precision in Predicting Protein–Nucleic Acid Complexes: Establishing a Benchmark Data Set and Comparative Metrics

Huizi Cui, Yuxuan Wang, Yu Fu, Xiangyu Yu, Wannan Li, Feng Lin*, and Weiwei Han*

Journal of Chemical Information and Modeling 2025, 65, 18, 9654-9671 (Computational Biochemistry)

Publication Date (Web):September 11, 2025

PAZ Domain Pivoting is the Rate-Limiting Step for Target DNA Recognition in the Middle Region of Thermus thermophilus Argonaute

Jinchu Liu, Kun Xi, and Lizhe Zhu*

Journal of Chemical Information and Modeling 2025, 65, 18, 9672-9683 (Computational Biochemistry)

Publication Date (Web):September 10, 2025

Molecular Dynamics Simulations Reveal Conformational Determinants of the Dynamic Association between α-Synuclein and Membranes

Jiahui Huang and Cong Guo*

Journal of Chemical Information and Modeling 2025, 65, 18, 9684-9696 (Computational Biochemistry)

Publication Date (Web):August 28, 2025

Prediction of Activity and Selectivity Profiles of Sigma Receptor Ligands Using Machine Learning Approaches

Lisa Lombardo, Verena Battisti, Thierry Langer, Rosaria Gitto, and Laura De Luca*

Journal of Chemical Information and Modeling 2025, 65, 18, 9697-9712 (Pharmaceutical Modeling)

Publication Date (Web):September 1, 2025

Virtual Compound Screening for Discovery of Dopamine D1 Receptor Biased Allosteric Modulators

Yang Zhou, William C. Wetsel, Steven H. Olson*, and Lawrence S. Barak*

Journal of Chemical Information and Modeling 2025, 65, 18, 9713-9722 (Pharmaceutical Modeling)

Publication Date (Web):September 11, 2025

Identification of a Novel Core Structure of Apo-Ido1 Inhibitors Through Virtual Screening and Preliminary Hit Optimization

Yekui Yin, Meiqi He, Jianda Yue, Yaqi Li, Jiuxi Peng, Xiao Luo, Zhenyu Wang, Xiao He, Songping Liang, Zhonghua Liu*, and Ying Wang*

Journal of Chemical Information and Modeling 2025, 65, 18, 9723-9737 (Pharmaceutical Modeling)

Publication Date (Web):September 5, 2025

On-the-Fly Sequential Design of Simple Peptides

Francesco Coppola and Petr Král*

Journal of Chemical Information and Modeling 2025, 65, 18, 9738-9746 (Pharmaceutical Modeling)

Publication Date (Web):September 11, 2025

Water-Based Pharmacophore Modeling in Kinase Inhibitor Design: A Case Study on Fyn and Lyn Protein Kinases

Martin Ljubič, Marija Sollner Dolenc, Jure Borišek*, and Andrej Perdih*

Journal of Chemical Information and Modeling 2025, 65, 18, 9747-9761 (Pharmaceutical Modeling)

Publication Date (Web):August 31, 2025

Multiview Deep Learning Framework for Precise Prediction of Transcription Factor Binding Sites

Yiben Lin, Huiliang Luo, Liang Yan, Changmiao Wang, Yao Li*, and Ruiquan Ge*

Journal of Chemical Information and Modeling 2025, 65, 18, 9762-9776 (Bioinformatics)

Publication Date (Web):September 7, 2025

ABSTRACT

Transcription factors (TFs) are essential proteins that regulate gene expression by specifically binding to transcription factor binding sites (TFBSs) within DNA sequences. Their ability to precisely control the transcription process is crucial for understanding gene regulatory networks, uncovering disease mechanisms, and designing synthetic biology tools. Accurate TFBS prediction, therefore, holds significant importance in advancing these areas of research. While machine learning methods, particularly deep learning approaches, have achieved notable progress in TFBS prediction in recent years, several challenges persist. These include modeling the intricate structural features of the DNA double helix, capturing long-range dependencies within sequences, and integrating diverse biological data sources. To address these issues, we propose multiview deep learning for Transcription Factor Binding Prediction (MDNet-TFP), which leverages multiple views of DNA sequences (including different representational forms and diverse processing strategies) to enhance prediction capabilities. Specifically, our framework introduces a bidirectional reverse complement module (BiRC-Mamba) that effectively accounts for the bidirectional and reverse complement properties characteristic of DNA sequences. Furthermore, we developed a multiscale convolutional recurrent attention network (MCRAN) that extracts both structural and functional DNA features across multiple dimensions while integrating information from various biological data sets. These advancements allow our model to outperform existing methods across 165 ChIP-seq data sets, achieving an average ACC of 88.13% (±0.47), an ROC-AUC of 93.72% (±0.15), and a PR-AUC of 93.40% (±0.21). The model not only excels on this collection but also maintains its high performance across a wider array of 690 ChIP-seq data sets. To further validate the model’s effectiveness, we employ motif visualization techniques. This approach reveals that the regions receiving high attention from our model align with known transcription factor binding motifs, offering valuable biological insights. Additionally, this correspondence substantiates the model’s ability to generalize and interpret complex genomic data effectively. By addressing critical limitations in the field, MDNet-TFP offers a promising new avenue for advancing research in transcriptional regulation and biomedical applications.
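
The reverse complement property that the abstract above refers to can be made concrete with a short sketch. It only shows how a forward strand and its reverse complement might be encoded as two input views; it does not reproduce BiRC-Mamba or MCRAN. The function names and the one-hot layout (A, C, G, T, with N encoded as all zeros) are assumptions for illustration.

```python
from typing import List, Tuple

# Watson-Crick base pairing; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

_ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
    "N": [0, 0, 0, 0],
}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as a list of 4-element indicator vectors."""
    return [_ONE_HOT[base] for base in seq.upper()]


def two_strand_views(seq: str) -> Tuple[List[List[int]], List[List[int]]]:
    """Return encodings of the forward strand and of its reverse complement."""
    return one_hot(seq), one_hot(reverse_complement(seq))


forward, reverse = two_strand_views("GATTACA")
```

A binding site on either strand of the double helix is the same physical site, which is why a model that sees only the forward strand can miss motifs that appear in reverse complement form.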