PMCID
PMC13001423

Explainable Physicochemical Determinants of Protein-Ligand Binding via Non-Covalent Interactions.

bioRxiv : the preprint server for biology
Authors
Abstract

Protein-ligand binding (PLB) underlies a broad range of biological and chemical processes, including enzymatic catalysis, metabolic regulation, molecular recognition, and therapeutic modulation. These processes are governed by structured constellations of non-covalent interactions (such as hydrogen bonding, electrostatic interactions, hydrophobic contacts, and aromatic stacking) that determine binding affinity and functional specificity. However, many AI-based models prioritize predictive performance, with limited emphasis on mechanistic interpretation of the underlying interactions. As a result, existing approaches often function as black-box predictors, limiting their utility for mechanistic reasoning, functional modulation, and generalization to novel protein-ligand pairs. Here, we introduce ExplainBind, an interaction-aware sequence-based framework that jointly predicts protein-ligand binding likelihood and learns residue-level interaction patterns grounded in physical non-covalent interactions. To enable explicit supervision, we construct a curated database of protein-ligand complexes from which residue-atom interaction maps are systematically extracted. ExplainBind aligns token-level cross-attention with these interaction maps, enabling physically grounded and mechanistically interpretable predictions without requiring explicit three-dimensional inputs at inference time. Across extensive in-distribution and out-of-distribution evaluations, ExplainBind consistently outperforms existing sequence-based baselines and demonstrates improved robustness to protein and ligand distribution shifts. Quantitative analyses and structural case studies show that the learned interaction maps accurately localize binding pockets and recover known interaction motifs.
We further validate the framework by effectively ranking highly potent inhibitors for angiotensin-converting enzyme (ACE) and discovering both inhibitors and activators of the metabolic enzyme L-2-hydroxyglutarate dehydrogenase (L2HGDH), illustrating how ExplainBind supports functional modulation beyond binary binding prediction. Together, these results establish ExplainBind as a scalable and interpretable paradigm for protein-ligand binding prediction across drug discovery, enzyme engineering, and broader molecular design applications.

Year of Publication
2026
Journal
bioRxiv : the preprint server for biology
Date Published
03/2026
ISSN
2692-8205
DOI
10.64898/2026.03.03.707476
PubMed ID
41867731