Explainable Physicochemical Determinants of Protein-Ligand Binding via Non-Covalent Interactions.
| Authors | |
| Abstract | Protein-ligand binding (PLB) underlies a broad range of biological and chemical processes, including enzymatic catalysis, metabolic regulation, molecular recognition, and therapeutic modulation. These processes are governed by structured constellations of non-covalent interactions, such as hydrogen bonding, electrostatic interactions, hydrophobic contacts, and aromatic stacking, that determine binding affinity and functional specificity. However, many AI-based models prioritize predictive performance, with limited emphasis on mechanistic interpretation of the underlying interactions. As a result, existing approaches often function as black-box predictors, limiting their utility for mechanistic reasoning, functional modulation, and generalization to novel protein-ligand pairs. Here, we introduce ExplainBind, an interaction-aware sequence-based framework that jointly predicts protein-ligand binding likelihood and learns residue-level interaction patterns grounded in physical non-covalent interactions. To enable explicit supervision, we construct a curated database of protein-ligand complexes from which residue-atom interaction maps are systematically extracted. ExplainBind aligns token-level cross-attention with these interaction maps, enabling physically grounded and mechanistically interpretable predictions without requiring explicit three-dimensional inputs at inference time. Across extensive in-distribution and out-of-distribution evaluations, ExplainBind consistently outperforms existing sequence-based baselines and demonstrates improved robustness to protein and ligand distribution shifts. Quantitative analyses and structural case studies show that the learned interaction maps accurately localize binding pockets and recover known interaction motifs.
We further validate the framework by effectively ranking highly potent inhibitors for angiotensin-converting enzyme (ACE) and discovering both inhibitors and activators of the metabolic enzyme L-2-hydroxyglutarate dehydrogenase (L2HGDH), illustrating how ExplainBind supports functional modulation beyond binary binding prediction. Together, these results establish ExplainBind as a scalable and interpretable paradigm for protein-ligand binding prediction across drug discovery, enzyme engineering, and broader molecular design applications. |
| Year of Publication | 2026 |
| Journal | bioRxiv : the preprint server for biology |
| Date Published | 03/2026 |
| ISSN | 2692-8205 |
| DOI | 10.64898/2026.03.03.707476 |
| PubMed ID | 41867731 |
| Links |