MIA: Nick Polizzi, Predicting small-molecule binding sites using AlphaFold2; Primer: Benjamin Fry

Nicholas F. Polizzi
Harvard Medical School

Meeting: Predicting small-molecule binding sites in proteins using the pair representation of AlphaFold2

Identification of small-molecule binding sites in proteins is an important task for drug discovery. Despite previous homology- and machine-learning-based approaches to this problem, true de novo binding-site prediction remains a challenge. Here, we use features from a pretrained neural network, the pair representation of AlphaFold2, to train a logistic regression model, AF2BIND, for accurate prediction of de novo binding sites. AF2BIND identifies binding sites without relying on homology modeling, multiple sequence alignments, or knowledge of a pocket-compatible ligand. Interpretable aspects of the model can be used to predict chemical properties of compatible ligands. We apply AF2BIND to the human proteome to produce a database that includes thousands of unseen binding sites in disease-relevant proteins. We anticipate AF2BIND will be used to focus drug discovery efforts and uncover functional sites in proteins across the tree of life.
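
As a rough illustration of the general recipe described above (frozen features from a pretrained network, followed by a logistic regression classifier), here is a minimal sketch in plain Python. It is not the AF2BIND implementation: the feature vectors are synthetic stand-ins rather than AlphaFold2 output, and the function names, feature dimension, and hyperparameters are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(X, y, lr=0.1, epochs=500, l2=1e-3):
    """Fit weights and bias by batch gradient descent on the L2-regularized log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Return the predicted probability that each residue is in a binding site."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def make_synthetic_residues(n, d, rng):
    """Synthetic stand-in for frozen per-residue features (NOT real AlphaFold2 output).

    Assumption for illustration only: "binding" residues (label 1) are shifted
    along the first two feature dimensions; all other dimensions are pure noise.
    """
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        X.append([rng.gauss(1.5 * label if j < 2 else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_residues(400, 8, rng)
X_test, y_test = make_synthetic_residues(200, 8, rng)

# The "pretrained network" is frozen; only the linear classifier is trained.
w, b = fit_logistic_regression(X_train, y_train)
probs = predict_proba(X_test, w, b)
accuracy = sum((p > 0.5) == bool(t) for p, t in zip(probs, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
print("weights:", [round(wj, 2) for wj in w])
```

Because the classifier is linear, each learned weight can be read directly as the contribution of one feature dimension to the binding score. That is one simple sense in which such a model is interpretable, though the interpretable aspects referred to in the abstract may differ.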

Benjamin Fry
Polizzi Lab
Harvard Medical School

Primer: Leveraging Deep Learning Model Embeddings for Protein Property Prediction and Design

Deep learning has transformed our ability to extract meaningful representations from protein sequences and structures. These embeddings capture rich biochemical, evolutionary, and biophysical information that can be leveraged for a wide range of downstream tasks. In this talk, I will discuss recent advances in learning and applying protein embeddings for property prediction and design. I will compare how embeddings are generated in graph neural networks like AlphaFold2 and ProteinMPNN as well as in protein language models. I will then show how these representations can be used to build accurate predictors of protein properties, with the goal of introducing the concepts necessary to understand the AF2BIND model.
