Custom CRISPR-Cas9 PAM variants via scalable engineering and machine learning.

Nature
Authors
Abstract

Engineering and characterizing proteins can be time-consuming and cumbersome, motivating the development of generalist CRISPR-Cas enzymes to enable diverse genome editing applications. However, such enzymes have caveats such as an increased risk of off-target editing. To enable scalable reprogramming of Cas9 enzymes, here we combined high-throughput protein engineering with machine learning (ML) to derive bespoke editors more uniquely suited to specific targets. Via structure/function-informed saturation mutagenesis and bacterial selections, we obtained nearly 1,000 engineered SpCas9 enzymes and characterized their protospacer-adjacent motif (PAM) requirements to train a neural network that relates amino acid sequence to PAM specificity. By utilizing the resulting PAM ML algorithm (PAMmla) to predict the PAMs of 64 million SpCas9 enzymes, we identified efficacious and specific enzymes that outperform evolution-based and engineered SpCas9 enzymes as nucleases and base editors in human cells while reducing off-targets. An in silico directed evolution method enables user-directed Cas9 enzyme design, including for allele-selective targeting of the RHO P23H allele in human cells and mice. Together, PAMmla integrates ML and protein engineering to curate a catalog of SpCas9 enzymes with distinct PAM requirements, and motivates the use of efficient and safe bespoke Cas9 enzymes instead of generalist enzymes for various applications.
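As a rough illustration of the scale mentioned in the abstract, the sketch below enumerates a combinatorial variant library and one-hot encodes each variant, a common way to present amino acid sequences to a neural network. It rests on assumptions the abstract does not state: that the 64 million figure corresponds to all 20 canonical amino acids at six mutagenized positions (20^6 = 64,000,000), and that the model input is one-hot encoded. The names `library_size`, `one_hot`, and `enumerate_variants` are hypothetical and are not part of PAMmla.

```python
from itertools import product

# The 20 canonical amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_positions: int) -> int:
    """Number of variants when every position may hold any canonical residue."""
    return len(AMINO_ACIDS) ** n_positions


def one_hot(variant: str) -> list[int]:
    """Flatten a variant into a 0/1 vector with 20 entries per position."""
    vec: list[int] = []
    for residue in variant:
        row = [0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1  # raises ValueError on unknown residues
        vec.extend(row)
    return vec


def enumerate_variants(n_positions: int):
    """Lazily yield every residue combination as a string (assumed library layout)."""
    for combo in product(AMINO_ACIDS, repeat=n_positions):
        yield "".join(combo)


if __name__ == "__main__":
    # Assumption: six mutagenized positions, which matches the 64 million figure.
    print(library_size(6))
    # A hypothetical six-residue variant encoded as a 120-element vector.
    print(len(one_hot("DSGERT")))
```

Because `enumerate_variants` is a generator, the full library never has to be held in memory; variants can be encoded and scored in batches by whatever predictor is available.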

Year of Publication
2025
Journal
Nature
Date Published
04/2025
ISSN
1476-4687
DOI
10.1038/s41586-025-09021-y
PubMed ID
40262634