Molecular grammars of predicted intrinsically disordered regions that span the human proteome.
| Authors | |
| Keywords | |
| Abstract | Intrinsically disordered regions (IDRs) of proteins are defined by molecular grammars. This refers to IDR-specific non-random amino acid compositions and non-random patterning of distinct pairs of amino acid types. Here, we introduce grammars inferred using NARDINI+ (GIN) as a resource that uncovers IDR-specific and IDRome-spanning grammars. Using GIN-enabled analyses, we find that specific IDR features and GIN clusters are associated with distinct biological processes, intra-cellular localization preferences, specialized molecular functions, and functionalization as assessed by cellular fitness correlations. IDRs with exceptional grammars, defined as sequences with high-scoring non-random features, are harbored in proteins and complexes that enable spatial and temporal sorting of biochemical activities within the nucleus. Overall, GIN can be used to extract sequence-function relationships of individual IDRs or clusters of IDRs, to redesign extant IDRs or design de novo IDRs, to perform evolutionary analyses through the lens of molecular grammars and GIN clusters, and to make sense of IDR-specific disease-associated mutations. |
| Year of Publication | 2025 |
| Journal | Cell |
| Date Published | 11/2025 |
| ISSN | 1097-4172 |
| DOI | 10.1016/j.cell.2025.10.019 |
| PubMed ID | 41232529 |