Ancestry-specific performance of variant effect predictors in clinical variant classification.
| Authors | |
| Abstract | Predicting the effects of genetic variants and assessing prediction performance are key computational tasks in genomic medicine. It has been shown that well-calibrated variant effect predictors can be reliably used as evidence towards establishing pathogenicity (or benignity) of missense variants, thereby rendering these variants suitable for use in (or exclusion from) the genetic diagnosis of rare Mendelian conditions. However, most predictors have been trained or calibrated on data that may not be sufficiently representative to lead to similar performance across all genetic ancestries. This raises questions about the responsible deployment of these tools to improve human health. To better understand the utility of computational predictors, we set out to assess their ancestry-specific performance in terms of accuracy and evidence strength according to the ACMG/AMP guidelines. First, we determined that the expected count of rare variants in an individual's genome and the allele frequency distribution of these variants are the key confounders when evaluating a predictor's performance across different genetic ancestries. Second, we found that a predictor's accuracy itself inversely correlates with the allele frequency of the rare variant. After stratifying according to allele frequency, we show that established methods for predicting the pathogenicity of missense variants have comparable performance levels across major ancestry groups. Our results therefore support the wide deployment of such models in the context of genetic diagnosis and related applications. |
| Year of Publication | 2026 |
| Journal | bioRxiv : the preprint server for biology |
| Date Published | 02/2026 |
| ISSN | 2692-8205 |
| DOI | 10.64898/2026.02.14.705914 |
| PubMed ID | 41756911 |