DiscoDivas: Leveraging genetic-ancestry continuum information to interpolate PRS for admixed populations.

American Journal of Human Genetics
Abstract

Genome-wide association study (GWAS) summary statistics for training and individual-level cohorts for fine-tuning are essential for constructing predictive polygenic risk score (PRS) models. However, the relatively low representation of admixed populations in both GWAS summary statistics and individual-level datasets hinders the development of PRSs and their equitable clinical translation for admixed populations. Prior work indicates that the most informative PRS model for a genetically homogeneous sample varies linearly in an ancestry continuum space. Guided by these observations, we introduce a genetic-distance-assisted PRS combination pipeline for diverse genetic ancestries (DiscoDivas) to interpolate a harmonized PRS for diverse, especially admixed, genetic ancestries. DiscoDivas combines genetic distance with multiple PRS models fine-tuned within existing samples, most of which are of a single ancestry. It provides a new approach to generating genetic-ancestry-specific PRSs when a suitably matched individual-level fine-tuning cohort is unavailable or underpowered. DiscoDivas treats genetic ancestry as a continuous variable and does not require switching between different models when calculating PRSs for different ancestries. We generated PRSs with DiscoDivas and with the current conventional method, i.e., fine-tuning multiple GWAS PRSs using samples of matched or similar genetic ancestry. DiscoDivas generated a harmonized PRS that performed comparably to or better than the conventional approach, with the greatest advantage in admixed individuals.
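
The abstract does not state how the fine-tuned models are weighted, so the following is only an illustrative sketch of distance-based interpolation, not the DiscoDivas implementation. It assumes that each fine-tuning cohort is summarized by a centroid in principal-component (PC) space, that an individual's weight for each model is inversely proportional to the Euclidean distance to that centroid, and that the per-model PRSs are already on a common scale. The function names `distance_weights` and `combine_prs` are hypothetical.

```python
import math


def distance_weights(target_pcs, cohort_centroids, eps=1e-9):
    """Return one weight per fine-tuning cohort, summing to 1.

    Assumption (not from the abstract): weights are proportional to the
    inverse Euclidean distance between the target individual and each
    cohort centroid in PC space. `eps` avoids division by zero when the
    individual sits exactly on a centroid.
    """
    inverse = [1.0 / (math.dist(target_pcs, c) + eps) for c in cohort_centroids]
    total = sum(inverse)
    return [w / total for w in inverse]


def combine_prs(prs_by_model, weights):
    """Weighted sum of per-model PRSs (assumed to be standardized)."""
    return sum(w * s for w, s in zip(weights, prs_by_model))


# Two hypothetical single-ancestry cohorts and one admixed individual
# lying a quarter of the way from the first centroid to the second.
centroids = [(0.0, 0.0), (1.0, 0.0)]
individual = (0.25, 0.0)
weights = distance_weights(individual, centroids)
combined = combine_prs([1.0, -1.0], weights)
```

With two cohorts, inverse-distance weights reduce to linear interpolation for an individual lying between the two centroids, which is in keeping with the linear variation noted in the abstract; any other weighting scheme could be swapped in without changing the combination step.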

Year of Publication
2026
Journal
American Journal of Human Genetics
Date Published
06/2026
ISSN
1537-6605
DOI
10.1016/j.ajhg.2026.05.006
PubMed ID
42235505