Improved Allele Frequencies in gnomAD through Local Ancestry Inference.

bioRxiv : the preprint server for biology
Abstract

The Genome Aggregation Database (gnomAD) is a foundational resource for allele frequency data, widely used in genomic research and clinical interpretation. However, traditional estimates rely on individual-level genetic ancestry groupings that may obscure variation in recently admixed populations. To improve resolution, we applied local ancestry inference (LAI) to over 27 million variants in two admixed groups: Admixed American (n = 7,612) and African/African American (n = 20,250), deriving ancestry-specific allele frequencies. We show that 78.5% and 85.1% of variants in these groups, respectively, exhibit at least a twofold difference in ancestry-specific frequencies. Moreover, 81.49% of variants with LAI information would be assigned a higher gnomAD-wide maximum frequency after incorporating LAI, potentially altering clinical interpretations. This LAI-informed release reveals clinically relevant frequency differences that are masked in aggregate estimates and may support reclassifying some variants from Uncertain Significance to Benign or Likely Benign.
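The arithmetic behind ancestry-specific frequencies can be illustrated with a minimal sketch. It is not the gnomAD pipeline: the class and function names, the ancestry labels, the allele counts, and the handling of zero frequencies are all assumptions made for illustration. The sketch assumes that LAI has already partitioned each variant's allele counts by the inferred ancestry of the haplotype tract carrying them, then compares the per-ancestry frequencies with the aggregate frequency for the group.

```python
from dataclasses import dataclass


@dataclass
class TractCounts:
    """Allele counts restricted to haplotype tracts of one inferred ancestry."""
    ac: int  # alternate-allele count observed on tracts of this ancestry
    an: int  # total alleles called on tracts of this ancestry


def allele_frequency(ac: int, an: int) -> float:
    """AC / AN, defined as 0.0 when no alleles were called (an assumption)."""
    return ac / an if an > 0 else 0.0


def ancestry_specific_frequencies(counts: dict[str, TractCounts]) -> dict[str, float]:
    """One frequency per local-ancestry component."""
    return {anc: allele_frequency(c.ac, c.an) for anc, c in counts.items()}


def aggregate_frequency(counts: dict[str, TractCounts]) -> float:
    """Frequency for the whole group, ignoring local ancestry."""
    total_ac = sum(c.ac for c in counts.values())
    total_an = sum(c.an for c in counts.values())
    return allele_frequency(total_ac, total_an)


def has_fold_difference(freqs: dict[str, float], fold: float = 2.0) -> bool:
    """True if the highest frequency is at least `fold` times the lowest.

    Assumption: a variant seen on one ancestry and absent on another
    counts as exceeding the threshold.
    """
    lowest, highest = min(freqs.values()), max(freqs.values())
    if highest == 0.0:
        return False
    if lowest == 0.0:
        return True
    return highest / lowest >= fold


# Hypothetical counts for one variant in one admixed group.
example = {
    "ancestry_A": TractCounts(ac=30, an=1000),
    "ancestry_B": TractCounts(ac=2, an=1000),
}

freqs = ancestry_specific_frequencies(example)
group_af = aggregate_frequency(example)
max_af = max(group_af, *freqs.values())

print(freqs, group_af, max_af, has_fold_difference(freqs))
```

Because the aggregate frequency is an AN-weighted average of the ancestry-specific frequencies, the highest ancestry-specific value can never fall below it. In this sketch, that is why adding LAI-derived frequencies can raise a variant's maximum frequency but never lower it.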

Year of Publication
2025
Journal
bioRxiv : the preprint server for biology
Date Published
06/2025
ISSN
2692-8205
DOI
10.1101/2024.10.30.620961
PubMed ID
40661606