Inferring compound heterozygosity from large-scale exome sequencing data.
Authors | |
Abstract | Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n=125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency < 1×10⁻⁴). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease. |
Year of Publication | 2023
|
Journal | bioRxiv : the preprint server for biology
|
Date Published | 08/2023
|
DOI | 10.1101/2023.03.19.533370
|
PubMed ID | 36993580
|