Comprehensive variation discovery in single human genomes.
| Authors | |
| Keywords | |
| Abstract | Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome. |
| Year of Publication | 2014 |
| Journal | Nat Genet |
| Volume | 46 |
| Issue | 12 |
| Pages | 1350-5 |
| Date Published | 2014 Dec |
| ISSN | 1546-1718 |
| URL | |
| DOI | 10.1038/ng.3121 |
| PubMed ID | 25326702 |
| PubMed Central ID | PMC4244235 |
| Links | |
| Grant list | HHSN272200900018C / AI / NIAID NIH HHS / United States; U54 HG003067 / HG / NHGRI NIH HHS / United States; R01 HG003474 / HG / NHGRI NIH HHS / United States; U54HG003067 / HG / NHGRI NIH HHS / United States; HHSN272200900018C / PHS HHS / United States; R01HG003474 / HG / NHGRI NIH HHS / United States |