Comparison of variant callers using 60 532 multi-ancestry whole genome sequences.
| Authors | |
| Keywords | |
| Abstract | Whole genome sequencing (WGS) studies play a pivotal role in uncovering the genetic underpinnings of human diseases and traits. High-quality, reproducible variant calling is the cornerstone of successful downstream analyses, including WGS association studies and polygenic risk prediction. This paper compares the data quality, performance, and concordance of two widely used WGS variant callers, the Genome Analysis Toolkit (GATK) and the Variant Tool set that discovers short variants (VT), using 60 532 multi-ancestry whole genomes sequenced by the Centers for Common Disease Genomics (CCDGs) of the NHGRI Genome Sequencing Program. Our findings show that the quality-controlled (QCed) GATK and VT pipelines both yield highly consistent and reliable single nucleotide variant (SNV) calls in large-scale WGS studies, supporting their agreement in joint variant calling. However, the two pipelines show greater discrepancies when calling insertions and deletions (INDELs). |
| Year of Publication | 2026 |
| Journal | Briefings in Bioinformatics |
| Volume | 27 |
| Issue | 2 |
| Date Published | 03/2026 |
| ISSN | 1477-4054 |
| DOI | 10.1093/bib/bbag130 |
| PubMed ID | 41894165 |
| Links |