Haplotype phasing in large cohorts: Modeling, search, or both?

Price Lab, Harvard School of Public Health

Inferring haploid phase from diploid genotype data -- "phasing" for short -- is a fundamental question in human genetics and a key step in genotype imputation. How should one go about phasing a large cohort? The answer depends on how large. In this talk, I will contrast two approaches to computational phasing: hidden Markov models (HMMs), which perform precise but computationally expensive statistical inference, and long-range phasing (LRP), which relies instead on rapidly searching for long genomic segments shared among samples. I will present a new LRP method (Eagle), describe its performance on N=150,000 UK Biobank samples, and discuss future directions.
