Blended Length Genome Sequencing (blend-seq): Combining Short Reads with Low-Coverage Long Reads to Maximize Variant Discovery.
Authors | |
Keywords | |
Abstract | We introduce blend-seq, a workflow for combining data from traditional short-read sequencing pipelines with low-coverage long reads, to improve variant discovery for single samples without the full cost of high-coverage long reads. We demonstrate that with only 4x long-read coverage augmenting 30x short reads, we can improve SNP discovery across the genome, exceeding the performance of even high-coverage short reads (60x). For genotype-agnostic discovery of structural variants, we see a threefold improvement in recall while maintaining precision by using the low-coverage long reads on their own, and we show how adding the short-read data improves genotyping accuracy. In addition, we demonstrate how the long reads can better phase these variants, incorporating long-context information in the genome to substantially outperform phasing with short reads alone. Our experiments highlight the complementary nature of short- and long-read technologies: the former contributes higher depth for genotyping, and the latter provides better resolution of larger events or those in difficult regions. |
Year of Publication | 2025 |
Journal | bioRxiv : the preprint server for biology |
Date Published | 09/2025 |
ISSN | 2692-8205 |
DOI | 10.1101/2024.11.01.621515 |
PubMed ID | 40950019 |
Links |