Pipeline Olympics: continuable benchmarking of computational workflows for DNA methylation sequencing data against an experimental gold standard.
Authors | |
Abstract | DNA methylation is a widely studied epigenetic mark and a powerful biomarker of cell type, age, environmental exposures, and disease. Whole-genome sequencing following selective conversion of unmethylated cytosines into thymines via bisulfite treatment or enzymatic methods remains the reference method for DNA methylation profiling genome-wide. While numerous software tools facilitate processing of DNA methylation sequencing reads, a comprehensive benchmarking study has been lacking. In this study, we systematically compared complete computational workflows for processing DNA methylation sequencing data using a dedicated benchmarking dataset generated with five whole-genome profiling protocols. As an evaluation reference, we employed accurate locus-specific measurements from our previous benchmark of targeted DNA methylation assays. Based on this experimental gold-standard assessment and multiple performance metrics, we identified workflows that consistently demonstrated superior performance and revealed major workflow development trends. To ensure the long-term utility of our benchmark, we implemented an interactive workflow execution and data presentation platform, adaptable to user-defined criteria and readily expandable to future software. |
Year of Publication | 2025 |
Journal | Nucleic Acids Research |
Volume | 53 |
Issue | 19 |
Date Published | 10/2025 |
ISSN | 1362-4962 |
DOI | 10.1093/nar/gkaf970 |
PubMed ID | 41118575 |