A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis.

Commun Biol
Abstract

Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth (LinST), of short somatic mutations in the HT115 colon cancer cell line, validated using a known cell lineage. The data set includes thousands of mutations and a high-confidence region covering 2.7 gigabases per sample.

Year of Publication
2020
Journal
Commun Biol
Volume
3
Issue
1
Pages
744
Date Published
2020 Dec 08
ISSN
2399-3642
DOI
10.1038/s42003-020-01460-9
PubMed ID
33293579
PubMed Central ID
PMC7722876