Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.
| Authors | |
| Abstract | MOTIVATION: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10 kb in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads. RESULTS: We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold Caenorhabditis elegans data in 9 min, orders of magnitude faster than the existing pipelines, though the consensus sequence error rate is as high as that of raw reads. We also introduce a pairwise read mapping format and a graphical fragment assembly format, and demonstrate the interoperability between ours and current tools. AVAILABILITY AND IMPLEMENTATION: and CONTACT: hengli@broadinstitute.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
| Year of Publication | 2016 |
| Journal | Bioinformatics |
| Volume | 32 |
| Issue | 14 |
| Pages | 2103-10 |
| Date Published | 2016 Jul 15 |
| ISSN | 1367-4811 |
| URL | |
| DOI | 10.1093/bioinformatics/btw152 |
| PubMed ID | 27153593 |
| PubMed Central ID | PMC4937194 |
| Links | |
| Grant list | R01 GM100233 / GM / NIGMS NIH HHS / United States |