Whole-genome sequence assembly for mammalian genomes: Arachne 2.
| Authors | |
| Keywords | |
| Abstract | We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rejoined using several criteria, yielding a 64-fold increase in length (N50), and apparent elimination of all global misjoins; (2) gaps between contigs in supercontigs were filled (partially or completely) by insertion of reads, as suggested by pairing within the supercontig, increasing the N50 contig length by 50%; (3) memory usage was reduced fourfold. The outcome of this mouse assembly and its analysis are described in (Mouse Genome Sequencing Consortium 2002). |
| Year of Publication | 2003 |
| Journal | Genome Res |
| Volume | 13 |
| Issue | 1 |
| Pages | 91-96 |
| Date Published | 2003 Jan |
| ISSN | 1088-9051 |
| DOI | 10.1101/gr.828403 |
| PubMed ID | 12529310 |
| PubMed Central ID | PMC430950 |
| Links | |
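
The abstract reports its gains in terms of N50 length for supercontigs and contigs. As a reading aid, the following is a minimal sketch of how N50 is conventionally computed: the largest length L such that sequences of length at least L together account for at least half of the total bases. The function name `n50` and the sample lengths are illustrative assumptions; this is not code from Arachne and the numbers are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input: no sequences, so no N50


# Hypothetical contig lengths (illustrative only, not from the paper).
contigs = [100, 80, 60, 40, 20]
print(n50(contigs))
```

Because the metric depends only on the sorted lengths, joining or extending sequences (as the supercontig rejoining and gap-filling steps in the abstract do) can only keep N50 the same or raise it for a fixed set of underlying bases.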