SeqLib: a C++ API for rapid BAM manipulation, sequence alignment and sequence assembly.

Bioinformatics
Abstract

We present SeqLib, a C++ API and command line tool that provides a rapid and user-friendly interface to BAM/SAM/CRAM files, global sequence alignment operations and sequence assembly. Four C libraries perform core operations in SeqLib: HTSlib for BAM access, BWA-MEM and BLAT for sequence alignment and Fermi for error correction and sequence assembly. Benchmarking indicates that SeqLib has lower CPU and memory requirements than leading C++ sequence analysis APIs. We demonstrate an example of how minimal SeqLib code can extract, error-correct and assemble reads from a CRAM file and then align with BWA-MEM. SeqLib also provides additional capabilities, including chromosome-aware interval queries and read plotting. Command line tools are available for performing integrated error correction, micro-assemblies and alignment.

AVAILABILITY AND IMPLEMENTATION: SeqLib is available on Linux and OSX for the C++98 standard and later at github.com/walaj/SeqLib. SeqLib is released under the Apache2 license. Additional capabilities for BLAT alignment are available under the BLAT license.

CONTACT: jwala@broadinstitute.org; rameen@broadinstitute.org.

Year of Publication
2016
Journal
Bioinformatics
Date Published
2016 Dec 22
ISSN
1367-4811
DOI
10.1093/bioinformatics/btw741
PubMed ID
28011768
Grant list
R01 CA188228 / CA / NCI NIH HHS / United States