BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences.

Cell Systems
Abstract

The design choices underlying machine-learning (ML) models present important barriers to entry for many biologists who aim to incorporate ML in their research. Automated machine-learning (AutoML) algorithms can address many challenges that come with applying ML to the life sciences. However, these algorithms are rarely used in systems and synthetic biology studies because they typically do not explicitly handle biological sequences (e.g., nucleotide, amino acid, or glycan sequences) and cannot be easily compared with other AutoML algorithms. Here, we present BioAutoMATED, an AutoML platform for biological sequence analysis that integrates multiple AutoML methods into a unified framework. Users are automatically provided with relevant techniques for analyzing, interpreting, and designing biological sequences. BioAutoMATED predicts gene regulation, peptide-drug interactions, and glycan annotation, and designs optimized synthetic biology components, revealing salient sequence characteristics. By automating sequence modeling, BioAutoMATED allows life scientists to incorporate ML more readily into their work.

Year of Publication
2023
Journal
Cell Systems
Volume
14
Issue
6
Pages
525-542.e9
Date Published
06/2023
ISSN
2405-4720
DOI
10.1016/j.cels.2023.05.007
PubMed ID
37348466