Primer: Generative models from NLP for sequence data
Marks Lab, Harvard Medical School
Generative models are powerful tools for capturing functional constraints within families of biological sequences. Autoregressive models, developed in natural language processing and related fields, provide a useful approach to modeling sequences without imposing a rigid alignment structure on the data. In this primer, we review the math and intuition behind these models, survey advances in model parameterization, and compare strategies for sampling from the models to generate new sequences. Finally, we discuss important considerations when applying these models to biological data.
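The core idea behind autoregressive models, the chain-rule factorization p(x_1, …, x_L) = ∏_i p(x_i | x_<i), can be sketched as follows. This is an illustrative toy in which the conditional distribution is simply uniform over the amino-acid alphabet; a real model (e.g., an RNN or transformer) would learn this conditional from data, and all function names here are ours, not from the primer.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def cond_prob(next_char, prefix):
    """Toy conditional p(x_i | x_<i): uniform over the alphabet.

    A trained autoregressive model would instead compute this from the
    prefix (the residues generated so far), capturing family constraints.
    """
    return 1.0 / len(ALPHABET)

def sequence_prob(seq):
    """Chain rule: p(x_1..x_L) = product over i of p(x_i | x_<i)."""
    p = 1.0
    for i, c in enumerate(seq):
        p *= cond_prob(c, seq[:i])
    return p

def sample_sequence(length, rng=random):
    """Generate a sequence left to right, drawing each residue from
    the model's conditional distribution given the prefix so far."""
    seq = ""
    for _ in range(length):
        weights = [cond_prob(c, seq) for c in ALPHABET]
        seq += rng.choices(ALPHABET, weights=weights, k=1)[0]
    return seq
```

Because generation proceeds one residue at a time, no alignment of the training sequences is required: insertions and deletions are handled implicitly by the conditional distributions rather than by fixed alignment columns.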