Primer: Robust nonlinear manifold learning for single cell RNA-seq data

Engelhardt Group, Depts. of Computer Science, Quantitative and Computational Biology, Princeton University

Analysis of single-cell RNA sequencing (scRNA-seq) experiments requires dimension reduction for regularization and efficiency. We present a nonlinear latent variable model with robust, heavy-tailed error modeling and adaptive kernel learning to capture low-dimensional nonlinear structure in scRNA-seq data. Gene expression is modeled as a noisy draw from a Gaussian process that maps low-dimensional latent positions to the high-dimensional expression space, a framework known as the Gaussian process latent variable model (GPLVM). We model residual errors with a heavy-tailed Student's t-distribution to control for observed technical and biological noise. We compare our approach to common dimension reduction tools on available experimental data to highlight our model's ability to enable important downstream tasks, including clustering and inferring cell developmental trajectories. We show that our robust nonlinear manifold is well suited to raw, unfiltered gene counts from high-throughput sequencing technologies for visualization and exploration of cell states.
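
To make the generative structure described above concrete, the following is a minimal sketch, not the authors' implementation. It draws latent cell positions, samples each gene as a Gaussian process function of those positions, and adds Student's t residuals. The fixed RBF kernel stands in for the adaptive kernel learning mentioned in the abstract, the output is real-valued rather than raw counts, and all names and default values (`rbf_kernel`, `sample_robust_gplvm`, `df`, `noise_scale`) are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on latent positions X (n_cells x latent_dim).

    Assumption: a fixed RBF kernel is used here only for illustration.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_robust_gplvm(n_cells=50, n_genes=20, latent_dim=2,
                        df=3.0, noise_scale=0.1, seed=0):
    """Sample (X, F, Y) from a GPLVM-style model with Student's t noise.

    All parameter values are hypothetical defaults, not taken from the talk.
    """
    rng = np.random.default_rng(seed)
    # Latent positions: one low-dimensional point per cell.
    X = rng.standard_normal((n_cells, latent_dim))
    # GP covariance across cells; jitter keeps the Cholesky factor stable.
    K = rbf_kernel(X) + 1e-6 * np.eye(n_cells)
    L = np.linalg.cholesky(K)
    # Each gene (column) is an independent GP draw over the latent positions.
    F = L @ rng.standard_normal((n_cells, n_genes))
    # Heavy-tailed residuals: scaled Student's t with df degrees of freedom.
    E = noise_scale * rng.standard_t(df, size=(n_cells, n_genes))
    Y = F + E
    return X, F, Y

X, F, Y = sample_robust_gplvm()
```

With a small `df`, occasional entries of `Y` fall far from `F`, which is the kind of outlier a Gaussian noise model would handle poorly. Inference in the actual model runs in the opposite direction, recovering `X` from observed expression, and is not shown here.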
