Using random matrix theory to extract signals from single-cell expression data

Dept. of Mathematics, University of Texas at Austin

I'll describe a method for low-rank approximation of a data matrix arising from single-cell RNA sequencing. Our basic observation is that such data is consistent with a sparse version of the "spike model" studied in random matrix theory, in which a low-rank signal is added to a noise matrix. As a consequence, the contributions of noise to the output of principal component analysis on this data can be characterized in terms of universal distributions and removed. This is joint work with Luis Aparicio, Mykola Bordyuh, and Raul Rabadan.
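
To make the idea concrete, here is a minimal sketch of the general technique, not the method from the talk. It simulates a dense Gaussian spike model (the talk concerns a sparse version), computes the eigenvalues of the sample covariance, and keeps only those that exceed the upper edge of the Marchenko-Pastur bulk. The function names, the matrix sizes, the signal strengths, and the 10% margin above the edge are all assumptions chosen for illustration; a more careful test would use the Tracy-Widom distribution for the largest noise eigenvalue.

```python
import numpy as np


def mp_upper_edge(n_cells, n_genes, sigma2=1.0):
    """Upper edge of the Marchenko-Pastur bulk for eigenvalues of X^T X / n_cells,
    assuming i.i.d. noise entries with variance sigma2."""
    gamma = n_genes / n_cells
    return sigma2 * (1.0 + np.sqrt(gamma)) ** 2


def count_signal_components(X, sigma2=1.0, slack=0.1):
    """Count sample-covariance eigenvalues above the noise bulk.

    `slack` is an ad hoc relative margin (an assumption of this sketch),
    standing in for a proper Tracy-Widom significance test.
    """
    n_cells, n_genes = X.shape
    cov = X.T @ X / n_cells
    eigvals = np.linalg.eigvalsh(cov)  # ascending order
    threshold = mp_upper_edge(n_cells, n_genes, sigma2) * (1.0 + slack)
    return int(np.sum(eigvals > threshold)), eigvals, threshold


# Simulated spike model: unit-variance noise plus a rank-3 signal.
rng = np.random.default_rng(0)
n_cells, n_genes, rank = 2000, 500, 3
noise = rng.standard_normal((n_cells, n_genes))
scores = rng.standard_normal((n_cells, rank))
loadings, _ = np.linalg.qr(rng.standard_normal((n_genes, rank)))  # orthonormal columns
strengths = np.array([4.0, 3.0, 2.0])  # hypothetical signal strengths
X = noise + (scores * strengths) @ loadings.T

k, eigvals, threshold = count_signal_components(X)
print(f"threshold = {threshold:.3f}, components above the bulk = {k}")
```

In this simulation the noise eigenvalues concentrate below the bulk edge, so the eigenvalues that clear the threshold are attributed to the low-rank signal; projecting onto the corresponding eigenvectors gives a low-rank approximation with the noise directions discarded.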
