Discovery of disease-associated cellular states using ResidPCA in single-cell RNA and ATAC sequencing data.
| Authors | |
| Abstract | To advance understanding of cellular heterogeneity in disease from single-cell sequencing data, we introduce Residual Principal Component Analysis (ResidPCA), a robust method for identifying cell states that explicitly models cell type heterogeneity. In simulations, ResidPCA achieved more than fourfold higher accuracy than conventional Principal Component Analysis (PCA) and over threefold higher accuracy than Non-negative Matrix Factorization (NMF)-based methods in detecting states expressed across multiple cell types. Applied to single-cell RNA sequencing (scRNA-seq) of light-stimulated mouse visual cortex cells, ResidPCA captured stimulus-driven variability with an accuracy more than fivefold higher than NMF-based approaches. In single-nucleus datasets from an Alzheimer's disease cohort, ResidPCA identified 44 chromatin accessibility-based states from single-nucleus ATAC-seq (snATAC-seq) and 42 transcriptional states from single-nucleus RNA-seq (snRNA-seq). Thirty snATAC-seq states were significantly enriched for Alzheimer's disease heritability, often more so than established cell types such as microglia. The snATAC-seq state most significantly enriched for heritability further elucidates a recently implicated neuron-oligodendrocyte-microglial mechanistic axis, linking early amyloid production in neurons and oligodendrocytes with later microglial activation and immune response. Together, these results highlight ResidPCA's ability to uncover previously hidden biological variation in single-cell data and reveal disease-relevant cell states. |
| Year of Publication | 2025 |
| Journal | HGG advances |
| Pages | 100538 |
| Date Published | 10/2025 |
| ISSN | 2666-2477 |
| DOI | 10.1016/j.xhgg.2025.100538 |
| PubMed ID | 41157948 |