Partially shared multi-modal embedding learns holistic representation of cell state.

Nature computational science
Authors
Abstract

Current technologies enable the simultaneous measurement of diverse data types at the single-cell level. However, data are often processed separately, or integrated via representation learning methods that obscure the contributions of each data modality. Here we present a computational framework that automatically learns partial information sharing between multiple modalities by using an Autoencoder with a Partially Overlapping Latent space learned through Latent Optimization (APOLLO). We tested APOLLO on simulated data, and on four applications involving paired single-cell data: SHARE-seq (scRNA-seq and scATAC-seq), CITE-seq (scRNA-seq and protein abundance), and two multiplexed imaging datasets. APOLLO enables the prediction of missing modalities, such as unmeasured protein stains, and allows disentangling which modality or cellular compartment is linked with a specific phenotype, such as the variability in protein localization observed across single cells. Overall, APOLLO efficiently integrates diverse data modalities and, by retaining and distinguishing between shared and modality-specific information, provides a more interpretable and holistic view of cell state.
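
For illustration only, the idea of a partially overlapping latent space learned by latent optimization can be sketched on toy data. This is not the APOLLO implementation. The sketch assumes linear decoders, squared-error reconstruction and plain gradient descent, and the function name `fit_partially_shared_latents`, the latent sizes and the synthetic data are all hypothetical. Each cell gets a shared latent block that feeds both decoders and one private block per modality, and the latent codes are optimized directly as free parameters rather than produced by an encoder.

```python
import numpy as np


def fit_partially_shared_latents(x_a, x_b, k_shared=2, k_private=1,
                                 steps=1000, lr=0.05, seed=0):
    """Toy latent optimization with a partially overlapping latent space.

    Hypothetical helper, NOT the published method: linear decoders,
    squared-error loss, plain gradient descent.
    """
    rng = np.random.default_rng(seed)
    n = x_a.shape[0]
    # Per-cell latent codes are free parameters (no encoder network).
    shared = 0.1 * rng.standard_normal((n, k_shared))
    priv_a = 0.1 * rng.standard_normal((n, k_private))
    priv_b = 0.1 * rng.standard_normal((n, k_private))
    # One linear decoder per modality, reading [shared, private].
    w_a = 0.1 * rng.standard_normal((k_shared + k_private, x_a.shape[1]))
    w_b = 0.1 * rng.standard_normal((k_shared + k_private, x_b.shape[1]))

    losses = []
    for _ in range(steps):
        z_a = np.hstack([shared, priv_a])
        z_b = np.hstack([shared, priv_b])
        r_a = z_a @ w_a - x_a  # reconstruction residual, modality A
        r_b = z_b @ w_b - x_b  # reconstruction residual, modality B
        losses.append(0.5 * float((r_a ** 2).sum() + (r_b ** 2).sum()) / n)

        # Decoder gradients are averaged over cells; latent gradients are
        # taken per cell (not divided by n), i.e. a larger latent step.
        g_wa = z_a.T @ r_a / n
        g_wb = z_b.T @ r_b / n
        g_za = r_a @ w_a.T
        g_zb = r_b @ w_b.T

        # The shared block receives gradient from BOTH modalities;
        # each private block only from its own modality.
        shared -= lr * (g_za[:, :k_shared] + g_zb[:, :k_shared])
        priv_a -= lr * g_za[:, k_shared:]
        priv_b -= lr * g_zb[:, k_shared:]
        w_a -= lr * g_wa
        w_b -= lr * g_wb
    return shared, priv_a, priv_b, w_a, w_b, losses


# Synthetic paired data: two modalities driven by a common 2-D factor
# plus one modality-specific factor each (assumed toy setup).
rng = np.random.default_rng(1)
n_cells = 200
true_shared = rng.standard_normal((n_cells, 2))
x_a = np.hstack([true_shared, rng.standard_normal((n_cells, 1))]) \
    @ rng.standard_normal((3, 6))
x_b = np.hstack([true_shared, rng.standard_normal((n_cells, 1))]) \
    @ rng.standard_normal((3, 5))

shared, priv_a, priv_b, w_a, w_b, losses = fit_partially_shared_latents(x_a, x_b)
```

In a setup like this, a rough stand-in for predicting a missing modality would be to decode the shared block through the other modality's decoder with its private block set to zero. Inspecting the private blocks separately is one simple way to ask which variation is modality-specific.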

Year of Publication
2026
Journal
Nature computational science
Date Published
02/2026
ISSN
2662-8457
DOI
10.1038/s43588-025-00948-w
PubMed ID
41741805