Integrating single-cell RNA-seq datasets with substantial batch effects.

BMC Genomics
Abstract

Integration of single-cell RNA-sequencing (scRNA-seq) datasets is standard in scRNA-seq analysis. Nevertheless, current computational methods struggle to harmonize datasets across systems such as species, organoids and primary tissue, or different scRNA-seq protocols, including single-cell and single-nuclei. Conditional variational autoencoders (cVAEs) are a popular integration method; however, existing strategies for stronger batch correction have limitations. Increasing the Kullback-Leibler divergence regularization does not improve integration, and adversarial learning removes biological signals. Here, we propose sysVI, a cVAE-based method employing VampPrior and cycle-consistency constraints. We show that sysVI integrates across systems and improves biological signals for downstream interpretation of cell states and conditions.

Year of Publication
2025
Journal
BMC Genomics
Volume
26
Issue
1
Pages
974
Date Published
10/2025
ISSN
1471-2164
DOI
10.1186/s12864-025-12126-3
PubMed ID
41168710