Improving atlas-scale single-cell annotation models with hierarchical cross-entropy loss.

Nature Computational Science
Authors
Abstract

Accurately annotating cell types is essential for extracting biological insight from single-cell RNA sequencing data. Although cell types are naturally organized into hierarchical ontologies, most computational models do not explicitly incorporate this structure into their training objectives. Here, we introduce a hierarchical cross-entropy loss that aligns model objectives with biological structure. Applied to architectures ranging from linear models to transformers, this simple modification improves out-of-distribution performance by 12–15% without added computational cost. Critically, we underscore the need to focus on new data generation that improves the connectivity among annotated cell types. Our work suggests that this is likely to yield more generalizable algorithms than solely increasing model complexity would.
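
The abstract does not give the exact form of the loss, so the following is only a minimal sketch of one common way to build a hierarchical cross-entropy, not the authors' implementation. It assumes the model emits logits over leaf cell types, that the probability of an internal ontology node is the summed softmax mass of the leaves beneath it, and that the loss adds the negative log-probability of the true label and each of its non-root ancestors. The toy ontology and the names `PARENT`, `LEAVES`, `ancestors`, and `hierarchical_cross_entropy` are hypothetical and chosen for illustration.

```python
import math

# Hypothetical toy ontology (child -> parent); None marks the root.
PARENT = {
    "cell": None,
    "lymphocyte": "cell",
    "T cell": "lymphocyte",
    "B cell": "lymphocyte",
    "myeloid cell": "cell",
    "monocyte": "myeloid cell",
}
# Assumption: the model outputs one logit per leaf, in this order.
LEAVES = ["T cell", "B cell", "monocyte"]


def ancestors(node):
    """Return the node followed by its ancestors, excluding the root."""
    path = []
    while node is not None and PARENT[node] is not None:
        path.append(node)
        node = PARENT[node]
    return path


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_cross_entropy(logits, label):
    """Sum of -log P(node) over the label and its non-root ancestors.

    P(node) is the softmax mass of all leaves at or below that node,
    so the label may be a leaf or a coarser internal node.
    """
    probs = dict(zip(LEAVES, softmax(logits)))
    loss = 0.0
    for node in ancestors(label):
        mass = sum(p for leaf, p in probs.items() if node in ancestors(leaf))
        loss -= math.log(mass)
    return loss


# A confident mistake within the same lineage (B cell for a T cell)
# costs less than a confident mistake across lineages (monocyte).
near_miss = hierarchical_cross_entropy([0.0, 5.0, 0.0], "T cell")
far_miss = hierarchical_cross_entropy([0.0, 0.0, 5.0], "T cell")
```

Under these assumptions the loss needs only the leaf softmax and a fixed leaf-to-ancestor mapping, which is in keeping with the abstract's statement that the modification adds no computational cost. A cell annotated only at a coarse level (for example "lymphocyte") can still be scored, because its probability is the pooled mass of its descendant leaves.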

Year of Publication
2026
Journal
Nature Computational Science
Date Published
01/2026
ISSN
2662-8457
DOI
10.1038/s43588-025-00945-z
PubMed ID
41617882