InfEHR: Clinical phenotype resolution through deep geometric learning on electronic health records.

Nature Communications
Authors
Abstract

Electronic health records contain multimodal data that can inform clinical decisions but are often unsuited to advanced machine learning analyses because labeled data are scarce. Here, we present InfEHR, a framework to automatically compute clinical likelihoods from whole electronic health records without requiring large volumes of labeled training data. InfEHR applies deep geometric learning through a procedure that converts whole electronic health records to temporal graphs that naturally capture phenotypic dynamics, leading to unbiased representations. Using only a few labeled examples, InfEHR computes and automatically revises probabilities, achieving high-performing inferences, especially for low-prevalence diseases. We test InfEHR using electronic health records from Mount Sinai Health System and UC Irvine Medical Center against physician-provided heuristics on neonatal culture-negative sepsis (3% prevalence) and postoperative acute kidney injury (21% prevalence). InfEHR demonstrated superior performance for both culture-negative sepsis (sensitivity: 0.60 vs. 0.04; specificity: 0.98 vs. 0.99) and postoperative acute kidney injury (sensitivity: 0.71 vs. 0.20; specificity: 0.93 vs. 0.98). Our study demonstrates the application of geometric deep learning to electronic health records for probabilistic inference in real-world clinical settings at scale.
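The abstract reports sensitivity and specificity alongside disease prevalence. As a reading aid, the sketch below shows how those three quantities combine, through Bayes' rule, into a positive predictive value (PPV). It is not part of the paper: the function name and the `reported` dictionary are hypothetical, the first figure in each "x vs. y" pair is assumed to belong to InfEHR, and the calculation assumes the stated prevalence applies to the evaluation cohort, which the abstract does not confirm.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Return P(disease | positive prediction) via Bayes' rule."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    true_pos = sensitivity * prevalence                    # P(positive and diseased)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(positive and healthy)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Figures quoted in the abstract, keyed by (task, method).
# Assumption: each tuple is (sensitivity, specificity, prevalence).
reported = {
    ("culture-negative sepsis", "InfEHR"): (0.60, 0.98, 0.03),
    ("culture-negative sepsis", "heuristic"): (0.04, 0.99, 0.03),
    ("postoperative AKI", "InfEHR"): (0.71, 0.93, 0.21),
    ("postoperative AKI", "heuristic"): (0.20, 0.98, 0.21),
}

# Implied PPV for each entry, under the assumptions stated above.
implied_ppv = {key: positive_predictive_value(*vals)
               for key, vals in reported.items()}

if __name__ == "__main__":
    for (task, method), ppv in implied_ppv.items():
        print(f"{task:26s} {method:10s} implied PPV = {ppv:.3f}")
```

These implied values are derived arithmetic, not results reported by the authors, and should be read only as an illustration of how prevalence shapes the interpretation of sensitivity and specificity.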

Year of Publication
2025
Journal
Nature Communications
Volume
16
Issue
1
Pages
8475
Date Published
09/2025
ISSN
2041-1723
DOI
10.1038/s41467-025-63366-6
PubMed ID
41006287