Machine Learning-Based Plasma Protein Risk Score Improves Atrial Fibrillation Prediction Over Clinical and Genomic Models.
Authors | |
Keywords | |
Abstract | BACKGROUND: Clinical factors discriminate incident atrial fibrillation (AF) risk with moderate accuracy, with only modest improvement after incorporation of polygenic risk scores. Whether emerging large-scale proteomic profiling can augment AF risk estimation is unknown.
METHODS: In the UK Biobank cohort, we derived and validated a machine learning model to predict incident AF risk using serum proteins (Pro-AF). We compared Pro-AF to a validated clinical risk score (Cohorts for Heart and Aging Research in Genomic Epidemiology-Atrial Fibrillation, CHARGE-AF) and an AF polygenic risk score. Models were evaluated in a multiply resampled test set from nested cross-validation (internal test set), and a sample of UK Biobank participants separate from model development (hold-out test set). Metrics included discrimination of 5-year incident AF using time-dependent area under the receiver operating characteristic curve and net reclassification.
RESULTS: Trained in 32 631 UK Biobank participants, Pro-AF predicts incident AF using 121 protein levels (out of 2911 protein analytes). When assessed in the internal test set comprising 30 632 individuals (mean age 57±8 years, 54% women, 2045 AF events) and hold-out test set comprising 13 998 individuals (mean age 57±8 years, 54% women, 870 AF events), discrimination of 5-year incident AF was highest using Pro-AF (area under the receiver operating characteristic curve internal: 0.761 [95% CI, 0.745-0.780], hold-out: 0.763 [0.734-0.784]), followed by CHARGE-AF (0.719 [0.700-0.737]; 0.702 [0.668-0.730]) and the polygenic risk score (0.686 [0.668-0.702]; 0.682 [0.660-0.710]). AF risk estimates were well-calibrated, and the addition of Pro-AF led to substantial continuous net reclassification improvement over CHARGE-AF (eg, internal test set 0.410 [0.330-0.492]). A simplified Pro-AF including only the 5 most influential proteins (NT-proBNP [N-terminal pro-brain natriuretic peptide], EDA2R [ectodysplasin A2 receptor], NPPB [B-type natriuretic peptide], BCAN [brevican core protein], and GDF15 [growth/differentiation factor 15]), retained favorable discriminative value (area under the receiver operating characteristic curve internal: 0.750 [0.733-0.768]; hold-out: 0.759 [0.732-0.790]).
CONCLUSIONS: A machine learning-based protein score discriminates 5-year incident AF risk favorably compared with clinical and genetic risk factors. Large-scale proteomic analysis may assist in the prioritization of individuals at risk for AF for screening and related preventive interventions. |
Year of Publication | 2025 |
Journal | Circulation. Genomic and precision medicine |
Volume | 18 |
Issue | 4 |
Pages | e004943 |
Date Published | 08/2025 |
ISSN | 2574-8300 |
DOI | 10.1161/CIRCGEN.124.004943 |
PubMed ID | 40525300 |
Links | |