Pancreatic cancer risk prediction using deep sequential modeling of longitudinal diagnostic and medication records.

Cell reports. Medicine
Abstract

Pancreatic ductal adenocarcinoma (PDAC) is a rare, aggressive cancer that is often diagnosed late and has low survival rates, owing to the lack of population-wide screening programs and the high cost of early detection methods. To enable early detection of high-risk individuals, we develop a transformer-based model trained on longitudinal Veterans Affairs electronic health records (EHRs) with 19,426 PDAC cases and ∼15.9 million controls. Our model combines diagnostic and medication trajectories to predict PDAC risk within 6-, 12-, and 36-month assessment windows. Incorporating medication data significantly improves performance; among the top 1,000 to 5,000 highest-risk patients in a cohort of 1 million patients, 3-year PDAC incidence is 115 to 70 times higher, respectively, than a reference estimate based on age and sex alone. Furthermore, analysis of the most predictive features highlights the role of events such as chronic inflammatory conditions and specific medications in overall PDAC risk. Our work provides AI-driven identification of high-risk individuals, with the potential to improve early detection, enhance patient care, and reduce healthcare costs.

Year of Publication
2025
Journal
Cell reports. Medicine
Volume
6
Issue
9
Pages
102359
Date Published
09/2025
ISSN
2666-3791
DOI
10.1016/j.xcrm.2025.102359
PubMed ID
40961920