James Hu
James, a rising senior studying computer science at the University of Florida, investigated how neural network approaches could better model complex gene-environment interactions that drive disease.
BSRP has been one of the most transformative experiences of my life. With the tremendous support from my mentor, the BSRP staff, and many other scientists at the Broad Institute, my passion for biomedical research truly blossomed. In just nine weeks, I grew not only as an aspiring scientist but also as a person, becoming more inquisitive, courageous, and confident. Spending my days with my wonderful BSRP cohort showed me how essential a diverse, supportive community is to personal and professional growth. I'm deeply grateful for everyone I met at the Broad Institute, and I know I will carry the lessons I've learned with me for the rest of my life.
Uncovering the biological mechanisms behind complex disease depends on a better understanding of gene-environment interactions (GxE). To capture the genetic component of GxE, traditional polygenic risk scores (PRS) summarize genetic risk across a genome-wide set of variants, which can increase statistical power but limits biological interpretability. Genome-wide PRSxE analysis is therefore difficult to interpret, while single-variant GxE analysis has low statistical power; recent work has shown that pathway-specific polygenic risk scores (pPRS), which aggregate pathway-relevant genetic signals, offer a good balance between the two. Although this approach improves GxE detection, traditional linear model-based methods for computing pPRS struggle to capture the complex, non-linear genetic relationships that may drive these interactions.
We address these limitations by leveraging biologically annotated neural networks (BANNs). BANNs are a promising architecture that embeds the hierarchical variant-gene-pathway structure into non-linear neural networks, in theory improving the modeling of complex genetics while preserving interpretability. We evaluated whether pPRS derived from BANNs enhance GxE detection compared to linear methods, using gene-adiposity interactions affecting liver health (measured by the liver biomarker ALT) as our test case. The evaluation included a direct comparison of phenotype prediction and an assessment of GxE detection in a regression model with adiposity (BMI) as the environmental exposure. Of the pathway interactions that linear approaches detected as nominally significant (p < 0.05), BANNs detected 20% more strongly, including those involving phospholipid metabolic pathways. This provides proof of concept for BANNs as a powerful and interpretable framework for interaction discovery, offering a novel approach for understanding complex disease mechanisms and laying the groundwork for genetically guided health interventions.
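The interaction test described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the data are synthetic, the variables are standardized, and the function name `fit_gxe_interaction` and all effect sizes are assumptions chosen for illustration. The sketch fits an ordinary least squares model of the outcome (standing in for ALT) on a pathway score, an exposure (standing in for BMI), and their product, then reports the interaction coefficient and its z-score. The pathway score could in principle come from either a linear method or a BANN.

```python
import numpy as np


def fit_gxe_interaction(pprs, exposure, outcome):
    """Fit outcome ~ 1 + pprs + exposure + pprs*exposure by OLS.

    Returns (coefficients, standard_errors); index 3 is the interaction term.
    Hypothetical helper for illustration only.
    """
    pprs = np.asarray(pprs, dtype=float)
    exposure = np.asarray(exposure, dtype=float)
    outcome = np.asarray(outcome, dtype=float)

    # Design matrix: intercept, main effects, and the GxE product term.
    X = np.column_stack([np.ones_like(pprs), pprs, exposure, pprs * exposure])
    coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)

    # Classical OLS standard errors from the residual variance.
    residuals = outcome - X @ coefs
    dof = X.shape[0] - X.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coefs, np.sqrt(np.diag(cov))


# Synthetic, standardized data with an assumed true interaction effect of 0.3.
rng = np.random.default_rng(0)
n = 5000
pprs = rng.standard_normal(n)   # stand-in for a pathway-specific score
bmi = rng.standard_normal(n)    # stand-in for the adiposity exposure
true_interaction = 0.3
alt = (0.5 * pprs + 0.4 * bmi
       + true_interaction * pprs * bmi
       + rng.standard_normal(n))

coefs, ses = fit_gxe_interaction(pprs, bmi, alt)
interaction_z = coefs[3] / ses[3]
print(f"interaction estimate: {coefs[3]:.3f}, z-score: {interaction_z:.1f}")
```

Under this setup, comparing two scoring methods would amount to running the same regression once per pathway with each method's score and comparing the strength of the interaction term. A real analysis would also adjust for covariates such as age, sex, and genetic ancestry.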
Project: Evaluation of biologically annotated neural networks for detecting gene-environment interactions
Mentor: Kenneth Westerman, Metabolism Program