From genome to networks: A data-driven, tissue-specific view of human disease
Lewis-Sigler Institute for Integrative Genomics, Princeton University; Center for Computational Biology, Flatiron Institute
Identifying the functional effects of noncoding variants is a major challenge in human genetics. I will discuss our deep learning–based algorithmic framework, DeepSEA, which predicts noncoding-variant effects de novo from genomic sequence. DeepSEA directly learns a regulatory sequence code from large-scale chromatin-profiling data, enabling prediction of the chromatin effects of sequence alterations with single-nucleotide sensitivity. We further used this capability to improve prioritization of functional variants and to predict tissue-specific expression based only on genomic sequence.
I will then discuss our work on building tissue-specific networks to understand cell- and tissue-specific gene function and regulation, and the application of these networks to the study of autism spectrum disorder (ASD). ASD is a complex neurodevelopmental disorder with a strong genetic basis. Yet only a small fraction of potentially causal genes (about 65 out of an estimated several hundred) are known with strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to present a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence.