Genome structure mapping with high-resolution 3D genomics and deep learning.
Authors | |
Abstract | Gene expression is often regulated by distal enhancers through cell-type-specific 3D looping interactions, but comprehensive mapping of these interactions across cell types is experimentally intractable. To address this gap, we introduce an integrated approach where we generate ultra-deep Region Capture Micro-C (RCMC) and Micro-C data specifically designed for state-of-the-art deep learning architectures. We developed Cleopatra, an attention-based deep learning model that takes epigenomic inputs and is pre-trained on genome-wide Micro-C data followed by fine-tuning with high-resolution RCMC data. Cleopatra accurately predicts 3D maps at sub-kilobase bin sizes and unprecedented resolution, enabling us to generate ultra-high-resolution, genome-wide 3D contact maps across four human cell types. These maps revealed cell-type-specific microcompartments and over 900,000 loops across the cell types, about half of which are cell-type-specific. Using Cleopatra maps, we observe that promoters form about a dozen loops on average, and that expression increases monotonically with the number of loops, indicating that looping is associated with higher gene expression. We further show the enhancer-promoter loops are often anchored by CTCF, and nominate new transcription factors that may regulate cell-type-specific enhancer-promoter interactions. Overall, we establish a framework for ultra-high-resolution 3D genome mapping, providing a broadly applicable resource for gaining new insights into cell-type-specific gene regulation. |
Year of Publication | 2025
Journal | bioRxiv : the preprint server for biology
Date Published | 05/2025
ISSN | 2692-8205
DOI | 10.1101/2025.05.06.650874
PubMed ID | 40654659