Prediction and functional interpretation of inter-chromosomal genome architecture from DNA sequence with TwinC.
Authors | |
Abstract | Three-dimensional nuclear DNA architecture comprises well-studied intra-chromosomal (cis) folding and less characterized inter-chromosomal (trans) interfaces. Current predictive models of 3D genome folding can effectively infer pairwise cis-chromatin interactions from the primary DNA sequence but generally ignore trans contacts. There is an unmet need for robust models of trans-genome organization that provide insights into their underlying principles and functional relevance. We present TwinC, an interpretable convolutional neural network model that reliably predicts trans contacts measurable through genome-wide chromatin conformation capture (Hi-C). TwinC uses a paired sequence design from replicate Hi-C experiments to learn single base pair relevance in trans interactions across two stretches of DNA. The method achieves high predictive accuracy (AUROC=0.80) on a cross-chromosomal test set from Hi-C experiments in heart tissue. Mechanistically, the neural network learns the importance of compartments, chromatin accessibility, clustered transcription factor binding and G-quadruplexes in forming trans contacts. In summary, TwinC models and interprets trans genome architecture, shedding light on this poorly understood aspect of gene regulation. |
Year of Publication | 2024
Journal | bioRxiv : the preprint server for biology
Date Published | 09/2024
ISSN | 2692-8205
DOI | 10.1101/2024.09.16.613355
PubMed ID | 39345598