Multimodal representation learning for predicting molecule-disease relations.
| Authors | |
| Abstract | MOTIVATION: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. METHODS: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. RESULTS: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. AVAILABILITY AND IMPLEMENTATION: The code is available at , and prediction results are provided at . SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
| Year of Publication | 2023 |
| Journal | Bioinformatics (Oxford, England) |
| Volume | 39 |
| Issue | 2 |
| Date Published | 02/2023 |
| ISSN | 1367-4811 |
| DOI | 10.1093/bioinformatics/btad085 |
| PubMed ID | 36805623 |
| Links |
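
The mapping idea described in the abstract can be illustrated with a toy sketch. This is not the M2REMAP implementation: the function names, the two-layer network, the random untrained weights, the dimensions, and the disease vectors below are all hypothetical, and cosine similarity is swapped in as a simple stand-in for whatever scoring model the authors actually use. It only shows the general shape of the approach, in which chemical features are projected into an embedding space shared with diseases and then compared against disease embeddings.

```python
import math
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def matvec(weights, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def map_to_clinical_space(chem_features, w1, w2):
    """Hypothetical two-layer feed-forward map from chemical features
    to the shared clinical embedding space (weights are untrained here)."""
    return matvec(w2, relu(matvec(w1, chem_features)))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_diseases(mol_embedding, disease_embeddings):
    """Score each disease against the molecule and sort, best first.
    Cosine scoring is an assumption made for illustration only."""
    scores = {name: cosine(mol_embedding, emb)
              for name, emb in disease_embeddings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy setup: all dimensions and values are made up for illustration.
rng = random.Random(0)
CHEM_DIM, HIDDEN_DIM, EMB_DIM = 6, 4, 3
W1 = [[rng.uniform(-1, 1) for _ in range(CHEM_DIM)] for _ in range(HIDDEN_DIM)]
W2 = [[rng.uniform(-1, 1) for _ in range(HIDDEN_DIM)] for _ in range(EMB_DIM)]

chem = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # e.g. a tiny binary fingerprint
disease_embeddings = {
    "disease_a": [0.9, 0.1, 0.0],
    "disease_b": [0.0, 1.0, 0.2],
    "disease_c": [-0.5, 0.3, 0.8],
}

mol = map_to_clinical_space(chem, W1, W2)
ranking = rank_diseases(mol, disease_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real system the weights would be trained so that mapped molecules land near their known EHR-derived drug embeddings, which is what would let a molecule with no clinical history be compared against disease embeddings at all.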