Eric and Wendy Schmidt Center <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"> <span>By Corie Lok</span> </span> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" class="datetime">May 19, 2025</time> </span> <div class="hero-section container"> <div class="hero-section__row row"> <div class="hero-section__content hero-section__content_left col-6"> <div class="hero-section__title"> <div class="block block-layout-builder block-field-blocknodelong-storytitle"> <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> </div> </div> <div class="hero-section__description"> <div class="block block-layout-builder block-field-blocknodelong-storybody"> <div class="clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>Trained with a joint understanding of protein and cell behavior, the model could help with diagnosing disease and developing new drugs.</p> </div> </div> </div> <div class="hero-section__author"> <div class="block block-layout-builder
block-extra-field-blocknodelong-storyextra-field-author-custom"> By Adam Zewe, MIT News </div> </div> <div class="hero-section__date"> <div class="block block-layout-builder block-field-blocknodelong-storycreated"> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" title="Monday, May 19, 2025 - 14:21" class="datetime">May 19, 2025</time> </span> </div> </div> </div> <div class="hero-section__right col-6"> <div class="hero-section__image"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-image"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"> <article class="media media--type-image media--view-mode-multiple-content-types-header"> <div class="field field--name-field-media-image field--type-image field--label-hidden field__item"> <picture> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=zV_YI6-5 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="736" height="520"> <source srcset="/files/styles/multiple_ct_header_laptop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=vojX13qG 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="641" height="451"> <source srcset="/files/styles/multiple_ct_header_tablet/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5rVZOJox 1x" media="all and (min-width: 540px) and 
(max-width: 799px)" type="image/jpeg" width="706" height="417"> <source srcset="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE 1x" media="all and (max-width: 539px)" type="image/jpeg" width="499" height="294"> <img loading="eager" width="499" height="294" src="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE" alt="Microscopy images of cells as green circles" title="Microscopy images of cells as green circles" typeof="foaf:Image"> </picture> </div> <div class="media-caption"> <div class="media-caption__credit"> Credit: Courtesy of the researchers; MIT News </div> <div class="media-caption__description"> Researchers performed validation experiments to test their new model. The top row shows the model’s prediction of unseen cell lines and proteins, while the bottom row shows the experimental validation. </div> </div> </article> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-content-paragraphs"> <div class="field field--name-field-content-paragraphs field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--text-with-sidebar text-with-sidebar"> <div class="field field--name-field-sidebar field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--sidebar-articles sidebar-articles"> <div class="sidebar-articles__col"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Related News</p> </div> <div class="field field--name-field-content-reference field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><article about="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><article class="media media--type-image media--view-mode-multiple-ct-sidebar-link-with-image"> <div class="field field--name-field-media-image field--type-image field--label-hidden field__item"> <a
href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images"><picture> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="87" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="87" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_tablet/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=tyMgiZ6H 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="285" height="186"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy 1x" media="all and (max-width: 539px)" type="image/jpeg" width="220" height="186"> <img loading="eager" width="220" height="186" src="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy" alt="Microscopy image showing chromatin in cells as pink and purple blobs." title="Microscopy image showing chromatin in cells as pink and purple blobs." 
typeof="foaf:Image"> </picture></a> </div> </article> </div> <div class="node__content"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node__title"><span class="field field--name-title field--type-string field--label-hidden">AI tool predicts potential drug targets by analyzing cell images</span> </a> </div> </article> </div> </div> </div> </div> </div> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically test for only a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.</p> <p>A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But as enormous as it is, the Human Protein Atlas has only explored about 0.25 percent of all possible protein and cell line pairings in the database.</p> <p>Now, researchers from the Broad Institute, MIT, and Harvard University have developed a new computational approach that can efficiently explore the remaining uncharted space. Their method can predict the location of any protein in any human cell line, even when both protein and cell have never been tested before.</p> <p>Their technique goes one step further than many AI-based methods by localizing a protein at the single-cell level, rather than as an averaged estimate across all the cells of a specific type.
This single-cell localization could pinpoint a protein’s location in a specific cancer cell after treatment, for instance.</p> <p>The researchers combined a protein language model with a special type of computer vision model to capture rich details about a protein and cell. In the end, the user receives an image of a cell with a highlighted portion indicating the model’s prediction of where the protein is located. Since a protein’s localization is indicative of its functional status, this technique could help researchers and clinicians more efficiently diagnose diseases or identify drug targets, while also enabling biologists to better understand how complex biological processes are related to protein localization.</p> <p>“You could do these protein-localization experiments on a computer without having to touch any lab bench, hopefully saving yourself months of effort. While you would still need to verify the prediction, this technique could act like an initial screening of what to test for experimentally,” says Yitong Tseo, a graduate student in MIT’s Computational and Systems Biology program and co-lead author of a paper on this research.</p> <p>Tseo is joined on the paper by co-lead author Xinyi Zhang, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and the Eric and Wendy Schmidt Center at the Broad Institute; Yunhao Bai of the Broad Institute; and senior authors <a href="/bios/fei-chen">Fei Chen</a>, an assistant professor at Harvard and a core institute member of the Broad Institute, and <a href="/bios/caroline-uhler">Caroline Uhler</a>, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS).
The research appears today in <a href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank"><em>Nature Methods</em></a>.</p> <h2>Collaborating models</h2> <p>Many existing protein prediction models can only make predictions based on the protein and cell data on which they were trained, or are unable to pinpoint a protein’s location within a single cell.</p> <p>To overcome these limitations, the researchers created a two-part method for prediction of unseen proteins’ subcellular location, called PUPS.</p> <p>The first part uses a protein sequence model to capture the localization-determining properties of a protein and its 3D structure based on the chain of amino acids that forms it.</p> <p>The second part incorporates an image inpainting model, which is designed to fill in missing parts of an image. This computer vision model looks at three stained images of a cell to gather information about the state of that cell, such as its type, individual features, and whether it is under stress.</p> <p>PUPS joins the representations created by each model to predict where the protein is located within a single cell, using an image decoder to output a highlighted image that shows the predicted location.</p> <p>“Different cells within a cell line exhibit different characteristics, and our model is able to understand that nuance,” Tseo says.</p> <p>A user inputs the sequence of amino acids that form the protein and three cell stain images — one for the nucleus, one for the microtubules, and one for the endoplasmic reticulum.
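</p> <p>As a rough illustration only (not the authors' implementation, and with invented function names), the two-branch design described above can be sketched in Python: one stand-in embeds the amino-acid sequence, another encodes the three stain channels, and a decoder turns the fused features into a highlighted map.</p>

```python
# Toy sketch of a PUPS-style two-branch fusion (illustrative only; not the
# authors' code). One branch embeds the amino-acid sequence, the other encodes
# a stack of three stain images; the fused features are decoded into a
# per-pixel localization map.
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def embed_sequence(seq: str, dim: int = 8) -> np.ndarray:
    """Stand-in for a protein language model: average of fixed per-residue codes."""
    rng = np.random.default_rng(0)  # fixed codes, for reproducibility
    codes = rng.standard_normal((len(AMINO_ACIDS), dim))
    idx = [AMINO_ACIDS.index(a) for a in seq if a in AMINO_ACIDS]
    return codes[idx].mean(axis=0)

def encode_images(stains: np.ndarray, dim: int = 8) -> np.ndarray:
    """Stand-in for the image branch: per-channel mean intensities, projected."""
    channel_means = stains.reshape(stains.shape[0], -1).mean(axis=1)  # shape (3,)
    rng = np.random.default_rng(1)
    proj = rng.standard_normal((stains.shape[0], dim))
    return channel_means @ proj

def decode_localization(fused: np.ndarray, shape=(4, 4)) -> np.ndarray:
    """Stand-in decoder: turns fused features into a [0, 1] 'highlight' map."""
    rng = np.random.default_rng(2)
    w = rng.standard_normal((fused.size, shape[0] * shape[1]))
    logits = fused @ w
    return (1.0 / (1.0 + np.exp(-logits))).reshape(shape)  # sigmoid

# Inputs mirror the article: an amino-acid sequence plus three stain channels
# (nucleus, microtubules, endoplasmic reticulum).
stains = np.ones((3, 4, 4))
fused = np.concatenate([embed_sequence("MKTAYIAK"), encode_images(stains)])
mask = decode_localization(fused)
print(mask.shape)
```

<p>In PUPS itself, the sequence branch is a protein language model and the image branch is an inpainting network; the toy encoders above only mirror the data flow, not the learned components.</p> <p>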
Then PUPS does the rest.</p> <h2>A deeper understanding</h2> <p>The researchers employed a few tricks during the training process to teach PUPS how to combine information from each model in such a way that it can make an educated guess on the protein’s location, even if it hasn’t seen that protein before.</p> <p>For instance, they assign the model a secondary task during training: to explicitly name the compartment of localization, like the cell nucleus. This is done alongside the primary inpainting task to help the model learn more effectively.</p> <p>A good analogy might be a teacher who asks their students to draw all the parts of a flower in addition to writing their names. This extra step was found to help the model improve its general understanding of the possible cell compartments.</p> <p>In addition, the fact that PUPS is trained on proteins and cell lines at the same time helps it develop a deeper understanding of where in a cell image proteins tend to localize.</p> <p>PUPS can even understand, on its own, how different parts of a protein’s sequence contribute separately to its overall localization.</p> <p>“Most other methods usually require you to have a stain of the protein first, so you’ve already seen it in your training data. Our approach is unique in that it can generalize across proteins and cell lines at the same time,” Zhang says.</p> <p>Because PUPS can generalize to unseen proteins, it can capture changes in localization driven by unique protein mutations that aren’t included in the Human Protein Atlas.</p> <p>The researchers verified that PUPS could predict the subcellular location of new proteins in unseen cell lines by conducting lab experiments and comparing the results. 
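</p> <p>For intuition, the auxiliary-task training described in this section can be sketched as a combined objective. This is an illustrative sketch only; the loss weighting and the list of compartments are invented for the example, not taken from the paper.</p>

```python
# Illustrative multi-task objective: the model is optimized for the main
# inpainting task plus an auxiliary compartment-classification task, which the
# article likens to drawing a flower's parts while also writing their names.
import numpy as np

def inpainting_loss(pred_img: np.ndarray, true_img: np.ndarray) -> float:
    """Main task: mean squared error on the reconstructed protein channel."""
    return float(np.mean((pred_img - true_img) ** 2))

def compartment_loss(logits: np.ndarray, true_class: int) -> float:
    """Auxiliary task: cross-entropy over named compartments (e.g. 'nucleus')."""
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    return float(-np.log(probs[true_class]))

def total_loss(pred_img, true_img, logits, true_class, aux_weight=0.5) -> float:
    # Weighted sum; aux_weight is a made-up hyperparameter for illustration.
    return inpainting_loss(pred_img, true_img) + aux_weight * compartment_loss(logits, true_class)

pred_img = np.zeros((4, 4))
true_img = np.ones((4, 4))
logits = np.array([2.0, 0.1, 0.1])  # three hypothetical compartments
loss = total_loss(pred_img, true_img, logits, true_class=0)
print(loss)
```

<p>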
In addition, when compared to a baseline AI method, PUPS exhibited lower average prediction error across the proteins tested.</p> <p>In the future, the researchers want to enhance PUPS so the model can understand protein-protein interactions and make localization predictions for multiple proteins within a cell. In the longer term, they want to enable PUPS to make predictions for living human tissue, rather than cultured cells.</p> <p><em>Adapted from an <a href="https://news.mit.edu/2025/researchers-predict-protein-location-within-human-cell-using-ai-0515" target="_blank">MIT News story</a></em>.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro paragraph--view-mode--default"> <div class="field field--name-field-paragraph field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Funding</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>This research is funded by the Eric and Wendy Schmidt Center at the Broad Institute, the National Institutes of Health, the National Science Foundation, the Burroughs Wellcome Fund, the Searle Scholars Foundation, the Harvard Stem Cell Institute, the Merkin Institute, the Office of Naval Research, and the Department of Energy.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Paper cited:</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>Zhang, X. et al.&nbsp;<a
href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank">Prediction of protein subcellular localization in single cells</a>. <em>Nature Methods</em>. Online May 13, 2025. DOI:&nbsp;10.1038/s41592-025-02696-1</p> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block-node-broad-tags block block-layout-builder block-field-blocknodelong-storyfield-broad-tags"> <div class="block-node-broad-tags__row"> <div class="block-node-broad-tags__title">Tags:</div> <div class="field field--name-field-broad-tags field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><a href="/broad-tags/eric-and-wendy-schmidt-center" hreflang="en">Eric and Wendy Schmidt Center</a></div> <div class="field__item"><a href="/broad-tags/artificial-intelligence" hreflang="en">Artificial intelligence</a></div> <div class="field__item"><a href="/broad-tags/caroline-uhler" hreflang="en">Caroline Uhler</a></div> <div class="field__item"><a href="/broad-tags/fei-chen" hreflang="en">Fei Chen</a></div> </div> </div> </div> </div> </div> Mon, 19 May 2025 18:21:35 +0000 Corie Lok 5558716 at AI tool predicts potential drug targets by analyzing cell images /news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"> <span>By Corie Lok</span> </span> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" class="datetime">May 19, 2025</time> </span> <div class="hero-section container"> <div class="hero-section__row row"> <div class="hero-section__content hero-section__content_left col-6"> <div 
class="hero-section__breadcrumbs"> <div class="block block-system block-system-breadcrumb-block"> <nav class="breadcrumb" role="navigation" aria-labelledby="system-breadcrumb"> <h2 id="system-breadcrumb" class="visually-hidden">Breadcrumb</h2> <ol> <li> <a href="/">Home</a> </li> <li> <a href="/news">News</a> </li> </ol> </nav> </div> </div> <div class="hero-section__title"> <div class="block block-layout-builder block-field-blocknodelong-storytitle"> <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> </div> </div> <div class="hero-section__description"> <div class="block block-layout-builder block-field-blocknodelong-storybody"> <div class="clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>Trained with a joint understanding of protein and cell behavior, the model could help with diagnosing disease and developing new drugs.</p> </div> </div> </div> <div class="hero-section__author"> <div class="block block-layout-builder block-extra-field-blocknodelong-storyextra-field-author-custom"> By Adam Zewe, MIT News </div> </div> <div class="hero-section__date"> <div class="block block-layout-builder block-field-blocknodelong-storycreated"> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" title="Monday, May 19, 2025 - 14:21" class="datetime">May 19, 2025</time> </span> </div> </div> </div> <div class="hero-section__right col-6"> <div class="hero-section__image"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-image"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"> <article class="media media--type-image media--view-mode-multiple-content-types-header"> <div class="field field--name-field-media-image field--type-image 
field--label-hidden field__item"> <picture> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=zV_YI6-5 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="736" height="520"> <source srcset="/files/styles/multiple_ct_header_laptop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=vojX13qG 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="641" height="451"> <source srcset="/files/styles/multiple_ct_header_tablet/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5rVZOJox 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="706" height="417"> <source srcset="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE 1x" media="all and (max-width: 539px)" type="image/jpeg" width="499" height="294"> <img loading="eager" width="499" height="294" src="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE" alt="Microscopy images of cells as green circles" title="Microscopy images of cells as green circles" typeof="foaf:Image"> </picture> </div> <div class="media-caption"> <div class="media-caption__credit"> Credit: Courtesy of the researchers; MIT News </div> <div class="media-caption__description"> Researchers performed validation experiments to test their new model. 
The top row shows the model’s prediction of unseen cell lines and proteins, while the bottom row shows the experimental validation. </div> </div> </article> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block block-better-social-sharing-buttons block-social-sharing-buttons-block"> <div style="display: none"><link rel="preload" href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg" as="image" type="image/svg+xml" crossorigin="anonymous"></div> <div class="social-sharing-buttons"> <a href="https://www.facebook.com/sharer/sharer.php?u=/taxonomy/term/2201/feed&amp;title=" target="_blank" title="Share to Facebook" aria-label="Share to Facebook" class="social-sharing-buttons-button share-facebook" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#facebook" /> </svg> </a> <a href="https://twitter.com/intent/tweet?text=+/taxonomy/term/2201/feed" target="_blank" title="Share to X" aria-label="Share to X" class="social-sharing-buttons-button share-x" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#x" /> </svg> </a> <a href="mailto:?subject=&amp;body=/taxonomy/term/2201/feed" title="Share to Email" aria-label="Share to Email" class="social-sharing-buttons-button share-email" target="_blank" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#email" /> </svg> </a> </div> </div> <div class="block block-layout-builder block-field-blocknodelong-storyfield-content-paragraphs"> <div class="field 
field--name-field-content-paragraphs field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--text-with-sidebar text-with-sidebar"> <div class="field field--name-field-sidebar field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--sidebar-articles sidebar-articles"> <div class="sidebar-articles__col"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Related News</p> </div> <div class="field field--name-field-content-reference field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><article about="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><article class="media media--type-image media--view-mode-multiple-ct-sidebar-link-with-image"> <div class="field field--name-field-media-image field--type-image field--label-hidden field__item"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images"><picture> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="87" height="104"> <source 
srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="87" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_tablet/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=tyMgiZ6H 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="285" height="186"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy 1x" media="all and (max-width: 539px)" type="image/jpeg" width="220" height="186"> <img loading="eager" width="220" height="186" src="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy" alt="Microscopy image showing chromatin in cells as pink and purple blobs." title="Microscopy image showing chromatin in cells as pink and purple blobs." typeof="foaf:Image"> </picture></a> </div> </article> </div> <div class="node__content"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node__title"><span class="field field--name-title field--type-string field--label-hidden">AI tool predicts potential drug targets by analyzing cell images</span> </a> </div> </article> </div> </div> </div> </div> </div> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. 
But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically only test for a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.</p> <p>A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But as enormous as it is, the Human Protein Atlas has only explored about 0.25 percent of all possible pairings of all proteins and cell lines within the database.</p> <p>Now, researchers from the ӳý, MIT, and Harvard University have developed a new computational approach that can efficiently explore the remaining uncharted space. Their method can predict the location of any protein in any human cell line, even when both protein and cell have never been tested before.</p> <p>Their technique goes one step further than many AI-based methods by localizing a protein at the single-cell level, rather than as an averaged estimate across all the cells of a specific type. This single-cell localization could pinpoint a protein’s location in a specific cancer cell after treatment, for instance.</p> <p>The researchers combined a protein language model with a special type of computer vision model to capture rich details about a protein and cell. In the end, the user receives an image of a cell with a highlighted portion indicating the model’s prediction of where the protein is located. 
Since a protein’s localization is indicative of its functional status, this technique could help researchers and clinicians more efficiently diagnose diseases or identify drug targets, while also enabling biologists to better understand how complex biological processes are related to protein localization.</p> <p>“You could do these protein-localization experiments on a computer without having to touch any lab bench, hopefully saving yourself months of effort. While you would still need to verify the prediction, this technique could act like an initial screening of what to test for experimentally,” says Yitong Tseo, a graduate student in MIT’s Computational and Systems Biology program and co-lead author of a paper on this research.</p> <p>Tseo is joined on the paper by co-lead author Xinyi Zhang, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and the Eric and Wendy Schmidt Center at the Broad Institute; Yunhao Bai of the Broad Institute; and senior authors <a href="/bios/fei-chen">Fei Chen</a>, an assistant professor at Harvard and a core institute member of the Broad Institute, and <a href="/bios/caroline-uhler">Caroline Uhler</a>, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS).
The research appears today in <a href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank"><em>Nature Methods</em></a>.</p> <h2>Collaborating models</h2> <p>Many existing protein prediction models can only make predictions based on the protein and cell data on which they were trained, or are unable to pinpoint a protein’s location within a single cell.</p> <p>To overcome these limitations, the researchers created a two-part method for prediction of unseen proteins’ subcellular location, called PUPS.</p> <p>The first part utilizes a protein sequence model to capture the localization-determining properties of a protein and its 3D structure, based on the chain of amino acids that forms it.</p> <p>The second part incorporates an image inpainting model, which is designed to fill in missing parts of an image. This computer vision model looks at three stained images of a cell to gather information about the state of that cell, such as its type, individual features, and whether it is under stress.</p> <p>PUPS joins the representations created by each model to predict where the protein is located within a single cell, using an image decoder to output a highlighted image that shows the predicted location.</p> <p>“Different cells within a cell line exhibit different characteristics, and our model is able to understand that nuance,” Tseo says.</p> <p>A user inputs the sequence of amino acids that form the protein and three cell stain images — one for the nucleus, one for the microtubules, and one for the endoplasmic reticulum.
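<p>To make the two-branch design concrete, here is a deliberately simplified sketch: a stand-in for the protein language model, a stand-in for the image branch, and a fusion step that emits a highlighted map. Every function name, dimension, and scoring rule below is an illustrative assumption, not the published PUPS implementation.</p>

```python
# Illustrative sketch of a two-branch protein-localization pipeline.
# The real system uses a learned protein language model and an image
# inpainting network; these toy stand-ins only show the data flow.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def embed_protein(sequence):
    """Stand-in for a protein language model: amino-acid composition vector."""
    total = max(len(sequence), 1)
    return [sequence.count(aa) / total for aa in AMINO_ACIDS]

def encode_cell(nucleus, microtubules, er):
    """Stand-in for the image branch: stack the three stain channels per pixel."""
    h, w = len(nucleus), len(nucleus[0])
    return [[(nucleus[i][j], microtubules[i][j], er[i][j])
             for j in range(w)] for i in range(h)]

def predict_localization(sequence, nucleus, microtubules, er, threshold=0.5):
    """Fuse both representations and emit a binary highlighted map."""
    emb = embed_protein(sequence)
    # Toy fusion: weight each stain channel by a summary of the embedding.
    w_nuc, w_mt, w_er = sum(emb[:7]), sum(emb[7:14]), sum(emb[14:])
    features = encode_cell(nucleus, microtubules, er)
    return [[1 if w_nuc * n + w_mt * m + w_er * e > threshold else 0
             for (n, m, e) in row] for row in features]
```

<p>Calling <code>predict_localization</code> on tiny 2×2 stain grids returns a binary mask marking where this toy model expects the protein; the real system instead uses learned embeddings and a trained image decoder to produce the highlighted image.</p>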
Then PUPS does the rest.</p> <h2>A deeper understanding</h2> <p>The researchers employed a few tricks during the training process to teach PUPS how to combine information from each model in such a way that it can make an educated guess on the protein’s location, even if it hasn’t seen that protein before.</p> <p>For instance, they assign the model a secondary task during training: to explicitly name the compartment of localization, like the cell nucleus. This is done alongside the primary inpainting task to help the model learn more effectively.</p> <p>A good analogy might be a teacher who asks their students to draw all the parts of a flower in addition to writing their names. This extra step was found to help the model improve its general understanding of the possible cell compartments.</p> <p>In addition, the fact that PUPS is trained on proteins and cell lines at the same time helps it develop a deeper understanding of where in a cell image proteins tend to localize.</p> <p>PUPS can even understand, on its own, how different parts of a protein’s sequence contribute separately to its overall localization.</p> <p>“Most other methods usually require you to have a stain of the protein first, so you’ve already seen it in your training data. Our approach is unique in that it can generalize across proteins and cell lines at the same time,” Zhang says.</p> <p>Because PUPS can generalize to unseen proteins, it can capture changes in localization driven by unique protein mutations that aren’t included in the Human Protein Atlas.</p> <p>The researchers verified that PUPS could predict the subcellular location of new proteins in unseen cell lines by conducting lab experiments and comparing the results. 
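<p>The combined objective described in this section, a primary inpainting loss plus an auxiliary compartment-naming loss, can be sketched as follows. The specific loss functions and the 0.1 weighting are assumptions for illustration, not the authors’ training recipe.</p>

```python
import math

# Hedged sketch of a multi-task training objective: reconstruct the
# protein stain (inpainting) while also naming the compartment
# (e.g., "nucleus") from a classification head.

def inpainting_loss(predicted, target):
    """Mean squared error over the predicted protein-stain pixels."""
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / n

def compartment_loss(logits, true_index):
    """Cross-entropy for the auxiliary compartment-classification head."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[true_index] / sum(exps))

def training_loss(predicted, target, logits, true_index, aux_weight=0.1):
    """Primary inpainting loss plus a weighted auxiliary classification loss."""
    return (inpainting_loss(predicted, target)
            + aux_weight * compartment_loss(logits, true_index))
```

<p>Because both losses flow back through a shared encoder, gradients from the compartment labels shape the same representation used for inpainting, which is one plausible way an auxiliary naming task can sharpen a model’s notion of cell compartments.</p>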
In addition, when compared to a baseline AI method, PUPS exhibited lower average prediction error across the proteins they tested.</p> <p>In the future, the researchers want to enhance PUPS so the model can understand protein-protein interactions and make localization predictions for multiple proteins within a cell. In the longer term, they want to enable PUPS to make predictions for living human tissue, rather than cultured cells.</p> <p><em>Adapted from an <a href="https://news.mit.edu/2025/researchers-predict-protein-location-within-human-cell-using-ai-0515" target="_blank">MIT News story</a></em>.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro paragraph--view-mode--default"> <div class="field field--name-field-paragraph field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Funding</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>This research is funded by the Eric and Wendy Schmidt Center at the Broad Institute, the National Institutes of Health, the National Science Foundation, the Burroughs Wellcome Fund, the Searle Scholars Foundation, the Harvard Stem Cell Institute, the Merkin Institute, the Office of Naval Research, and the Department of Energy.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Paper cited:</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>Zhang, X. et al.&nbsp;<a
href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank">Prediction of protein subcellular localization in single cells</a>. <em>Nature Methods</em>. Online May 13, 2025. DOI:&nbsp;10.1038/s41592-025-02696-1</p> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block-node-broad-tags block block-layout-builder block-field-blocknodelong-storyfield-broad-tags"> <div class="block-node-broad-tags__row"> <div class="block-node-broad-tags__title">Tags:</div> <div class="field field--name-field-broad-tags field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><a href="/broad-tags/eric-and-wendy-schmidt-center" hreflang="en">Eric and Wendy Schmidt Center</a></div> <div class="field__item"><a href="/broad-tags/artificial-intelligence" hreflang="en">Artificial intelligence</a></div> <div class="field__item"><a href="/broad-tags/caroline-uhler" hreflang="en">Caroline Uhler</a></div> <div class="field__item"><a href="/broad-tags/fei-chen" hreflang="en">Fei Chen</a></div> </div> </div> </div> </div> </div> Tue, 13 May 2025 16:53:23 +0000 Corie Lok 5558561 at #WhyIScience Q&A: A systems biologist develops computational tools to bring scale to cell experiments /news/whyiscience-qa-systems-biologist-develops-computational-tools-bring-scale-cell-experiments <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"> <span>By Corie Lok</span> </span> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" class="datetime">May 19, 2025</time> </span> <div class="hero-section container"> <div class="hero-section__row row"> <div class="hero-section__content 
hero-section__content_left col-6"> <div class="hero-section__breadcrumbs"> <div class="block block-system block-system-breadcrumb-block"> <nav class="breadcrumb" role="navigation" aria-labelledby="system-breadcrumb"> <h2 id="system-breadcrumb" class="visually-hidden">Breadcrumb</h2> <ol> <li> <a href="/">Home</a> </li> <li> <a href="/news">News</a> </li> </ol> </nav> </div> </div> <div class="hero-section__title"> <div class="block block-layout-builder block-field-blocknodelong-storytitle"> <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> </div> </div> <div class="hero-section__description"> <div class="block block-layout-builder block-field-blocknodelong-storybody"> <div class="clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>Trained with a joint understanding of protein and cell behavior, the model could help with diagnosing disease and developing new drugs.</p> </div> </div> </div> <div class="hero-section__author"> <div class="block block-layout-builder block-extra-field-blocknodelong-storyextra-field-author-custom"> By Adam Zewe, MIT News </div> </div> <div class="hero-section__date"> <div class="block block-layout-builder block-field-blocknodelong-storycreated"> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" title="Monday, May 19, 2025 - 14:21" class="datetime">May 19, 2025</time> </span> </div> </div> </div> <div class="hero-section__right col-6"> <div class="hero-section__image"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-image"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"> <article class="media media--type-image media--view-mode-multiple-content-types-header"> <div class="field 
field--name-field-media-image field--type-image field--label-hidden field__item"> <picture> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=zV_YI6-5 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="736" height="520"> <source srcset="/files/styles/multiple_ct_header_laptop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=vojX13qG 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="641" height="451"> <source srcset="/files/styles/multiple_ct_header_tablet/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5rVZOJox 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="706" height="417"> <source srcset="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE 1x" media="all and (max-width: 539px)" type="image/jpeg" width="499" height="294"> <img loading="eager" width="499" height="294" src="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE" alt="Microscopy images of cells as green circles" title="Microscopy images of cells as green circles" typeof="foaf:Image"> </picture> </div> <div class="media-caption"> <div class="media-caption__credit"> Credit: Courtesy of the researchers; MIT News </div> <div class="media-caption__description"> Researchers performed validation experiments to test their new model. 
The top row shows the model’s prediction of unseen cell lines and proteins, while the bottom row shows the experimental validation. </div> </div> </article> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block block-better-social-sharing-buttons block-social-sharing-buttons-block"> <div style="display: none"><link rel="preload" href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg" as="image" type="image/svg+xml" crossorigin="anonymous"></div> <div class="social-sharing-buttons"> <a href="https://www.facebook.com/sharer/sharer.php?u=/taxonomy/term/2201/feed&amp;title=" target="_blank" title="Share to Facebook" aria-label="Share to Facebook" class="social-sharing-buttons-button share-facebook" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#facebook" /> </svg> </a> <a href="https://twitter.com/intent/tweet?text=+/taxonomy/term/2201/feed" target="_blank" title="Share to X" aria-label="Share to X" class="social-sharing-buttons-button share-x" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#x" /> </svg> </a> <a href="mailto:?subject=&amp;body=/taxonomy/term/2201/feed" title="Share to Email" aria-label="Share to Email" class="social-sharing-buttons-button share-email" target="_blank" rel="noopener"> <svg aria-hidden="true" width="32px" height="32px" style="border-radius:100%;"> <use href="/modules/contrib/better_social_sharing_buttons/assets/dist/sprites/social-icons--no-color.svg#email" /> </svg> </a> </div> </div> <div class="block block-layout-builder block-field-blocknodelong-storyfield-content-paragraphs"> <div class="field 
field--name-field-content-paragraphs field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--text-with-sidebar text-with-sidebar"> <div class="field field--name-field-sidebar field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--sidebar-articles sidebar-articles"> <div class="sidebar-articles__col"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Related News</p> </div> <div class="field field--name-field-content-reference field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><article about="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><article class="media media--type-image media--view-mode-multiple-ct-sidebar-link-with-image"> <div class="field field--name-field-media-image field--type-image field--label-hidden field__item"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images"><picture> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="87" height="104"> <source 
srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="87" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_tablet/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=tyMgiZ6H 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="285" height="186"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy 1x" media="all and (max-width: 539px)" type="image/jpeg" width="220" height="186"> <img loading="eager" width="220" height="186" src="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy" alt="Microscopy image showing chromatin in cells as pink and purple blobs." title="Microscopy image showing chromatin in cells as pink and purple blobs." typeof="foaf:Image"> </picture></a> </div> </article> </div> <div class="node__content"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node__title"><span class="field field--name-title field--type-string field--label-hidden">AI tool predicts potential drug targets by analyzing cell images</span> </a> </div> </article> </div> </div> </div> </div> </div> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. 
But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically only test for a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.</p> <p>A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But as enormous as it is, the Human Protein Atlas has only explored about 0.25 percent of all possible pairings of all proteins and cell lines within the database.</p> <p>Now, researchers from the ӳý, MIT, and Harvard University have developed a new computational approach that can efficiently explore the remaining uncharted space. Their method can predict the location of any protein in any human cell line, even when both protein and cell have never been tested before.</p> <p>Their technique goes one step further than many AI-based methods by localizing a protein at the single-cell level, rather than as an averaged estimate across all the cells of a specific type. This single-cell localization could pinpoint a protein’s location in a specific cancer cell after treatment, for instance.</p> <p>The researchers combined a protein language model with a special type of computer vision model to capture rich details about a protein and cell. In the end, the user receives an image of a cell with a highlighted portion indicating the model’s prediction of where the protein is located. 
Since a protein’s localization is indicative of its functional status, this technique could help researchers and clinicians more efficiently diagnose diseases or identify drug targets, while also enabling biologists to better understand how complex biological processes are related to protein localization.</p> <p>“You could do these protein-localization experiments on a computer without having to touch any lab bench, hopefully saving yourself months of effort. While you would still need to verify the prediction, this technique could act like an initial screening of what to test for experimentally,” says Yitong Tseo, a graduate student in MIT’s Computational and Systems Biology program and co-lead author of a paper on this research.</p> <p>Tseo is joined on the paper by co-lead author Xinyi Zhang, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and the Eric and Wendy Schmidt Center at the ӳý; Yunhao Bai of the ӳý; and senior authors <a href="/bios/fei-chen">Fei Chen</a>, an assistant professor at Harvard and a core institute member of the ӳý, and <a href="/bios/caroline-uhler">Caroline Uhler</a>, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS). 
The research appears today in <a href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank"><em>Nature Methods</em></a>.</p> <h2>Collaborating models</h2> <p>Many existing protein prediction models can only make predictions based on the protein and cell data on which they were trained or are unable to pinpoint a protein’s location within a single cell.</p> <p>To overcome these limitations, the researchers created a two-part method for prediction of unseen proteins’ subcellular location, called PUPS.</p> <p>The first part utilizes a protein sequence model to capture the localization-determining properties of a protein and its 3D structure based on the chain of &nbsp;amino acids that forms it.</p> <p>The second part incorporates an image inpainting model, which is designed to fill in missing parts of an image. This computer vision model looks at three stained images of a cell to gather information about the state of that cell, such as its type, individual features, and whether it is under stress.</p> <p>PUPS joins the representations created by each model to predict where the protein is located within a single cell, using an image decoder to output a highlighted image that shows the predicted location.</p> <p>“Different cells within a cell line exhibit different characteristics, and our model is able to understand that nuance,” Tseo says.</p> <p>A user inputs the sequence of amino acids that form the protein and three cell stain images — one for the nucleus, one for the microtubules, and one for the endoplasmic reticulum. 
Then PUPS does the rest.</p> <h2>A deeper understanding</h2> <p>The researchers employed a few tricks during the training process to teach PUPS how to combine information from each model in such a way that it can make an educated guess on the protein’s location, even if it hasn’t seen that protein before.</p> <p>For instance, they assign the model a secondary task during training: to explicitly name the compartment of localization, like the cell nucleus. This is done alongside the primary inpainting task to help the model learn more effectively.</p> <p>A good analogy might be a teacher who asks their students to draw all the parts of a flower in addition to writing their names. This extra step was found to help the model improve its general understanding of the possible cell compartments.</p> <p>In addition, the fact that PUPS is trained on proteins and cell lines at the same time helps it develop a deeper understanding of where in a cell image proteins tend to localize.</p> <p>PUPS can even understand, on its own, how different parts of a protein’s sequence contribute separately to its overall localization.</p> <p>“Most other methods usually require you to have a stain of the protein first, so you’ve already seen it in your training data. Our approach is unique in that it can generalize across proteins and cell lines at the same time,” Zhang says.</p> <p>Because PUPS can generalize to unseen proteins, it can capture changes in localization driven by unique protein mutations that aren’t included in the Human Protein Atlas.</p> <p>The researchers verified that PUPS could predict the subcellular location of new proteins in unseen cell lines by conducting lab experiments and comparing the results. 
In addition, when compared to a baseline AI method, PUPS exhibited on average less prediction error across the proteins they tested.</p> <p>In the future, the researchers want to enhance PUPS so the model can understand protein-protein interactions and make localization predictions for multiple proteins within a cell. In the longer term, they want to enable PUPS to make predictions in terms of living human tissue, rather than cultured cells.</p> <p><em>Adapted from an <a href="https://news.mit.edu/2025/researchers-predict-protein-location-within-human-cell-using-ai-0515" target="_blank">MIT News story</a></em>.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro paragraph--view-mode--default"> <div class="field field--name-field-paragraph field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Funding</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>This research is funded by the Eric and Wendy Schmidt Center at the ӳý, the National Institutes of Health, the National Science Foundation, the Burroughs Welcome Fund, the Searle Scholars Foundation, the Harvard Stem Cell Institute, the Merkin Institute, the Office of Naval Research, and the Department of Energy.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Paper cited:</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>Zhang, X et al.&nbsp;<a 
href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank">Prediction of protein subcellular localization in single cells</a>. <em>Nature Methods</em>. Online May 13, 2025. DOI:&nbsp;10.1038/s41592-025-02696-1</p> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block-node-broad-tags block block-layout-builder block-field-blocknodelong-storyfield-broad-tags"> <div class="block-node-broad-tags__row"> <div class="block-node-broad-tags__title">Tags:</div> <div class="field field--name-field-broad-tags field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><a href="/broad-tags/eric-and-wendy-schmidt-center" hreflang="en">Eric and Wendy Schmidt Center</a></div> <div class="field__item"><a href="/broad-tags/artificial-intelligence" hreflang="en">Artificial intelligence</a></div> <div class="field__item"><a href="/broad-tags/caroline-uhler" hreflang="en">Caroline Uhler</a></div> <div class="field__item"><a href="/broad-tags/fei-chen" hreflang="en">Fei Chen</a></div> </div> </div> </div> </div> </div> Tue, 20 Aug 2024 14:00:00 +0000 adicorat 5557151 at Researchers identify cheap and effective biomarkers for DCIS tumor stage /news/researchers-identify-cheap-and-effective-biomarkers-dcis-tumor-stage <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> <span class="field field--name-uid field--type-entity-reference field--label-hidden"> <span>By Corie Lok</span> </span> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" class="datetime">May 19, 2025</time> </span> <div class="hero-section container"> <div class="hero-section__row row"> <div class="hero-section__content hero-section__content_left col-6"> <div 
class="hero-section__breadcrumbs"> <div class="block block-system block-system-breadcrumb-block"> <nav class="breadcrumb" role="navigation" aria-labelledby="system-breadcrumb"> <h2 id="system-breadcrumb" class="visually-hidden">Breadcrumb</h2> <ol> <li> <a href="/">Home</a> </li> <li> <a href="/news">News</a> </li> </ol> </nav> </div> </div> <div class="hero-section__title"> <div class="block block-layout-builder block-field-blocknodelong-storytitle"> <span class="field field--name-title field--type-string field--label-hidden"><h1>With AI, researchers predict the location of virtually any protein within a human cell</h1> </span> </div> </div> <div class="hero-section__description"> <div class="block block-layout-builder block-field-blocknodelong-storybody"> <div class="clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item"><p>Trained with a joint understanding of protein and cell behavior, the model could help with diagnosing disease and developing new drugs.</p> </div> </div> </div> <div class="hero-section__author"> <div class="block block-layout-builder block-extra-field-blocknodelong-storyextra-field-author-custom"> By Adam Zewe, MIT News </div> </div> <div class="hero-section__date"> <div class="block block-layout-builder block-field-blocknodelong-storycreated"> <span class="field field--name-created field--type-created field--label-hidden"><time datetime="2025-05-19T14:21:35-04:00" title="Monday, May 19, 2025 - 14:21" class="datetime">May 19, 2025</time> </span> </div> </div> </div> <div class="hero-section__right col-6"> <div class="hero-section__image"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-image"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"> <article class="media media--type-image media--view-mode-multiple-content-types-header"> <div class="field field--name-field-media-image field--type-image 
field--label-hidden field__item"> <picture> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop_xl/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=t76siu8t 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="754" height="503"> <source srcset="/files/styles/multiple_ct_header_desktop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=zV_YI6-5 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="736" height="520"> <source srcset="/files/styles/multiple_ct_header_laptop/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=vojX13qG 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="641" height="451"> <source srcset="/files/styles/multiple_ct_header_tablet/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5rVZOJox 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="706" height="417"> <source srcset="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE 1x" media="all and (max-width: 539px)" type="image/jpeg" width="499" height="294"> <img loading="eager" width="499" height="294" src="/files/styles/multiple_ct_header_phone/public/longstory/MIT-ProteinLocalization-01-press_0.jpg?itok=5nOJ8bXE" alt="Microscopy images of cells as green circles" title="Microscopy images of cells as green circles" typeof="foaf:Image"> </picture> </div> <div class="media-caption"> <div class="media-caption__credit"> Credit: Courtesy of the researchers; MIT News </div> <div class="media-caption__description"> Researchers performed validation experiments to test their new model. 
The top row shows the model’s prediction of unseen cell lines and proteins, while the bottom row shows the experimental validation. </div> </div> </article> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block block-layout-builder block-field-blocknodelong-storyfield-content-paragraphs"> <div class="field
field--name-field-content-paragraphs field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--text-with-sidebar text-with-sidebar"> <div class="field field--name-field-sidebar field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--sidebar-articles sidebar-articles"> <div class="sidebar-articles__col"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Related News</p> </div> <div class="field field--name-field-content-reference field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><article about="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node"> <div class="field field--name-field-image field--type-entity-reference field--label-hidden field__item"><article class="media media--type-image media--view-mode-multiple-ct-sidebar-link-with-image"> <div class="field field--name-field-media-image field--type-image field--label-hidden field__item"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images"><picture> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1921px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop_xl/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=RN5TTvIv 1x" media="all and (min-width: 1601px) and (max-width: 1920px)" type="image/jpeg" width="104" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 1340px) and (max-width: 1600px)" type="image/jpeg" width="87" height="104"> <source 
srcset="/files/styles/multiple_ct_sidebar_link_with_image_desktop/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=Yp4rRRfn 1x" media="all and (min-width: 800px) and (max-width: 1339px)" type="image/jpeg" width="87" height="104"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_tablet/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=tyMgiZ6H 1x" media="all and (min-width: 540px) and (max-width: 799px)" type="image/jpeg" width="285" height="186"> <source srcset="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy 1x" media="all and (max-width: 539px)" type="image/jpeg" width="220" height="186"> <img loading="eager" width="220" height="186" src="/files/styles/multiple_ct_sidebar_link_with_image_phone/public/longstory/Picture%201%20Chromatin.jpg?h=53991f46&amp;itok=c4kiAxgy" alt="Microscopy image showing chromatin in cells as pink and purple blobs." title="Microscopy image showing chromatin in cells as pink and purple blobs." typeof="foaf:Image"> </picture></a> </div> </article> </div> <div class="node__content"> <a href="/news/ai-tool-predicts-potential-drug-targets-analyzing-cell-images" class="node__title"><span class="field field--name-title field--type-string field--label-hidden">AI tool predicts potential drug targets by analyzing cell images</span> </a> </div> </article> </div> </div> </div> </div> </div> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. 
But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically test for only a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.</p> <p>A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But, enormous as it is, the Human Protein Atlas has explored only about 0.25 percent of all possible protein and cell-line pairings within the database.</p> <p>Now, researchers from the Broad Institute, MIT, and Harvard University have developed a new computational approach that can efficiently explore the remaining uncharted space. Their method can predict the location of any protein in any human cell line, even when both the protein and the cell line have never been tested before.</p> <p>Their technique goes one step further than many AI-based methods by localizing a protein at the single-cell level, rather than as an averaged estimate across all the cells of a given type. This single-cell localization could pinpoint a protein’s location in a specific cancer cell after treatment, for instance.</p> <p>The researchers combined a protein language model with a special type of computer vision model to capture rich details about a protein and a cell. In the end, the user receives an image of a cell with a highlighted portion indicating the model’s prediction of where the protein is located.
Since a protein’s localization is indicative of its functional status, this technique could help researchers and clinicians more efficiently diagnose diseases or identify drug targets, while also enabling biologists to better understand how complex biological processes are related to protein localization.</p> <p>“You could do these protein-localization experiments on a computer without having to touch any lab bench, hopefully saving yourself months of effort. While you would still need to verify the prediction, this technique could act like an initial screening of what to test for experimentally,” says Yitong Tseo, a graduate student in MIT’s Computational and Systems Biology program and co-lead author of a paper on this research.</p> <p>Tseo is joined on the paper by co-lead author Xinyi Zhang, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and the Eric and Wendy Schmidt Center at the Broad Institute; Yunhao Bai of the Broad Institute; and senior authors <a href="/bios/fei-chen">Fei Chen</a>, an assistant professor at Harvard and a core institute member of the Broad Institute, and <a href="/bios/caroline-uhler">Caroline Uhler</a>, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS).
The research appears today in <a href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank"><em>Nature Methods</em></a>.</p> <h2>Collaborating models</h2> <p>Many existing protein prediction models can only make predictions based on the protein and cell data on which they were trained, or they are unable to pinpoint a protein’s location within a single cell.</p> <p>To overcome these limitations, the researchers created a two-part method, called PUPS, for predicting the subcellular locations of unseen proteins.</p> <p>The first part uses a protein sequence model to capture the localization-determining properties of a protein and its 3D structure, based on the chain of amino acids that forms it.</p> <p>The second part incorporates an image inpainting model, which is designed to fill in missing parts of an image. This computer vision model looks at three stained images of a cell to gather information about the state of that cell, such as its type, individual features, and whether it is under stress.</p> <p>PUPS joins the representations created by each model to predict where the protein is located within a single cell, using an image decoder to output a highlighted image that shows the predicted location.</p> <p>“Different cells within a cell line exhibit different characteristics, and our model is able to understand that nuance,” Tseo says.</p> <p>A user inputs the sequence of amino acids that form the protein and three cell stain images — one for the nucleus, one for the microtubules, and one for the endoplasmic reticulum.
Then PUPS does the rest.</p> <h2>A deeper understanding</h2> <p>The researchers employed a few tricks during the training process to teach PUPS how to combine information from each model in such a way that it can make an educated guess on the protein’s location, even if it hasn’t seen that protein before.</p> <p>For instance, they assign the model a secondary task during training: to explicitly name the compartment of localization, like the cell nucleus. This is done alongside the primary inpainting task to help the model learn more effectively.</p> <p>A good analogy might be a teacher who asks their students to draw all the parts of a flower in addition to writing their names. This extra step was found to help the model improve its general understanding of the possible cell compartments.</p> <p>In addition, the fact that PUPS is trained on proteins and cell lines at the same time helps it develop a deeper understanding of where in a cell image proteins tend to localize.</p> <p>PUPS can even understand, on its own, how different parts of a protein’s sequence contribute separately to its overall localization.</p> <p>“Most other methods usually require you to have a stain of the protein first, so you’ve already seen it in your training data. Our approach is unique in that it can generalize across proteins and cell lines at the same time,” Zhang says.</p> <p>Because PUPS can generalize to unseen proteins, it can capture changes in localization driven by unique protein mutations that aren’t included in the Human Protein Atlas.</p> <p>The researchers verified that PUPS could predict the subcellular location of new proteins in unseen cell lines by conducting lab experiments and comparing the results. 
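</p> <p><em>For intuition only: the two-branch flow described above (a protein-sequence embedding fused with a cell-image embedding, then decoded into a per-pixel score map) can be sketched in a few lines of toy code. Every function, shape, and projection below is hypothetical; the real PUPS system uses a trained protein language model and a trained image-inpainting network, not random projections.</em></p>

```python
# Toy, illustrative sketch of a PUPS-style two-branch prediction.
# All names, shapes, and layers here are hypothetical stand-ins;
# the real model learns these components from data.
import numpy as np

rng = np.random.default_rng(0)

def embed_sequence(seq: str, dim: int = 32) -> np.ndarray:
    """Stand-in for a protein language model: map each amino acid
    to a random vector and average over the sequence."""
    table = rng.standard_normal((26, dim))
    idx = [ord(c) - ord("A") for c in seq.upper() if c.isalpha()]
    return table[idx].mean(axis=0)

def embed_cell(stains: np.ndarray, dim: int = 32) -> np.ndarray:
    """Stand-in for the image branch: stains is (3, H, W) with
    nucleus, microtubule, and ER channels, pooled to one vector."""
    pooled = stains.reshape(3, -1).mean(axis=1)   # (3,)
    proj = rng.standard_normal((3, dim))
    return pooled @ proj

def predict_localization(seq: str, stains: np.ndarray) -> np.ndarray:
    """Fuse the two embeddings and 'decode' a per-pixel score map
    (the highlighted image in the real system)."""
    z = embed_sequence(seq) + embed_cell(stains)  # joint representation
    h, w = stains.shape[1:]
    decoder = rng.standard_normal((z.size, h * w))
    logits = z @ decoder
    probs = 1.0 / (1.0 + np.exp(-logits))          # sigmoid per pixel
    return probs.reshape(h, w)

stains = rng.random((3, 8, 8))                     # three toy 8x8 stain images
heatmap = predict_localization("MKTAYIAKQR", stains)
print(heatmap.shape)                               # prints (8, 8)
```

<p><em>In the trained model the fused representation, not random projections, carries the predictive signal, which is why PUPS can generalize to protein and cell-line pairs it has never seen.</em></p> <p>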
In addition, when compared to a baseline AI method, PUPS exhibited lower average prediction error across the proteins they tested.</p> <p>In the future, the researchers want to enhance PUPS so the model can understand protein-protein interactions and make localization predictions for multiple proteins within a cell. In the longer term, they want to enable PUPS to make predictions for living human tissue, rather than cultured cells.</p> <p><em>Adapted from an <a href="https://news.mit.edu/2025/researchers-predict-protein-location-within-human-cell-using-ai-0515" target="_blank">MIT News story</a></em>.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro paragraph--view-mode--default"> <div class="field field--name-field-paragraph field--type-entity-reference-revisions field--label-hidden field__items"> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Funding</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>This research is funded by the Eric and Wendy Schmidt Center at the Broad Institute, the National Institutes of Health, the National Science Foundation, the Burroughs Wellcome Fund, the Searle Scholars Foundation, the Harvard Stem Cell Institute, the Merkin Institute, the Office of Naval Research, and the Department of Energy.</p> </div> </div> </div> <div class="field__item"> <div class="paragraph paragraph--type--table-outro-row paragraph--view-mode--default"> <div class="clearfix text-formatted field field--name-field-heading field--type-text field--label-hidden field__item"><p>Paper cited:</p> </div> <div class="clearfix text-formatted field field--name-field-text field--type-text-long field--label-hidden field__item"><p>Zhang, X. et al.&nbsp;<a
href="https://www.nature.com/articles/s41592-025-02696-1" target="_blank">Prediction of protein subcellular localization in single cells</a>. <em>Nature Methods</em>. Online May 13, 2025. DOI:&nbsp;10.1038/s41592-025-02696-1</p> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div class="content-section container"> <div class="content-section__main"> <div class="block-node-broad-tags block block-layout-builder block-field-blocknodelong-storyfield-broad-tags"> <div class="block-node-broad-tags__row"> <div class="block-node-broad-tags__title">Tags:</div> <div class="field field--name-field-broad-tags field--type-entity-reference field--label-hidden field__items"> <div class="field__item"><a href="/broad-tags/eric-and-wendy-schmidt-center" hreflang="en">Eric and Wendy Schmidt Center</a></div> <div class="field__item"><a href="/broad-tags/artificial-intelligence" hreflang="en">Artificial intelligence</a></div> <div class="field__item"><a href="/broad-tags/caroline-uhler" hreflang="en">Caroline Uhler</a></div> <div class="field__item"><a href="/broad-tags/fei-chen" hreflang="en">Fei Chen</a></div> </div> </div> </div> </div> </div> Thu, 25 Jul 2024 16:05:54 +0000 tulrich@broadinstitute.org 5557156 at