#WhyIScience Q&A: A systems biologist uses AI to understand how the genome controls cell fate

Bo Xia leads a lab that uses machine learning to learn about fundamental biological processes in human health and disease.

Credit: Allison Colorado, Broad Communications
A fellow in Broad's Gene Regulation Observatory, Bo Xia studies how the structure of the genome determines cell fate in both health and disease.

Growing up in the Sichuan province in China, Bo Xia was curious about the nature he saw around him: the creeks that ran through his hometown and its many animals and insects. In school, he loved imagining how molecules made up materials and whole organisms. He read about graphene in a textbook and was intrigued by how the material’s structure — a single layer of carbon atoms patterned like a honeycomb — granted it special electronic properties.

As an undergraduate, Xia studied biology and chemistry and began to think about biomolecules as chemical structures: the double helix of DNA, the intricate three-dimensional shapes of proteins, each with properties resulting directly from their forms. In the lab of Chengqi Yi at Peking University, he developed methods to analyze chemical modifications of DNA and knew he wanted to dedicate his career to research.

Xia went on to earn a PhD in stem cell biology from the New York University Grossman School of Medicine. In one of his PhD projects, Xia uncovered elements in the genome that explain tail loss over the course of evolution. In 2022, he joined the Broad Institute as a fellow in the Gene Regulation Observatory and a junior fellow of the Society of Fellows at Harvard.

Today, Xia runs a lab dedicated to untangling how the structure of the genome regulates gene expression and determines cell fates in health and disease. He and his team develop both AI tools to model the relationships between genomic components and experimental technologies to test the predictions of those models. Predictive AI technologies, Xia says, could help researchers pinpoint critical molecular mechanisms underlying fundamental biological processes in human health and disease.

In this #WhyIScience Q&A, we spoke with Xia about how technology can enable fundamental discoveries in biomedical research and how he copes with setbacks.

What does your lab study?

We explore how the genome encodes gene expression across time and space — and how that regulation changes across cell types and disease states. 

In the more than 20 years since the Human Genome Project, scientists have developed many experimental technologies for measuring genomic features and perturbing them to understand causal relationships. However, it remains very challenging to apply these experimental technologies to investigate cell fate dynamics in complex physiological conditions such as development, cancer progression, or neurodegenerative diseases. In many cases, studying these complex questions with experimental genomic approaches is impossible. In my group, we are developing predictive genomics tools, where we use machine learning models to predict all possible genomic features so that we have a holistic view of gene regulatory networks and how gene regulation is changing over time and space.

While these predictive genomics technologies are generally applicable in many fields, we’re particularly interested in their applications in regenerative biology. Across evolution, we’ve evolved many new traits, but we have also lost many things. For example, we have largely lost regenerative functions. When the skin is wounded, many species can heal through a regenerative process, but we can only grow a scar at the wounded site, a process called fibrotic healing. And so we want to know if we can use our technological innovations to understand the biology of regeneration loss and, based on that knowledge, restore the regenerative function in humans through genetic manipulation.

What kinds of technology is your lab developing?

To understand complex physiological functions, we want to develop predictive genomics tools that learn from measurable genomic information, such as DNA sequences and chromatin accessibility, to predict more complex information, such as chromatin-associated protein binding profiles or fine-scale DNA interactions. Those predictions help us understand what kind of regulatory mechanisms are driving cell fate changes. In my lab, we have recently developed two major tools. One can predict cell-type-specific chromatin interactions in unseen cell types, which enables virtual experiments and genetic screening. That has led to some exciting discoveries of how the genome is organized hierarchically.

We're also developing a tool that can predict global chromatin-associated protein binding profiles across the genome. The human genome has around 2,500 genes encoding chromatin-associated proteins such as transcription factors. We know little about the vast majority of them due to experimental limitations for measuring their binding profiles and activities on the genome. I hope our technologies can shift this paradigm by predicting such multifaceted genomic features to accelerate the pace of discoveries.

How do you cope with setbacks in research?

My graduate school experience didn’t go as smoothly as I hoped. I conceived of and worked hard on a project that ended up getting scooped. I had to pivot to a completely new direction — one that was much more computational. Fortunately, I had incredible mentors who supported not just my research, but also my growth during that challenging time. It turned out that my setback became a new opportunity — I picked up new skills in computational biology, explored new ideas, and ultimately made some very exciting discoveries during graduate school. 

As a new PI, I say to my group that research or a career as a scientist is a marathon, not a sprint. Of course, along the journey, we will find friends and mentors who can help us solve the problem, who can help us accelerate and run with us. I found that super helpful during graduate school.

But in general, it's possible that at some point you're running slightly slower, and that's fine. That's just a tiny bit of a career. The only thing that should not change as a scientist is the motivation to strive for new discoveries. 

What are you most proud of?

I was proud of our tail loss work — people have thought about this question for centuries. There was a chapter in Charles Darwin’s book The Descent of Man about it, and 150 years later, we have some answers. But I won't say that's my proudest work — I am always more proud of our next discovery. We are building technologies that can accelerate fundamental discoveries in gene regulation and cell fate determination. I look forward to many new discoveries coming in the next couple of years.