Groundbreaking study reveals hidden complexity in human genetics

Sometimes, in genetics, two wrongs do make a right. A research team recently showed that two harmful genetic variants, when occurring together in a gene, can restore function, confirming a decades-old hypothesis originally proposed by Nobel laureate Francis Crick. Their study, published in the Proceedings of the National Academy of Sciences (PNAS), not only validated this theory experimentally but also introduced a powerful artificial intelligence (AI)-driven approach to genetic interpretation, an effort led by George Mason University researchers.

The project began when Aimée Dudley, a geneticist at the Pacific Northwest Research Institute (PNRI), approached George Mason University Chief AI Officer Amarda Shehu after following her lab’s work on frontier AI models for predicting the functional impact of genetic variation. That conversation sparked a collaboration that married PNRI’s experimental expertise with George Mason’s computational innovation to discover some surprising ways variant combinations can shape human health.

The problem

Every year one in three Americans is diagnosed with a genetic disorder. Symptoms manifest in infancy for about 70% of individuals. Sadly, 35% die before the age of 5. Advancements in clinical genomics offer hope to better understand and possibly treat these disorders.

“High-throughput genomic screening has been a wonderful feat for humanity,” said Shehu, “but one of its side effects is that it has produced massive amounts of data, outpacing our ability to interpret what that data means for health and disease.”

Research in the Shehu lab has for years focused on building frontier AI models to advance genetic interpretation, but the available data link only isolated, single variants to measured functional activity. Each person’s genome contains billions of base pairs, and about five million variants distinguish any two individuals’ genomes, so examining one variant at a time rather than combinations of variants could reveal only so much.

“It looked like we had hit a wall,” Shehu said, “that is, until Dr. Dudley contacted my lab more than a year ago.”

On the left, a 3D landscape derived from variant–variant measurements shows distinct functional regions emerging from pairwise interactions. On the right, these regions map onto a multimeric protein structure, where variants in separate spatial zones can be sequestered into different active sites, allowing functional recovery. This visualization captures the structural logic underlying positive epistasis and illustrates how AI-enabled analysis links genetic variation to protein function, a key result that the Dudley and Shehu labs published in the Proceedings of the National Academy of Sciences. Image provided.

The proof

Dudley’s lab was convinced that the key was to account for how variants interact when combined in a gene, a phenomenon called epistasis. They measured the functional effects of variant combinations in the gene encoding a key enzyme, argininosuccinate lyase (ASL). A deficiency of ASL causes a urea cycle disorder, a rare but devastating condition.

The researchers tested thousands of combinations of variants that each resulted in no enzyme activity on their own, and found that a significant portion of these combinations showed high levels of enzyme activity. In other words, two defective variants, when combined, can recover function.

“This was the most puzzling thing that I could not believe when Dr. Dudley showed it to me. Sometimes in biology, zero plus zero equals 100%,” said Shehu.
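The "zero plus zero" pattern can be made concrete with a toy screen. The sketch below is only an illustration, not the study's method or data: the variant labels, activity values, and the two cutoffs (`inactive_max`, `active_min`) are all hypothetical. It flags pairs in which each variant is inactive alone but the combination is active.

```python
from itertools import combinations

# Hypothetical activity values, as a fraction of normal enzyme activity.
# These numbers are invented for illustration and are NOT from the study.
single_activity = {"A": 0.0, "B": 0.02, "C": 0.0, "D": 0.95}
pair_activity = {
    frozenset({"A", "B"}): 0.85,  # two inactive variants, active together
    frozenset({"A", "C"}): 0.0,   # two inactive variants, still inactive
    frozenset({"A", "D"}): 0.90,  # D is active alone, so this is not a rescue
    frozenset({"B", "C"}): 0.01,
}


def find_rescuing_pairs(single, pair, inactive_max=0.05, active_min=0.5):
    """Return pairs where both variants are inactive alone but active combined.

    The cutoffs are assumed values chosen only for this example.
    """
    rescued = []
    for a, b in combinations(sorted(single), 2):
        key = frozenset({a, b})
        if key not in pair:
            continue  # this combination was not measured
        both_inactive = single[a] <= inactive_max and single[b] <= inactive_max
        if both_inactive and pair[key] >= active_min:
            rescued.append((a, b))
    return rescued


print(find_rescuing_pairs(single_activity, pair_activity))
```

With these made-up values, only the A and B pair qualifies: each is essentially inactive alone, yet the combination is well above the activity cutoff.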

Shehu said that Crick, who shared the 1962 Nobel Prize in Physiology or Medicine with James Watson and Maurice Wilkins for their discoveries concerning the molecular structure of DNA, had hypothesized this could happen.

“Crick had a fancy word for it: variant sequestration,” said Shehu. “But until Dr. Dudley, no one had demonstrated it.”

The progress

Once Dudley’s lab confirmed the phenomenon experimentally, George Mason researchers turned to AI to see if it could predict similar effects across other genes. Using the ASL data from Dudley’s lab, George Mason computer science PhD student Anowarul Kabir developed a machine learning model to predict the effects of variant combinations. Then, he applied the model to a structurally similar but evolutionarily distinct protein, fumarase (FH). The algorithm achieved 99.6% accuracy in predicting regained function within ASL and 91% accuracy in FH.

“The really cool thing about this,” said Shehu, “is that the model learned both sequence and structural patterns and was able to transfer its knowledge to another gene.”

This breakthrough suggests that with experimental data from a few genes, AI can help scale variant effect prediction to a broad set of genes. The PNAS publication estimates that as many as 4% of the genes in the human genome could have the same types of effects seen for ASL and FH.

The paradigm shift

The finding marks a paradigm shift in clinical genomics for precision medicine. By considering variant combinations rather than isolated, single variants, clinicians could deliver faster, more accurate diagnoses and life-saving interventions for families facing rare diseases. They could also prioritize therapeutic treatments based on the specific epistatic profiles of patients or clinical trial participants.

“Clinical genomics has been stuck in a rut for decades. We’ve shown that you need to look at combinations of variants to fully understand their impact,” said Shehu. “Our AI model expands coverage from one gene to another, accelerating interpretation and bringing us closer to true precision medicine.”
