One billion-year-old rules of protein stability revealed

Huge experiment reveals rules governing protein stability, paving way to faster drug development and enzyme design

The rules required to make proteins stable are much simpler than previously thought, new research has found.

Published today (24 July) in the journal Science, a new study has taken an important step towards learning the rules of protein stability, which may help protein engineers design better medicines and greener catalysts.

Researchers from the Centre for Genomic Regulation (CRG) in Barcelona and the Wellcome Sanger Institute used machine-learning algorithms to investigate how evolution chooses the handful of amino acid combinations that result in proteins that fold and stay stable.

Proteins are life’s molecular workhorses, doing everything from turning sunlight into food to fighting viruses. They are built from 20 different types of amino acid molecules, so even a small protein just 60 amino acids long can, in theory, be constructed in 20⁶⁰ different ways, which is roughly a quinquavigintillion, or 10⁷⁸. That’s about as many atoms as there are in the entire universe.
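The size of this sequence space can be checked with a few lines of arithmetic. The sketch below assumes only the figures quoted above (20 amino acid types and a chain of 60 positions); the variable names are illustrative.

```python
# Count the possible sequences for a chain of 60 positions,
# each filled by one of 20 amino acid types.
AMINO_ACID_TYPES = 20
CHAIN_LENGTH = 60

# Python integers are arbitrary precision, so this is exact.
n_sequences = AMINO_ACID_TYPES ** CHAIN_LENGTH

# Order of magnitude: number of decimal digits minus one.
order_of_magnitude = len(str(n_sequences)) - 1

print(f"{n_sequences:.3e}")   # 1.153e+78
print(order_of_magnitude)     # 78
```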

Proteins have a core that keeps the structure from collapsing, while the surface does most of the work, such as binding with other molecules. For decades, biologists assumed that altering the core was like removing a load-bearing wall: one wrong move and the whole structure collapses. Because buried amino acids are packed tightly, it seemed logical that any alteration could force neighbouring amino acids to shift, resulting in unpredictable domino effects that ripple throughout the protein.

Under this classical picture of protein stability, most changes to the building blocks of a protein would threaten to knock the entire structure out of shape. Given the sheer number of possible combinations, the odds of evolution stumbling onto a safe route to create new proteins seem very small.

A new study turns this idea on its head. Researchers at the Centre for Genomic Regulation (CRG) in Barcelona and the Wellcome Sanger Institute studied a human protein domain – the functional bit of a protein – called FYN-SH3, making hundreds of thousands of variants and testing which ones still folded and worked.

The experiments revealed that SH3 retained its shape and function across thousands of different core and surface combinations. Only a few true, load-bearing amino acids existed in the protein’s core.

The team used the large amount of data generated by their experiments to test whether learning the rules from one protein could help explain the evolution of all related proteins that exist in nature. They fed the data into a machine-learning algorithm, which helped them create a tool that can predict whether an SH3 sequence will stay stable.

SH3 domains have been diversifying since early multicellular life, roughly one billion years ago. The researchers tested their model against 51,159 natural SH3 sequences found in public databases spanning the entire tree of life, including bacteria, plants, insects and humans. The algorithm correctly flagged almost all SH3 domains as stable, even when a test sequence shared less than a quarter of its amino acids with the human version.
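To illustrate what sharing "less than a quarter" of a sequence means, the sketch below computes percent identity between two pre-aligned sequences of equal length. The function name `percent_identity` and the toy sequences are hypothetical and not taken from the study, which may have measured similarity differently (for example, with gapped alignments).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Share of aligned positions carrying the same amino acid, in percent."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy 8-residue strings (hypothetical, not real SH3 sequences).
reference = "ACDEFGHI"
distant = "AKLMNPQR"  # matches the reference at 1 of 8 positions

print(percent_identity(reference, distant))  # 12.5
```

A sequence pair scoring below 25.0 here would fall under the "less than a quarter" threshold described above.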

The field of protein engineering currently relies on companies screening thousands of protein variants with minimal changes, inching forward a few changes at a time. This makes the design of new enzymes, drugs and vaccines slow and expensive.

The confirmation that protein stability follows simpler rules than previously thought could shorten the trial-and-error phase of protein design, saving significant time and effort in developing proteins with medical or industrial applications, such as greener catalysts or longer-lasting medicines.

For example, therapeutic enzymes often fail because their surfaces trigger immune flare-ups. Resurfacing these proteins is labour-intensive, requiring lots of trial and error to prevent the scaffold from collapsing and ruining a promising design. Now, protein engineers can propose bolder designs, including dozens of simultaneous changes, on computers and walk into the lab already knowing which variants are most likely to survive both folding and functional tests.

“Our data challenges the dogma of proteins being a delicate house of cards. The physical rules governing their stability are more like Lego than Jenga, where a change to one brick threatening to bring the entire structure down is a rare, and crucially, predictable phenomenon. Evolution didn’t have to sift through an entire universe of sequences. Instead, the biochemical laws of folding create a vast, forgiving landscape for natural selection.”

Dr Albert Escobedo, first author of the study and postdoctoral researcher at the Centre for Genomic Regulation

“The ability to predict and model protein evolution opens the door to designing biology at industrial speed, challenging the conservative pacing of protein engineering.”

Professor Ben Lehner, lead author of the study at the Centre for Genomic Regulation (CRG) and Head of Generative and Synthetic Genomics at the Wellcome Sanger Institute

More information

Publication

Escobedo, A. et al. (2025). ‘Genetics, energetics, and allostery in proteins with randomized cores and surfaces’. Science. DOI: science.org/doi/10.1126/science.adq3948

Funding

This study was funded by the European Research Council, the Spanish Ministry of Science and Innovation, the Bettencourt Schueller Foundation, the AXA Research Fund, the Agència de Gestió d’Ajuts Universitaris i de Recerca, the CERCA Program of the Generalitat de Catalunya, and Wellcome.