Most mutations which cause disease by swapping one amino acid out for another do so by making the protein less stable, according to a massive study of human protein variants published today in the journal Nature. Unstable proteins are more likely to misfold and degrade, causing them to stop working or accumulate in harmful amounts inside cells.

The work helps explain how missense mutations, minimal changes in the human genome that alter a single amino acid in a protein, cause disease at the molecular level. The researchers discovered that protein instability is one of the main drivers of heritable cataract formation, and also contributes to different types of neurological, developmental and muscle-wasting diseases.

Researchers at the Centre for Genomic Regulation (CRG) in Barcelona and BGI in Shenzhen studied 621 well-known disease-causing missense mutations. Three in five (61%) of these mutations caused a detectable decrease in protein stability.

The study looked at some disease-causing mutations more closely. For example, beta-gamma crystallins are a family of proteins essential for maintaining lens clarity in the human eye. The researchers found that 72% (13 out of 18) of mutations linked to cataract formation destabilise crystallin proteins, making the proteins more likely to clump together and form opaque regions in the lens.

The study also directly linked protein instability to the development of reducing body myopathy, a rare condition which causes muscle weakness and wasting, as well as Ankyloblepharon-ectodermal defects-clefting (AEC) Syndrome, a condition characterised by the development of a cleft palate and other developmental symptoms.

However, some disease-causing mutations did not destabilise proteins and shed light on alternative molecular mechanisms at play.

Rett Syndrome is a neurological disorder which causes severe cognitive and physical impairments. It is caused by mutations in the MECP2 gene, which produces a protein responsible for regulating gene expression in the brain. The study found that many mutations in MECP2 do not destabilise the protein but are instead found in regions which affect how MECP2 binds to DNA to regulate other genes. This loss of function could be disrupting brain development and function.

"We reveal, at unprecedented scale, how mutations cause disease at the molecular level" says Dr. Antoni Beltran, first author of the study and researcher at the Centre for Genomic Regulation (CRG) in Barcelona. "By distinguishing whether a mutation destabilises a protein or alters its function without affecting stability, we can tailor more precise treatment strategies. This could mean the difference between developing drugs that stabilise a protein versus those that inhibit a harmful activity. It's a significant step toward personalised medicine."

The study also found that the way mutations cause disease often relates to whether the disease is recessive or dominant. Dominant genetic disorders occur when a single copy of a mutated gene is enough to cause the disease, even if the other copy is normal, while recessive conditions occur when an individual inherits two copies of a mutated gene, one from each parent.

Mutations causing recessive disorders were more likely to destabilise proteins, while mutations causing dominant disorders often affected other aspects of protein function, such as interactions with DNA or other proteins, rather than just stability.

For example, the study found that a recessive mutation in the CRX protein, which is important for eye function, significantly destabilises the protein. This could cause heritable retinal dystrophies because the lack of a stable, functional protein impairs normal vision. In contrast, two different types of dominant mutation left the protein stable but caused it to function improperly, resulting in retinal disease even though the protein's structure is intact.

The discoveries were possible thanks to the creation of Human Domainome 1, an enormous library of protein variants. The catalogue includes more than half a million mutations across 522 human protein domains, the bits of a protein which determine its function. It is the largest catalogue of human protein domain variants to date.

Protein domains are specific regions which can fold into a stable structure and perform a job independently of the rest of the protein. Human Domainome 1 was created by systematically changing each amino acid in these domains to every other possible amino acid, creating a catalogue of all possible mutations.
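The idea of changing each amino acid to every other possible amino acid can be sketched in a few lines of Python. This is only an illustration of the enumeration, not the researchers' library design: the function name `single_substitutions` and the short toy sequence are hypothetical, and the sketch assumes the 20 standard amino acids.

```python
# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(domain_seq):
    """Return every single amino acid substitution as (label, mutant_sequence)."""
    variants = []
    for pos, wild_type in enumerate(domain_seq):
        for alt in AMINO_ACIDS:
            if alt == wild_type:
                continue  # skip the unchanged residue
            # Label in the usual form: original residue, 1-based position, new residue.
            label = f"{wild_type}{pos + 1}{alt}"
            mutant = domain_seq[:pos] + alt + domain_seq[pos + 1:]
            variants.append((label, mutant))
    return variants


# Hypothetical 8-residue sequence used only for illustration, not a real domain.
toy_domain = "MKTAYIAK"
variants = single_substitutions(toy_domain)
print(len(variants))  # 8 positions x 19 alternatives = 152
```

Each position contributes 19 variants, so the number of variants grows linearly with domain length.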

The impact of these mutations on protein stability was measured by introducing mutated protein domains into yeast cells. Each transformed yeast cell could produce only one type of mutated protein domain, and cultures were grown in test tubes under conditions which linked the stability of the protein to the growth of the yeast. If a mutated protein was stable, the yeast cell would grow well. If the protein was unstable, the yeast cell's growth would be poor.

Using a special technique, the researchers ensured only the yeast cells producing stable proteins could survive and multiply. By comparing the frequency of each mutation before and after the yeast growth, they determined which mutations led to stable proteins and which caused instability.
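The before-and-after comparison can be illustrated with a simple enrichment calculation. This is a minimal sketch under stated assumptions, not the study's actual analysis: the function name `enrichment_scores`, the variant labels, the read counts, the pseudocount, and the choice to normalise against an unmutated ("WT") sequence are all hypothetical.

```python
import math


def enrichment_scores(counts_before, counts_after, wild_type="WT", pseudocount=0.5):
    """Log change in each variant's frequency across selection, relative to wild type.

    Scores below zero mean the variant became rarer than the unmutated
    sequence, which in this kind of assay suggests a less stable protein.
    """
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())

    def log_ratio(variant):
        # The pseudocount (an assumption) avoids taking the log of zero
        # for variants that vanish after selection.
        f_before = (counts_before[variant] + pseudocount) / total_before
        f_after = (counts_after.get(variant, 0) + pseudocount) / total_after
        return math.log(f_after / f_before)

    wt_ratio = log_ratio(wild_type)
    return {variant: log_ratio(variant) - wt_ratio for variant in counts_before}


# Made-up sequencing counts for illustration only.
before = {"WT": 1000, "A5G": 1000, "L9P": 1000}
after = {"WT": 2000, "A5G": 1900, "L9P": 100}

scores = enrichment_scores(before, after)
for variant, score in sorted(scores.items(), key=lambda item: item[1]):
    print(variant, round(score, 2))
```

In this toy example the hypothetical variant `L9P` drops sharply in frequency and receives the most negative score, while `A5G` stays close to the wild type.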

Though Human Domainome 1 is around 4.5 times bigger than previous libraries of protein variants, it still only covers 2.5% of known human proteins. As researchers increase the size of the catalogue, the exact contribution of protein instability to disease-causing mutations will become increasingly clear.

In the meantime, researchers can use the information from the 522 protein domains to extrapolate to proteins that are similar. This is because mutations often have similar effects on proteins that are structurally or functionally related. By analysing a diverse set of protein domains, the researchers discovered patterns in how mutations affect protein stability that are consistent across related proteins.

"Essentially, this means that data from one protein domain can help predict how mutations will impact other proteins within the same family or with similar structures. The 'rules' from these 522 domains are enough to help us make educated predictions about many more proteins than there are in the catalogue," explains ICREA Research Professor Ben Lehner, corresponding author of the study with dual affiliation at the Centre for Genomic Regulation and the Wellcome Sanger Institute.

The study has limitations. The researchers examined protein domains in isolation rather than within full-length proteins. In living organisms, a domain interacts with other parts of its protein and with other molecules in the cell, so the study might not fully capture how mutations affect proteins in their natural habitat inside human cells. The researchers plan on overcoming this by studying mutations in longer protein domains, and eventually, full-length proteins.

"Ultimately, we want to map the effects of every possible mutation on every human protein. It's an ambitious endeavour, and one that can transform precision medicine," concludes Dr. Lehner.