DeepMind’s AI Predicts Structures for More Than 350,000 Proteins

2 months ago By Charlotte Di Salvo

In 2003, researchers sequenced approximately 92% of the human genome, a huge achievement, and very recently researchers completed the remainder. Now, the latest innovation in AI technology has predicted the structure of nearly the entire human proteome. Its impact on drug discovery could open the door to previously undruggable targets and a host of therapeutic breakthroughs.

The AI system behind this historic news is AlphaFold, developed by DeepMind, Google’s London-based sister company. The human genome codes for the human proteome, the full complement of proteins expressed by the organism. However, up until now, only a third of the 3D structures for the proteome had been determined.

Now, AlphaFold has predicted the structures of more than 350,000 proteins, with varying accuracy, and they are available through a public database. According to a Nature publication, the database is expected to grow to 130 million structures by the end of 2021, and the project aims to predict the structure of every protein in humans as well as in 20 model organisms.

The DeepMind programme proved its value by outperforming approximately 100 other teams in CASP (Critical Assessment of Structure Prediction), a protein-structure prediction challenge.

The protein folding problem

Proteins are among the key therapeutic targets, especially so in the last few years, with research investigating protein-protein interactions and targeted protein degradation. Unfortunately, progress in this research has been limited by something known as the ‘protein folding problem’.

A long-standing hypothesis holds that, in theory, a protein’s amino acid sequence should fully determine its structure. The challenge is that predicting that structure from the sequence has been practically impossible, because of the vast number of conformations a protein could fold into before settling into its final 3D structure.
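To get a feel for why trying every possible fold is hopeless, here is a back-of-the-envelope sketch in the spirit of Levinthal's well-known thought experiment. The three constants (states per residue, chain length, sampling rate) are illustrative assumptions and do not come from DeepMind or from this article.

```python
# Back-of-the-envelope estimate of the conformational search space.
# All three constants below are illustrative assumptions.
CONFORMATIONS_PER_RESIDUE = 3      # assumed number of states per residue
CHAIN_LENGTH = 100                 # assumed number of residues in the chain
SAMPLES_PER_SECOND = 10**13        # assumed conformations tried per second
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def count_conformations(per_residue: int, length: int) -> int:
    """Chain conformations if every residue varies independently."""
    return per_residue ** length


def years_to_enumerate(total: int, rate: int) -> float:
    """Years needed to visit `total` conformations at `rate` per second."""
    return total / rate / SECONDS_PER_YEAR


total = count_conformations(CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH)
years = years_to_enumerate(total, SAMPLES_PER_SECOND)
print(f"conformations: {total:.2e}")
print(f"years to try them all: {years:.2e}")
```

Even with these modest assumptions the count exceeds 10^47, which is why exhaustive search is not an option and prediction methods are needed.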

A folded protein can be thought of as a “spatial graph”, which is important for understanding the physical interactions within proteins as well as their evolutionary history. According to DeepMind, AlphaFold works by creating an “attention-based neural network system, trained end-to-end, that attempts to interpret the structure of this graph. It uses evolutionarily related sequences, multiple sequence alignment, and a representation of amino acid residue pairs to refine this graph.”
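One way to picture the “spatial graph” idea is as a residue contact graph: each residue is a node, and two residues are linked when they sit close together in 3D space. The sketch below only illustrates that picture and is not AlphaFold's internal representation. The coordinates, the 8 angstrom cutoff and the function name `contact_graph` are assumptions chosen for the example.

```python
import math
from itertools import combinations


def contact_graph(coords, cutoff=8.0):
    """Map each residue index to the set of residues within `cutoff` angstroms."""
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Hypothetical coordinates (in angstroms) for a five-residue, hairpin-like chain.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

graph = contact_graph(toy_coords)
for residue, neighbours in graph.items():
    print(residue, sorted(neighbours))
```

In this toy chain every pair of residues is in contact except residues 0 and 3, which sit roughly 8.5 angstroms apart.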

A neural network system is a form of deep learning, itself a specialised area of machine learning (ML), which is a branch of AI. Deep learning attempts to model abstractions from large-scale data using multi-layered deep neural networks (DNNs). Abstraction is a computer science term that refers to the process of filtering out irrelevant data in order to focus on the desired information. The structure of a neural network is loosely inspired by how the human brain processes information and analyses data.
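To make “multi-layered” concrete, the following is a minimal sketch of a forward pass through a two-layer network in plain Python. The weights, layer sizes and the ReLU activation are arbitrary assumptions for illustration. Nothing here reflects AlphaFold's actual architecture, which is attention-based and far larger.

```python
def relu(values):
    """Zero out negative activations, a simple way of discarding signal."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def forward(inputs, layers):
    """Apply each layer in turn, with ReLU between layers but not after the last."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0], [2.0]], [0.5]),
]

output = forward([4.0, 2.0], layers)
print(output)
```

With these toy weights the second hidden unit receives a negative value and is zeroed by the ReLU, so only the first hidden unit contributes to the output.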

Through this complex process, AlphaFold develops strong predictions of the underlying physical protein structure and can determine highly accurate structures in the space of a few days.

AlphaFold has been trained on publicly available data comprising approximately 170,000 protein structures, together with large databases containing the sequences of proteins with unknown structures.

How will this contribute to therapeutic advancements?

DeepMind is optimistic about the impact of AlphaFold on biological research, especially in terms of understanding disease pathology. Understanding the 3D structure of a protein is important because the structure plays a critical role in the protein’s function and in its contribution to physiological changes in the body.

Genes determine the amino acid sequence, which in turn determines the final structure of the protein. Hence, an error in the genetic code may result in a malformed protein, causing disease or death.

The link between protein malformation and disease is not a new concept. However, targeting proteins has been an uphill struggle because the final 3D structure of many proteins remains unknown. Knowing the genetic code of a protein is not enough; the structure is the key to drug targeting.

That is not to say that scientists have failed to determine protein structure: experimental techniques like X-ray crystallography have been used successfully over the last few decades to determine protein shape. Unfortunately, these methods can take years to perform, cost thousands of dollars per protein structure and depend heavily on trial and error.

In terms of contributing to therapeutic advancements, deep learning systems such as AlphaFold could help accelerate research by predicting a protein’s shape computationally from its genetic code alone, rather than through time-consuming, laborious lab work involving techniques like X-ray crystallography.

Even more interesting is that “some of the regions that AlphaFold predicted with low confidence match up with those that biologists suspect are disordered”, according to the Head of AI for science at DeepMind. This is a first step towards understanding how protein structure contributes to specific diseases, by revealing more about the complex structure in detail. In other words, AlphaFold could help to identify proteins that malfunction and provide more information about how they interact.
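AlphaFold attaches a per-residue confidence score to its predictions (known as pLDDT, on a scale of 0 to 100), and stretches with very low scores are the ones that may correspond to disordered regions. The sketch below shows one simple way a reader might flag such stretches. The threshold of 50, the minimum run length, the helper name `low_confidence_regions` and the scores themselves are assumptions for illustration, not values taken from this article.

```python
def low_confidence_regions(scores, threshold=50.0, min_length=3):
    """Return inclusive (start, end) index pairs for runs of scores below `threshold`."""
    regions = []
    start = None
    for index, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = index          # a new low-confidence run begins
        else:
            if start is not None and index - start >= min_length:
                regions.append((start, index - 1))
            start = None               # the run, if any, has ended
    # Close a run that extends to the end of the chain.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores) - 1))
    return regions


# Hypothetical per-residue confidence scores for a 12-residue protein.
toy_scores = [92, 88, 45, 40, 38, 42, 90, 95, 49, 91, 30, 25]
regions = low_confidence_regions(toy_scores)
print(regions)
```

Only the four-residue stretch at indices 2 to 5 is reported here; the shorter dips at index 8 and at the end of the chain fall below the assumed minimum run length.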

These insights could contribute to more precise targeting in drug development, should researchers identify the structure of malfunctioning proteins and target specific regions of interest.

Recent applications: COVID-19

In early 2020, AlphaFold predicted the previously unknown structures of several SARS-CoV-2 proteins, including ORF3a and ORF8. Recent work by experimentalists has confirmed the structure of both proteins, supporting the accuracy of AlphaFold’s protein predictions. This is a significant achievement, especially given the challenging nature of these proteins, which have very few related sequences for the AI system to utilise.

Charlotte Di Salvo, Lead Medical Writer
PharmaFeatures

