Bioinformatics,AI & ML,Biology

Implementation of Machine Learning in Drug Development

9 months ago By Charlotte Di Salvo
Significant recent investment in computational technology has produced a number of innovations in drug discovery, most notably machine learning (ML). By 2022, AI technology is expected to contribute $2.199 billion to pharma’s revenue, and its popularity is growing across the pharmaceutical industry. Target identification, target validation and drug discovery are some of the areas of drug development in which machine learning has shown its potential.


Introduction 

The pipeline from drug discovery through development to approval is complex and lengthy. However, ML is beginning to deliver innovations at all stages of drug development. Target validation, identification of prognostic biomarkers and analysis of digital pathology data in clinical trials are some of the areas in which ML can be implemented.

There are two main approaches to ML: supervised and unsupervised learning. Unsupervised learning refers to algorithms that learn patterns from unlabelled data. Supervised learning, on the other hand, refers to algorithms that learn from labelled training data, that is, a set of training examples each paired with a known output.

Supervised learning methods have been used to predict future values of data categories or continuous variables. Unsupervised learning is primarily used for exploratory purposes in the development of models to enable data clustering in a format not specified by the user. This particular technique helps to identify hidden patterns within input data, whereas supervised learning methods predict future outputs based on a trained model of known input and output data. According to a 2020 review, supervised learning techniques such as Support Vector Machines, deep learning and regression methods have already been applied to biomedical challenges in the last decade.
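The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration only: the two-dimensional points, the "active"/"inactive" labels and all function names are hypothetical and are not taken from the review cited above. The supervised part is a simple nearest-centroid classifier trained on labelled examples; the unsupervised part is a basic k-means clustering that sees the same points without any labels.

```python
from math import dist
from statistics import mean

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(coords) for coords in zip(*points))

# Supervised: learn one centroid per label from labelled training examples.
def fit_nearest_centroid(X, y):
    return {label: centroid([x for x, lab in zip(X, y) if lab == label])
            for label in set(y)}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: dist(model[label], x))

# Unsupervised: k-means groups the points without ever seeing a label.
def kmeans(X, init_centers, iters=10):
    centers = list(init_centers)
    k = len(centers)

    def nearest(x):
        return min(range(k), key=lambda i: dist(centers[i], x))

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in X:
            clusters[nearest(x)].append(x)
        # Keep the old centre if a cluster happens to be empty.
        centers = [centroid(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, [nearest(x) for x in X]

# Hypothetical toy data: two well-separated groups of points.
X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
y = ["inactive"] * 3 + ["active"] * 3

model = fit_nearest_centroid(X, y)          # uses the labels
new_point_label = predict(model, (4.9, 5.0))

centers, assignments = kmeans(X, init_centers=[X[0], X[-1]])  # ignores the labels
```

The classifier can name the class of a new point because it was given labelled examples. The clustering only reports which points belong together, and it is left to the user to interpret what each cluster means.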

Applications in drug development 

Target identification and validation

The identification and validation of a therapeutic target requires the analysis of vast datasets. Genetic screening and high-content imaging are examples of techniques that produce large datasets that can be exploited for early target identification and validation. However, analysis of such data requires appropriate mathematical methods to construct valid statistical models, and this is where ML can be exploited.

As early as 2010, ML was applied to target validation in a study that used a “decision tree-based meta classifier”. In this study, the ML platform was proposed as a computational approach to predicting morbid and druggable genes. Morbid genes are genes in which mutations are associated with hereditary human disease. The tree-based meta-classifier was used to predict targets on a genome-wide scale. It managed to correctly recover “65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%”. The ability of ML to predict specific genes on a genome-wide scale is a huge step forward in optimising target identification. Computational prediction of therapeutic targets saves pharma companies time and resources, and has the potential to identify more reliable targets.
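The two figures quoted for each gene class measure different things. The share of known genes that the classifier recovered is usually called recall, while precision is the share of the classifier's predictions that were actually correct. A minimal sketch of the arithmetic follows, using hypothetical gene identifiers and a hypothetical function name rather than data from the 2010 study:

```python
def precision_recall(predicted, known):
    """Compare a set of predicted genes against a set of known genes.

    precision = correct predictions / all predictions
    recall    = correct predictions / all known genes
    """
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall

# Hypothetical example: 4 known morbid genes, 5 predictions, 3 of them correct.
known_morbid = {"G1", "G2", "G3", "G4"}
predicted_morbid = {"G1", "G2", "G3", "G8", "G9"}

precision, recall = precision_recall(predicted_morbid, known_morbid)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case, three of the four known genes are recovered (recall of 0.75) and three of the five predictions are correct (precision of 0.60). Reporting both matters because a classifier that flags every gene would recover all known genes while making mostly wrong predictions.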

Drug discovery

The Generative Adversarial Network (GAN) is an example of a recent innovation in deep learning for drug discovery. Deep learning is a specialised area of ML that attempts to model abstraction from large-scale data using multi-layered deep neural networks (DNNs). Abstraction is a computer science term that refers to the process of filtering out irrelevant data in order to focus on the desired information. 

As an unsupervised ML method, GAN has been shown to address some of the challenges of supervised ML, primarily the need for large labelled training datasets, which are often expensive and time-consuming to produce. In a 2017 study, GAN-based frameworks were used with chemical and biological datasets to develop and identify novel compounds for anticancer therapy.

This study emphasised how the productivity of pharmaceutical research is limited by inefficient early lead discovery processes. It also highlighted how in silico-based approaches like deep learning models can generate reliable data at a reduced cost and time scale relative to current screening methods. 

Computational pathology 

In research, a pathologist interprets the presentation of tissue or cells on a glass slide. The spatial relationships between cells, their size and their general structure can indicate changes caused by drug interaction. Computational pathology is becoming an important part of drug development. It has been suggested that this method could allow pharmaceutical companies to discover novel biomarkers and generate them in a more precise, reproducible and high-throughput manner.

ML allows for high-throughput generation of features for thousands of cells, which is an impossible task for pathologists. Immuno-oncology is a particular therapeutic area which has benefitted from using computational pathology. A 2017 study found that computational analysis of tumour-adjacent benign tissue in prostate cancer revealed information that is typically ignored by pathologists but is associated with progression-free survival.

Ongoing challenges in adopting AI/ML

One of the main concerns with ML predictions is overfitting or underfitting. Overfitting has been described as a model which consists of “lower quality information/technique but generates higher quality performance. In contrast, underfitting models fail to recognize the data sets’ underlying trend and generalize the new data inputted”. In other words, an overfitted model fits the noise in its training data and so performs worse on new data, while an underfitted model is too simple to capture the underlying trend. Both errors produce inaccurate results, which compromises the reliability of predicted drug targets. Increasing the sample size and using cross-validation are common ways to address these problems. Cross-validation is a technique that repeatedly holds out part of the data as a test set in order to estimate how accurately a model performs on data it was not trained on.
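How cross-validation exposes overfitting can be sketched with a deliberately simple example. The data, the 1-nearest-neighbour model and the function names below are assumptions chosen for illustration, not a method taken from the sources discussed here. A 1-nearest-neighbour model memorises its training data, so it scores perfectly on data it has already seen; holding out each fold in turn gives a more honest estimate.

```python
def k_fold_indices(n, k):
    """Split range(n) into k disjoint folds (fold i holds i, i+k, i+2k, ...)."""
    return [list(range(i, n, k)) for i in range(k)]

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def accuracy(train, test):
    """Fraction of test points whose label the model predicts correctly."""
    return sum(predict_1nn(train, x) == y for x, y in test) / len(test)

def cross_validate(data, k):
    """Average accuracy over k folds, each held out once as the test set."""
    scores = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        test = [data[i] for i in fold]
        train = [d for i, d in enumerate(data) if i not in held_out]
        scores.append(accuracy(train, test))
    return sum(scores) / k

# Hypothetical 1-D data: the true rule is "label 1 when x >= 5",
# with two deliberately noisy labels at x = 2 and x = 7.
data = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0),
        (5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]

train_acc = accuracy(data, data)        # evaluated on the data it memorised
cv_acc = cross_validate(data, k=5)      # evaluated on held-out folds
print(f"training accuracy: {train_acc:.2f}, cross-validated: {cv_acc:.2f}")
```

The gap between the two numbers is the warning sign: the training accuracy is perfect only because the model has memorised the noisy labels, and the cross-validated score is lower because those labels do not generalise to held-out points.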

Another challenge for the pharmaceutical industry is the lack of personnel to operate AI/ML-based platforms. Furthermore, there is often skepticism about the quality of data generated by AI. Small organisations are often limited in their budget so cannot afford to invest in AI/ML technology. 

Despite the improvements needed to refine ML applications, the potential they bring to drug development is significant. In addition to reducing human error, automated ML software can analyse data from many sources more accurately and in less time. The advancement of AI and ML will continue to reduce the challenges faced by the pharmaceutical industry.


Charlotte Di Salvo, Junior Medical Writer
Proventa International
