
Applications of AI in drug development

The FDA has been actively encouraging the use of real-world data (RWD) in drug development. Meanwhile, artificial intelligence (AI) methods are being increasingly harnessed across various stages of the drug development process. We summarise a recent paper, published in Drug Discovery Today, that provided an overview of the use of both AI and RWD in the drug development process.

Drug development

Drug development, broadly speaking, is the process of bringing new drugs into clinical practice. It spans every stage from basic research to find a molecular target through to large-scale clinical studies. During this process, chemical entities are identified and thoroughly tested. However, the entire process is lengthy and costly, so strategies that can facilitate and accelerate it are of particular interest.

Recently, the FDA has been actively promoting the use of RWD in this process. The information generated from RWD sources, such as electronic health records (EHRs), can provide important real-world evidence to inform therapeutic development, outcomes research, patient care, safety surveillance and comparative effectiveness studies. Most importantly, the use of RWD allows experts to answer questions more efficiently, saving time and money. Simultaneously, researchers have been increasingly applying AI, particularly machine learning and deep learning, to real-world settings. AI is now widely used throughout the drug development pipeline, including to identify novel targets and develop new biomarkers.

Current trends

In this study, researchers conducted a rapid review summarising published articles related to the intersection of AI, RWD and drug development over the past 20 years. Their aim was to specifically identify current trends in using AI and RWD in drug development studies, highlighting current challenges and opportunities.

The team identified the following categories as the most popular AI and RWD applications in drug development:

  • Adverse event detection: This involves mining clinical notes and structured EHR data using natural language processing (NLP).
  • Recruitment optimisation: This involves recruiting patients electronically through EHRs and identifying eligible populations using NLP.
  • Clinical drug repurposing: This involves using EHR data to identify patient cohorts and medication information, which researchers can then use to assess whether a drug can be repurposed.


One of the major challenges of using AI and RWD in drug development is the data quality of many RWD sources. Because information is recorded differently across sources, it can be difficult to extract the same information consistently. Other data issues, such as missing data and selection bias, also present challenges.

In addition, the researchers found that most studies focussed on prediction or classification tasks and often emphasised model performance over learning causal effects.

Another challenge is the transportability and interpretability of these studies. External validation using independent data sources is important. However, such validation is often difficult to carry out. Reasons for this include the ethical and legal issues of sharing individual-level clinical data and the lack of standardisation across different data sources.

Future directions

Clinical trial simulation (CTS) studies use computerised simulation methods on virtual populations to test different trial designs before conducting actual clinical trials. An emerging trend is the incorporation of RWD into CTS so that the virtual populations reflect real patients more closely.
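To make the idea of a clinical trial simulation concrete, here is a minimal Python sketch that compares candidate sample sizes for a two-arm trial with a binary outcome. It is an illustration only, not the method used in any of the reviewed studies. The function names and the response rates (30% in the control arm, 45% in the treatment arm) are invented for the example; in an RWD-informed simulation, such rates might instead be estimated from an EHR-derived cohort.

```python
import math
import random


def simulate_trial(n_per_arm, p_control, p_treatment, rng):
    """Simulate one two-arm trial on a virtual population.

    Returns True if the treatment arm's response rate is significantly
    higher than the control arm's (one-sided two-proportion z-test,
    z > 1.96, i.e. roughly a 2.5% significance level).
    """
    # Draw a binary response for each virtual patient in each arm.
    control = sum(rng.random() < p_control for _ in range(n_per_arm))
    treated = sum(rng.random() < p_treatment for _ in range(n_per_arm))

    # Pooled standard error for the difference in proportions.
    pooled = (control + treated) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False  # No variation in outcomes, so no detectable difference.

    z = (treated - control) / n_per_arm / se
    return z > 1.96


def estimate_power(n_per_arm, p_control, p_treatment, n_sims=2000, seed=0):
    """Estimate the share of simulated trials that detect the effect."""
    rng = random.Random(seed)  # Fixed seed so that runs are repeatable.
    hits = sum(
        simulate_trial(n_per_arm, p_control, p_treatment, rng)
        for _ in range(n_sims)
    )
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical response rates, chosen purely for illustration.
    for n in (50, 100, 200):
        print(f"{n} patients per arm: power ~ {estimate_power(n, 0.30, 0.45):.2f}")
```

Running the script prints an estimated power for each candidate sample size, which is the kind of comparison a CTS study makes before committing to a design. Real simulations typically model far more, such as dropout, covariates and dosing.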

Another emerging trend is the linkage of EHRs with other data sources, such as biobanking data, to help study drug-phenotype and drug-gene interactions. For example, researchers from the Vanderbilt Electronic Systems for Pharmacogenomic Assessment (VESPA) Project demonstrated that EHR-based biobanks could be cost-effective tools for establishing disease and drug associations.

With ongoing advancements in AI, emerging models are becoming increasingly capable of handling longitudinal and heterogeneous RWD, which can help address current challenges. In addition, the rise of causal AI will provide new research opportunities in drug development that can benefit from both AI and RWD.
