Mobile Menu

The role of EMR and ML in drug development

Here, we summarise a chapter in Artificial Intelligence in Oncology Drug Discovery and Development, which explored the role of electronic medical records (EMRs) and machine learning (ML) approaches in drug development.

EMR and ML

The applications of artificial intelligence, particularly ML, have skyrocketed in the past decade. Machine-learning researchers have gained access to large quantities of high-quality medical data – the most important being EMRs. EMRs contain valuable real-world data (RWD) on patient, clinical and genomic data as well as clinical notes. Today, providers produce EMRs with the aim of providing a centralised source of medical data in order to increase care coordination. EMRs are gaining a lot of traction in the biomedical space as they offer the opportunity to extract important conclusions. Researchers have been particularly focussed on the use of ML methods for extracting data from EMRs.

Nonetheless, there are several challenges associated with EMRs. These include: lack of standardised formatting, human errors in EMRs and underrepresentation of minorities due to lack of EMRs within particular communities.

Applications of EMRs

ML techniques applied to EMRs are influencing several key areas of biomedical research and drug discovery. These include:

Phenotype-genotype associations

Diagnoses are recorded extensively in EMRs, making them a rich source of phenotype-related data. These records, although not currently common practice, can also be linked to genomic data. Various standardised codes can be used in EMRs that can successfully identify phenotypes. There is an ongoing trend towards using deep learning frameworks to identity phenotype-genotype associations. Nonetheless, EMR application through ML models still faces challenges, e.g. missing data, which require further work to combat their shortcomings.

Clinical trials

In recent years, clinical research informatics has emerged as a new field of biomedical translational research. It involves the use of informatics methods to collect, store, process and analyse real-world clinical data. EMRs can help reduce the cost and time of clinical trials by automating patient recruitment, extending randomised control trials and enhancing retrospective cohort studies. The use of ML approaches in this case largely depends on the quality of the training data.


New drugs can enter the market with unknown adverse drug events (ADEs) that are not seen in clinical trials. Therefore, pharmaceutical companies must undertake pharmacovigilance to continually track the effects of their drugs after deployment. This means that clinical data on post-market drug effects is valuable to pharma companies. RWD on pharmaceutical products and their effects are richly logged within patient EMRs. ML methods have been used to automate ADE detection and offer tools for novel, automated pharmacovigilance analytics.

Drug repurposing

EMR data can be mined for drug repurposing indications. This method requires big data and an increasing amount of digitised medical records (such as EHRs) to me made available. This topic is becoming increasingly popular as ML tools have been developed and cloud computing services have become more accessible.


Over the past decade, EMRs have become a vital data source in advancing healthcare. Mining of EMR data has several benefits, including improved trial recruitment and drug surveillance. While there are no immediate plans to fully automate drug research pipelines or independently diagnose and provide subsequent healthcare procedures. There is, however, a lot of evidence demonstrating the vital role that EMRs and ML will play in drug research. 

Image credit: By vectorjuice –

Share this article