The Promise of AI for Drug-Induced Liver Injury Prediction

Drug-induced liver injury is a common reason for the withdrawal of a drug from the market, so early assessment of this risk is an essential part of drug development. However, prediction before clinical trials begin is challenging because of the complex factors that give rise to liver damage. The study summarised here assessed existing AI approaches for predicting drug-induced liver injury and discussed the issues that arise from the limited availability of data.

Drug-induced liver injury

Drug-induced liver injury is a common cause of acute liver failure and one of the main reasons for clinical trials to fail and for the withdrawal of drugs from the market. Researchers define hepatotoxicity as injury or damage to the liver as a result of an adverse drug reaction. Prediction of hepatotoxicity is possible in some cases. For example, the drug acetaminophen (more commonly known as paracetamol) causes damage to the liver when the recommended dose is exceeded. However, researchers consider other drug-induced liver injury events to be idiosyncratic, since they are rare and difficult to predict. Researchers have turned to the investigation of biomarkers and the development of AI methods to improve the understanding of drug-induced liver injury mechanisms and to enable prediction of hepatotoxicity early in the drug development process.

The researchers behind this study reviewed state-of-the-art AI techniques for drug-induced liver injury prediction, focusing on machine-learning-based approaches. They assessed three categories of method:

  1. Rules and knowledge-based machine learning approaches
  2. Shallow machine learning methods
  3. Deep learning methods

Lack of drug-induced liver injury training data for predictive models

The researchers hypothesised that machine learning approaches for drug-induced liver injury are limited by the availability of annotated databases. The DILIst dataset is one of the largest and most comprehensive drug-induced liver injury annotation datasets, yet it comprises only 1,279 drugs. This makes it much smaller than other benchmarking datasets in drug discovery, which is a critical limitation from a machine learning perspective, especially for deep learning methods, whose success depends on access to large amounts of data.

Analysis of AI models for predicting drug-induced liver injury

Several of the drug-induced liver injury (DILI) prediction models reviewed in this study rely exclusively on the chemical structure of compounds. Because structural information is readily available, these approaches are flexible, but they have drawbacks. Some of the adverse reactions that are considered idiosyncratic may be undetectable from chemical structure alone, yet may be predictable if genomic data is also considered. The reviewed studies and models that focused on the exploitation of genomic data are therefore particularly important for understanding the possible dependence of idiosyncratic DILI on genetic host factors.

Deep learning models for drug-induced liver injury prediction have been proposed that are based on the chemical structure of compounds. However, these models do not show an outstanding improvement in predictive performance, nor do they offer the possibility of replacing in vitro or in vivo tests. Most of the existing deep learning methods process pre-calculated molecular descriptors. Only one of the evaluated studies took an end-to-end approach, building on an existing UG-RNN method that directly processes the chemical structure of compounds and implicitly derives suitable molecular representations. New advances in graph convolutional neural networks, which are also end-to-end, should therefore be investigated for DILI prediction as well.
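
To make the idea of an end-to-end graph approach more concrete, the following is a minimal sketch of a single graph convolution step on a toy molecule, written in plain Python. It is not the UG-RNN method or any model from the reviewed studies. The atom vocabulary, the fixed weight matrix, and all function names are assumptions made purely for illustration; a real model would learn its weights from labelled compounds and would use a far richer atom description.

```python
# Toy molecular graph for the heavy atoms of ethanol (C-C-O); hydrogens omitted.
# Atom features are one-hot over a hypothetical two-symbol vocabulary [C, O].
ATOM_FEATURES = [
    [1.0, 0.0],  # atom 0: carbon
    [1.0, 0.0],  # atom 1: carbon
    [0.0, 1.0],  # atom 2: oxygen
]
BONDS = [(0, 1), (1, 2)]

# Hypothetical fixed weights (2 inputs x 2 outputs); a trained model would learn these.
WEIGHTS = [
    [1.0, -1.0],
    [0.5, 2.0],
]


def build_neighbours(num_atoms, bonds):
    """Return an adjacency list that includes a self-loop for every atom."""
    neighbours = [[i] for i in range(num_atoms)]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return neighbours


def graph_conv_layer(features, neighbours, weights):
    """One mean-aggregation graph convolution followed by a ReLU.

    For each atom: average the feature vectors of the atom and its bonded
    neighbours, multiply by the weight matrix, and clip negatives to zero.
    """
    dim_in = len(features[0])
    dim_out = len(weights[0])
    updated = []
    for nbrs in neighbours:
        mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim_in)]
        out = [
            max(0.0, sum(mean[k] * weights[k][m] for k in range(dim_in)))
            for m in range(dim_out)
        ]
        updated.append(out)
    return updated


def readout(features):
    """Sum atom vectors into one molecule vector that ignores atom ordering."""
    return [sum(atom[m] for atom in features) for m in range(len(features[0]))]


def embed_molecule(atom_features, bonds, weights):
    """Map a molecular graph straight to a fixed-length vector (one layer only)."""
    neighbours = build_neighbours(len(atom_features), bonds)
    hidden = graph_conv_layer(atom_features, neighbours, weights)
    return readout(hidden)


molecule_vector = embed_molecule(ATOM_FEATURES, BONDS, WEIGHTS)
print(molecule_vector)
```

The point of the sketch is that the molecule representation is computed directly from atoms and bonds rather than from pre-calculated descriptors. In a full model, several such layers would be stacked and the resulting vector passed to a classifier, with all weights trained jointly against the toxicity labels.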

Directions for advancing AI prediction methods

The study found room to improve the exploitation of in vitro 2D and 3D imaging data, namely by using advanced deep-learning-based computer vision methods. Existing image-based predictive models for DILI generally rely on standard computer vision techniques. In a previous study, Puri used an automated ML engine to train a deep learning classifier for histopathology images. However, that study shared no details of the model design or architecture.

At present, the number of drugs with available 2D and 3D imaging data is limited, which restricts the use of existing machine learning models for predicting drug-induced liver injury. The acquisition of imaging data will be vital for progress in this area.


The researchers hypothesise that new, more powerful deep learning methods for DILI prediction will be proposed in the near future, in the domains of both imaging and graph convolutional neural networks. New models with high predictive performance may open the door for these tools to be used not just for screening but also, potentially, as ‘virtual assays’ that replace in vitro and in vivo tests. Effective prediction of drug-induced liver injury by these new tools could help to revolutionise drug development by showing which drugs may produce adverse reactions.

Image credit: pch.vector – FreePik
