We summarize a recent study, published in Communications Biology, that outlines a new framework to audit and reduce AI bias in drug development.
Biases in data
Single-cell studies and biobanks have led scientists to turn to machine learning (ML) to draw meaningful interpretations from massive genomic, transcriptomic, proteomic, phenotypic, and clinical datasets. One obstacle to the development of reliable ML models is the lack of bias auditing in the life sciences. Moreover, biological datasets often suffer from representational biases (inherent over- or under-representation of biological entities) and/or biases that are specific to, or induced by, different experimental conditions. When such biases are not eliminated, the ML process can be misled: the model learns predominantly from biases unique to the training dataset, and therefore fails to generalize across different datasets.
Applying ML to biological datasets
It is crucial to systematically audit for biases in data when applying ML to biological datasets. Doing so helps us understand how and what the model is learning, ensuring that predictions are based on true biological insights from the data. The researchers behind this study therefore created a systematic auditing framework for paired-input biological ML applications, a class of ML prediction methods whose goal is to predict the biological relationship between two entities.
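To make the paired-input idea concrete, here is a minimal sketch (not from the study; the function name and toy features are illustrative). A paired-input model takes the feature vectors of two entities and combines them into a single input; for order-independent relationships such as PPIs, a symmetric combination gives the same input for (A, B) and (B, A), so the model cannot learn an arbitrary pair ordering:

```python
import numpy as np

def pair_features(feat_a, feat_b, symmetric=True):
    """Combine two entities' feature vectors into one model input.

    For order-independent relationships such as PPIs, a symmetric
    combination (element-wise sum plus absolute difference) yields the
    same vector for (A, B) and (B, A)."""
    feat_a = np.asarray(feat_a, dtype=float)
    feat_b = np.asarray(feat_b, dtype=float)
    if symmetric:
        return np.concatenate([feat_a + feat_b, np.abs(feat_a - feat_b)])
    return np.concatenate([feat_a, feat_b])  # order-sensitive variant

# Made-up 3-dimensional features for two proteins.
protein_a = np.array([1.0, 0.0, 2.0])
protein_b = np.array([0.0, 1.0, 1.0])
assert np.allclose(pair_features(protein_a, protein_b),
                   pair_features(protein_b, protein_a))
```

The combined vector would then be fed to any standard classifier (SVM, random forest, and so on) trained on labeled pairs.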
They used their framework to examine biases in three ML applications of therapeutic interest: protein-protein interactions (PPIs), drug-target bioactivity, and MHC-peptide binding. Predicting PPIs is critical to understanding the cellular functions of organisms, which in turn is important for bioengineering and de novo drug development.
Developing a framework to reduce AI bias
The auditing machine learning framework the researchers developed has four modules: benchmarking, bias interrogation, bias identification, and bias elimination.
For the first module, the investigators established baseline performance by benchmarking seven classifiers on separate datasets. Of these, five used support vector machines (SVMs), one used a random forest, and one used a deep-learning-based stacked autoencoder; the SVM classifiers were implemented in MATLAB and LibSVM. In total, three databases of human proteins were used, and the classifiers were trained on subsets of a specific dataset. The researchers assessed benchmarking performance using the average area under the ROC curve (AUC) and found that performance was high across all classifiers.
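The AUC metric used in the benchmarking step can be computed directly from classifier scores. A minimal sketch (the `roc_auc` helper and the toy data are illustrative, not the study's code) uses the Mann-Whitney U statistic, which equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one:

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a random positive outranks a random
    negative. (Assumes no tied scores, to keep the sketch short.)"""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# A classifier that ranks every positive above every negative scores 1.0.
assert roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]) == 1.0
```

An AUC near 0.5 corresponds to random guessing, which is why a high benchmark AUC alone, before auditing, can be deceptively reassuring.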
Successful ML models should generalize, applying what they have learned with a high degree of accuracy to independent datasets. To assess this, the researchers created a Generalizability Auditor for the second module, bias interrogation. An auditor is a system in which an ML model is compared to another ML model tailored to examine a specific hypothesis. This step compared a model’s original performance to its performance on a novel dataset, the Generalization dataset, to detect potential biases.
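The core comparison in this step can be sketched as follows (a simplified illustration, not the study's auditor; the function name and the 0.10 gap threshold are assumptions made here for demonstration): benchmark performance is contrasted with performance on the independent Generalization dataset, and a large drop flags a suspected dataset-specific bias:

```python
def audit_generalizability(benchmark_auc, generalization_auc, max_gap=0.10):
    """Compare benchmark performance with performance on an independent
    Generalization dataset. A large drop suggests the model learned
    dataset-specific biases rather than biology. The 0.10 threshold is
    illustrative, not taken from the study."""
    gap = benchmark_auc - generalization_auc
    return {"auc_gap": round(gap, 3), "bias_suspected": gap > max_gap}

# High benchmark AUC but weak generalization triggers a bias hypothesis.
report = audit_generalizability(benchmark_auc=0.95, generalization_auc=0.71)
assert report["bias_suspected"]
```

A flagged gap does not identify the bias itself; it only generates a hypothesis to be tested in the next module.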
The researchers then fed any detected biases and bias hypotheses into the third module, which audits them so that each bias hypothesis is either confirmed or rejected. Finally, the fourth module eliminates the identified bias and tests the result by assessing how the classifiers generalize on separate datasets once the bias has been removed.
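One way to test a bias hypothesis of the kind this module handles can be sketched as follows (an illustration under assumptions made here, not the study's exact procedure): for paired-input data, a common representational-bias hypothesis is that frequently occurring "hub" entities drive predictions. A deliberately biased baseline that scores pairs using only how often their entities appear in positive training pairs can probe this: if it approaches the real classifier's performance, the hypothesis is supported; if it performs near chance, it is rejected:

```python
import numpy as np
from collections import Counter

def degree_only_scores(test_pairs, train_pairs, train_labels):
    """Score test pairs using ONLY each entity's 'positive degree'
    (how often it appears in positive training pairs), ignoring all
    biology. A deliberately biased baseline for hypothesis testing."""
    pos_degree = Counter()
    for (a, b), y in zip(train_pairs, train_labels):
        if y:
            pos_degree[a] += 1
            pos_degree[b] += 1
    return np.array([pos_degree[a] + pos_degree[b] for a, b in test_pairs])

# Toy data: 'hub' appears in many positive pairs, so any pair containing
# it scores highly regardless of its partner entity.
train_pairs = [("hub", "p1"), ("hub", "p2"), ("hub", "p3"), ("a", "b")]
train_labels = [1, 1, 1, 0]
scores = degree_only_scores([("hub", "q"), ("a", "q")], train_pairs, train_labels)
assert scores[0] > scores[1]
```

Eliminating such a bias typically means rebuilding the train/test split so that entity frequencies no longer leak label information, then re-checking generalization.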
The importance of reducing AI bias
ML models can learn primarily from representational biases in the training data when the training data representation carries insufficient signal. Research has shown that this predominantly affects paired-input ML applications and can mislead studies if it is not identified through auditing. The researchers outlined how to tailor their auditing framework to other biological ML applications, and provided the code, resources, and data needed to rerun or repurpose the framework.
The framework developed in this study offers a way to apply ML to predict biological relationships with reduced bias, improving accuracy and outcomes in processes such as drug development. It allows researchers to determine whether their model has truly learned governing biological principles, leading to better-informed drug development paradigms.
Image credit: FreePik