COVID-19: Are we asking machine learning the right questions?

COVID-19 has put unprecedented strain on healthcare agencies across much of the developed and developing world. Governments have scrambled to accrue adequate supplies of critical resources, from personal protective equipment to the ventilators needed to treat those suffering the most severe form of the respiratory failure associated with the novel virus.

With large amounts of diagnostic imaging data already available from COVID-19 patients, from both x-ray and CT scans, one of the most obvious applications in which we can leverage the much-hyped power of advanced analytics to accelerate therapeutic development is to use machine learning to analyse the images. However, whilst researchers have rapidly developed various machine learning algorithms for image analysis, it remains to be seen how many of these will prove to be clinically useful in the long run.

For instance, last month we reported on how healthcare workers from Johns Hopkins University examined why, despite the opportunities to improve COVID-19 patient outcomes, AI-based clinical decision support (CDS) systems have yet to demonstrate substantial progress.

In a comment piece for The Lancet, Dr Patrik Bachtiger and colleagues at Imperial College London discuss how we must first define the clinical need for these advanced technologies before applying them; otherwise, we risk creating more problems than solutions. A common pitfall, they explain, is researchers, without clinical oversight, applying artificial intelligence tools to a dataset without a clear problem in mind, a “form of supply trying to find demand, rather than the other way around.”

Whilst there have been recent examples of success in AI-based medical imaging systems, such as the collaboration between Google DeepMind and Moorfields Eye Hospital in London, relatively few of these research projects are validated in clinical trials. With the urgency presented by COVID-19, researchers risk focusing too heavily on developing and deploying new ML models without fully comprehending the potential biases that may ultimately limit the practical applicability of these tools.

It is also true that whilst many publications report trialling their AI tools in a clinical setting, in reality many of these scenarios do not represent real-world practice. Similarly, whilst an algorithm, when put head to head with radiologists, may outperform the human diagnosis, it will only do so for a specific or narrow range of clinical abnormalities.

The potential biases of training algorithms, the authors say, are very likely to be multiplied in a pandemic, where disease rates are artificially high and hospital admissions are tipped towards the more severe cases as many regions find themselves at near capacity. Algorithms should instead be trained on the full spectrum of COVID-19 (or any disease, for that matter), including asymptomatic cases. Without doing so, we cannot have confidence in the real-world application of these tools.
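The effect of artificially high disease rates can be made concrete with a little arithmetic. As a hypothetical sketch (the sensitivity, specificity and prevalence figures below are illustrative, not taken from the Lancet piece or any real model), Bayes’ rule shows why a classifier that looks reliable among severe hospital admissions can become far less trustworthy when applied to a general population:

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: P(disease | positive result) for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# The same hypothetical model (90% sensitivity, 90% specificity) in two settings:
ppv_hospital = positive_predictive_value(0.90, 0.90, 0.50)   # admissions skewed to severe cases
ppv_community = positive_predictive_value(0.90, 0.90, 0.01)  # broad community screening

print(f"PPV at 50% prevalence: {ppv_hospital:.2f}")
print(f"PPV at  1% prevalence: {ppv_community:.2f}")
```

Under these made-up numbers, nine in ten positive calls are correct in the hospital setting, but fewer than one in ten in the community setting, which is why training and evaluating only on pandemic-era hospital data can overstate real-world performance.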

Furthermore, the authors point out that the data labelling strategy must be carefully considered in algorithm training. Should only PCR-positive patients be included in the dataset? What range of potential symptoms is accepted? Does the patient’s contact history affect model training?
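One way to see why these labelling questions matter is to write the inclusion criteria down explicitly. The sketch below is purely illustrative (the field names, criteria and toy records are assumptions, not from the article or any real cohort); it shows how a seemingly reasonable rule set can silently exclude asymptomatic cases from training data:

```python
def include_in_training_set(patient, require_pcr=True,
                            accepted_symptoms=("cough", "fever", "dyspnoea"),
                            require_contact_history=False):
    """Return True if a patient record meets the stated labelling criteria."""
    if require_pcr and not patient.get("pcr_positive", False):
        return False
    # Requiring at least one accepted symptom drops asymptomatic cases.
    if accepted_symptoms is not None:
        if not any(s in accepted_symptoms for s in patient.get("symptoms", [])):
            return False
    if require_contact_history and not patient.get("known_contact", False):
        return False
    return True

cohort = [
    {"pcr_positive": True,  "symptoms": ["fever"], "known_contact": True},
    {"pcr_positive": False, "symptoms": ["cough"], "known_contact": True},
    {"pcr_positive": True,  "symptoms": [],        "known_contact": False},  # asymptomatic
]

selected = [p for p in cohort if include_in_training_set(p)]
```

With the strict defaults, only one of the three toy records survives; relaxing the symptom requirement (`accepted_symptoms=None`) admits the asymptomatic PCR-positive case as well, changing what spectrum of disease the model ever sees.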

Over the past few months, as we have rapidly gained a clearer understanding of the variability in symptoms and disease presentation, it has become apparent that the disease progresses in stages of increasing severity. Predicting and intervening before significant deterioration has become a priority for both researchers and healthcare workers. The opportunity to use machine learning in image analysis to recognise predictors of COVID-19 infection in CT scans, and so influence clinical decision making at the earliest stages, was one the research community immediately seized. By doing so, they recognised, they could help prevent the more severe cases from requiring the very limited number of ventilators available, and avoid overwhelming numbers of patients requiring critical care.

The authors note that whilst several imaging initiatives have had early successes, their focus on diagnosis has potentially drawn research questions away from other priority problems, including outcome prediction. If we were to ask these research questions, employing baseline or short-term data, the results could prove more useful than diagnosis efforts alone. They also point out that ML algorithms tend to be modular, meaning that those developed during this pandemic could be repurposed in the next, or for the diagnosis and treatment of other respiratory diseases.
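The modularity the authors describe can be sketched in miniature: a shared feature-extraction stage paired with interchangeable task-specific heads. The framework-free example below is a deliberately simplified illustration of the idea, not any real imaging pipeline; every function and number in it is an assumption for demonstration:

```python
def feature_extractor(image):
    """Stand-in for a reusable pretrained backbone: here, just the mean
    intensity and the intensity range of a toy 'image' (a list of pixels)."""
    return [sum(image) / len(image), max(image) - min(image)]

def make_classifier(weights, threshold):
    """Build a new task-specific head on top of the shared features."""
    def classify(image):
        features = feature_extractor(image)
        score = sum(w * f for w, f in zip(weights, features))
        return score > threshold
    return classify

# The same backbone repurposed with two different heads, e.g. for two
# different respiratory diseases (weights and thresholds are invented):
disease_a_head = make_classifier(weights=[1.0, 0.5], threshold=0.8)
disease_b_head = make_classifier(weights=[0.2, 1.5], threshold=1.0)
```

Because the expensive, data-hungry part (the backbone) is shared, only the lightweight head needs retraining for a new task, which is the practical sense in which pandemic-era models could be repurposed.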

With an optimistic outlook, the authors envisage that the clinical and data science collaborations forged during the current public health crisis may mark the beginning of the road towards true AI implementation in clinical practice.
