The use of routine patient data, in the form of administrative databases, electronic health records, and clinical registries, in biomedical research has met with both support and opposition. On one hand, these data can deepen our knowledge of disease progression and treatments when applied in research on the general population. On the other hand, ethical questions about patient privacy remain unanswered.
Peek and Rodrigues, from the University of Manchester and the University of Porto respectively, discuss three main challenges of using routine health data in research in an article published on Springer Link:
1. Usage of data for the purposes for which they are collected
In 1991, Johan van der Lei proposed the First Law of Medical Informatics, which states that health data should be used only for the purpose for which they were collected, because data can easily be misinterpreted when reused in a different context.
Physicians vary enormously in how they use electronic health records, so the diagnostic information recorded for patients varies too, and often only the physician in charge fully understands his or her own recording habits. As a result, the information is inconsistent. Routine data also contain biases and inequalities, because some patients do not have access to healthcare services or choose not to use them. Patients who do use healthcare services regularly tend to be older and sicker, which biases the datasets. Routine health data are therefore a poor basis for drawing conclusions about the health of the whole population.
However, the wealth of data representing real-life scenarios can better inform physicians’ decisions, because the information feeding the decision-making process comes from the very same system that generates the data.
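The selection effect described in this section can be illustrated with a small simulation. This is only a sketch under invented assumptions: the age range, the linear link between age and the chance of appearing in routine records, and the function name simulate_routine_records are hypothetical and are not taken from the article by Peek and Rodrigues.

```python
import random
from statistics import mean


def simulate_routine_records(population_size=100_000, seed=0):
    """Compare the mean age of a whole population with the mean age of
    the people who end up in routine health records.

    Assumption (hypothetical): ages are uniform between 18 and 90, and
    the chance of having a record rises linearly from 0.1 at age 18 to
    0.9 at age 90, mimicking older patients using services more often.
    """
    rng = random.Random(seed)
    population_ages = []
    recorded_ages = []
    for _ in range(population_size):
        age = rng.uniform(18, 90)
        population_ages.append(age)
        # Older people are more likely to appear in the routine data.
        p_record = 0.1 + 0.8 * (age - 18) / 72
        if rng.random() < p_record:
            recorded_ages.append(age)
    return mean(population_ages), mean(recorded_ages)


if __name__ == "__main__":
    population_mean, recorded_mean = simulate_routine_records()
    print(f"mean age, whole population: {population_mean:.1f}")
    print(f"mean age, routine records:  {recorded_mean:.1f}")
```

Under these assumptions the recorded patients are noticeably older on average than the population they are drawn from, so any statistic computed from the records alone would overrepresent older patients.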
2. The pros and cons of replacing traditional medical research methods
Traditionally, randomised clinical trials (RCTs) are conducted to assess the safety of a treatment before it becomes commercially available. However, advances in big data and predictive analytics, which can uncover strong statistical correlations about the safety of a treatment, have led some to suggest that RCTs are no longer needed.
It is not unheard of for pharmaceutical companies to influence the publication of clinical trial data so that they support the effectiveness of a drug in development. In addition, RCT recruitment that selects only patients without any comorbidity does not reflect our ageing population, in which many people live with multiple diseases. Routine data offer an alternative that is cheaper and more representative of the population, because they rely on existing data sets that can be publicly funded to prevent corporate bias.
However, routine data support only a retrospective, observational study design, whereas an RCT is prospective. Applying predictive analytics to larger health data sets does not remove confounding bias, which randomisation in an RCT is designed to control, because observational studies will always contain unmeasured confounding variables. Larger data sets only skew our perception of these biases, as they produce more statistically significant results.
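The point that more data does not cure confounding can be sketched with a toy simulation. Everything here is an invented assumption for illustration: the unmeasured confounder is called severity, sicker patients are more likely to be treated, and the treatment is given a true effect of exactly zero. The function name naive_treatment_effect is hypothetical and does not come from the article.

```python
import math
import random
from statistics import mean, variance


def naive_treatment_effect(n_patients, seed=0):
    """Estimate a treatment effect from simulated observational data
    while ignoring an unmeasured confounder.

    Assumptions (hypothetical): disease severity is standard normal,
    the probability of treatment rises with severity, and the outcome
    (a symptom score) depends on severity only, so the true treatment
    effect is zero.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n_patients):
        severity = rng.gauss(0.0, 1.0)  # never recorded in the data
        p_treated = 1.0 / (1.0 + math.exp(-severity))
        outcome = severity + rng.gauss(0.0, 1.0)  # no treatment term
        if rng.random() < p_treated:
            treated.append(outcome)
        else:
            untreated.append(outcome)
    # Naive comparison of group means, as if the groups were comparable.
    estimate = mean(treated) - mean(untreated)
    std_error = math.sqrt(
        variance(treated) / len(treated)
        + variance(untreated) / len(untreated)
    )
    return estimate, std_error


if __name__ == "__main__":
    for n in (1_000, 100_000):
        estimate, std_error = naive_treatment_effect(n, seed=1)
        print(f"n={n}: estimate={estimate:.3f}, std. error={std_error:.4f}")
```

In this sketch the estimated effect stays well away from the true value of zero however many patients are simulated, while the standard error shrinks as the sample grows. The larger data set therefore makes the spurious effect look more statistically significant, not less biased.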
3. Obtaining explicit informed consent from patients
The possibility of data breaches may make patients reluctant to seek medical treatment and seriously damage public trust in the healthcare system. Because patients do not know which studies their data may be used for, they have no opportunity to object to studies that conflict with their religious or moral beliefs. Informed consent is imperative to ensure that the “social license for research”, which depends on voluntary and non-exploitative participation, is respected.
On the other hand, the risks of not sharing health data, such as poorer health outcomes and inferior treatment decisions, can outweigh the arguments against sharing. Obtaining consent is laborious and can introduce selection bias, because the resulting data set includes only patients who consent. However, there is still a lack of predictive research into the large-scale harms of not sharing health data.
Moving forward, we need better analytical tools to capture contextual information when patient information is recorded. The risks and costs of relying on routine data alone instead of RCTs should be assessed objectively. Effective information governance controls also have to be set up to minimise the risk of patient identification or misuse of data. Both sides of the argument add to the nuances and complexities of this issue, so we should be aware of these arguments and respect their validity in order to have productive conversations on this topic.
Journal reference: Three controversies in health data science
Image credit: Freepix