
Cancer driving mutations identified using machine learning

Researchers from the Biomedical Genomics Laboratory at IRB Barcelona have developed a computational tool that identifies cancer driving mutations for different tumour types. This machine learning approach will help to accelerate cancer research and give oncologists a tool to help choose the best personalised cancer treatments.

Cancer driving mutations

The sequencing of tens of thousands of tumours over the last decade and a half has uncovered the contributions of different mutational processes to the emergence of variation in different cancer types. This sequencing data has also shown that over 500 genes are under positive selection in different tumour types. However, distinguishing the driver mutations from passenger mutations in these cancer genes is still a largely unsolved issue. Moreover, around 90% of the variants observed in cancer genes across tumours are of unknown significance to the development of the malignancy.

Determining a mutation’s relevance for cell transformation is important in identifying which mutations in a patient’s tumour are potentially relevant in the clinic, and for the development of personalised cancer treatment plans.

Developing a machine learning tool to identify cancer driving mutations

In an attempt to identify cancer driving mutations, researchers at IRB Barcelona have developed a machine learning tool that evaluates how much each possible mutation in a gene could contribute to the development and progression of a given type of tumour.

In previous work, the team developed a method to identify those genes responsible for the onset, progression and spread of cancer. However, in this recent study the team went further and developed a tool, known as BoostDM, which simulates each possible mutation within each gene for a specific cancer, and indicates which mutations are key in the cancer process. This will help researchers to understand what causes a tumour at the molecular level. It will also help to support medical decisions regarding the development of personalised treatments.

The BoostDM approach currently works with the data from the mutational profiles of 28,000 genomes analysed from 66 types of cancer. The team also hope to grow the platform as more publicly accessible cancer genome data becomes available.

BoostDM: using evolutionary biology to identify cancer driving mutations

To identify the mutations involved in cancer, the scientists based their approach on the evolutionary concept of positive selection. Under this principle, mutations that drive the growth and development of cancer are found recurrently across distinct samples, more often than mutations that occur randomly. The researchers started from the premise that some mutations are observed only because the tumour cells carrying them drive the development of the tumour. The team then asked what distinguishes these mutations from other possible mutations.
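The idea of recurrence beyond chance can be illustrated with a small sketch. It is not the method used by the researchers: it assumes, purely for illustration, that random mutations fall uniformly across the sites of a gene, and it uses a Poisson tail probability to flag sites mutated more often than that neutral expectation. The function names and the toy counts are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)


def recurrence_pvalues(site_counts, n_sites):
    """Score each mutated site against a uniform neutral expectation.

    site_counts maps a site position to the number of tumour samples in
    which it was found mutated. Assumption: under neutrality every site
    is equally likely to be hit, which real mutational processes violate.
    """
    total = sum(site_counts.values())
    expected_per_site = total / n_sites
    return {
        site: poisson_tail(count, expected_per_site)
        for site, count in site_counts.items()
    }


# Hypothetical toy data: mutation counts at four sites of a 100-site gene.
site_counts = {12: 1, 45: 9, 61: 1, 88: 1}
pvalues = recurrence_pvalues(site_counts, n_sites=100)

for site, p in sorted(pvalues.items()):
    print(f"site {site}: count={site_counts[site]}, p={p:.3g}")
```

In this toy example only site 45 is mutated far more often than the uniform model would predict. A realistic neutral model would also account for the different mutation rates of different sequence contexts.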

From the data, BoostDM learns which attributes distinguish the mutations that favour the development of cancer. This information is useful for the development of new therapeutic approaches. Moreover, the researchers found the BoostDM approach to be more efficient and accurate than experimental approaches.
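As a rough illustration of what learning a distinguishing attribute can mean, the sketch below fits a single decision stump: it searches for the one feature and threshold that best separate observed mutations from randomly generated ones. This is a deliberately simple stand-in, not the BoostDM algorithm, and the two features (a conservation score and a hotspot flag) and all values are hypothetical.

```python
def best_stump(rows, labels):
    """Return (feature_index, threshold, accuracy) of the best single split.

    The stump predicts label 1 when row[feature_index] >= threshold.
    Ties are resolved in favour of the first split found.
    """
    best = (None, None, 0.0)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            predictions = [1 if row[j] >= threshold else 0 for row in rows]
            correct = sum(p == y for p, y in zip(predictions, labels))
            accuracy = correct / len(rows)
            if accuracy > best[2]:
                best = (j, threshold, accuracy)
    return best


# Hypothetical features per mutation: [conservation score, in hotspot (0/1)].
rows = [
    [0.9, 1], [0.8, 1], [0.4, 1],  # mutations observed in tumours (label 1)
    [0.7, 0], [0.3, 0], [0.5, 0],  # randomly generated mutations (label 0)
]
labels = [1, 1, 1, 0, 0, 0]

feature, threshold, accuracy = best_stump(rows, labels)
print(f"best feature: {feature}, threshold: {threshold}, accuracy: {accuracy}")
```

On this toy data the hotspot flag separates the two groups perfectly, while the conservation score alone does not. Practical classifiers combine many such weak rules over many features rather than relying on one.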

Developing a computational model for each gene and type of cancer

The BoostDM tool has already generated 185 models, each of which identifies driver mutations in a specific gene in a specific type of cancer. For example, one model identifies all the possible mutations in the EGFR gene that trigger tumour development in some lung cancers. The team has also created a separate model for the same gene in glioblastoma, a type of cancer that affects the brain.

As more tumour sequencing data becomes publicly accessible, the team will incorporate it into the system, allowing BoostDM to generate new models for all cancer genes in the coming years. Once a model has been developed, researchers can interrogate each possible mutation of a cancer gene in a tissue type (a process known as in silico saturation mutagenesis) and determine whether it is relevant for the development of the disease. This process will create a map of the key cancer driving mutations, which is valuable for both cancer research and the development of personalised cancer medicine.
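The enumeration step behind saturation mutagenesis can be sketched in a few lines: list every single-nucleotide substitution of a sequence, then pass each one to a scoring function. The sketch below is only an illustration. The `toy_score` function is a hypothetical placeholder standing in for a trained gene and tumour type model, and the short sequence is made up.

```python
BASES = "ACGT"


def saturation_mutagenesis(sequence):
    """Yield every single-nucleotide substitution of a DNA sequence.

    Each mutation is a (position, reference_base, alternate_base) tuple,
    with positions counted from 1. Characters other than A, C, G and T
    are skipped.
    """
    for position, ref in enumerate(sequence.upper(), start=1):
        if ref not in BASES:
            continue
        for alt in BASES:
            if alt != ref:
                yield (position, ref, alt)


def score_all(sequence, score):
    """Apply a scoring function to every possible substitution."""
    return {mutation: score(mutation) for mutation in saturation_mutagenesis(sequence)}


def toy_score(mutation):
    """Hypothetical placeholder model: flags any change at position 2."""
    position, _ref, _alt = mutation
    return 1.0 if position == 2 else 0.0


mutations = list(saturation_mutagenesis("ATG"))
scores = score_all("ATG", toy_score)
flagged = sorted(m for m, s in scores.items() if s > 0.5)
print(len(mutations), "possible substitutions;", len(flagged), "flagged")
```

A sequence of n bases has 3n possible single-nucleotide substitutions, so even a long gene can be scored exhaustively once a model exists.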


This study developed a machine learning tool – BoostDM – that identifies cancer driving mutations for each cancer type by generating a model for each gene and type of cancer. As new data becomes available, the team will incorporate it into the BoostDM tool to generate new models and identify cancer driving mutations, which will help to create a map of mutations that result in the onset and progression of different types of cancers. The team hope that this approach will be an important tool to facilitate the development of personalised cancer medicines.

Image credit: vectorjuice – FreePik
