Genetic and epigenetic changes underpin the development of a plethora of biological conditions that can be acquired over the course of a lifetime. A recent study to classify cancer phenotypes used a deep learning-based predictive model to quantify these changes in relation to directionality of gene expression.
Gene expression can be influenced by several factors, most significantly copy number variation or alteration (CNV/CNA), DNA mutations, and epigenetic factors such as DNA methylation (DNAm) or histone modifications. A growing body of evidence shows that epigenetic and genetic events play an interwoven role in tumour development and progression, so gaining a better understanding of these mechanisms is crucial.
To this day we lack a robust integrative predictive model to estimate factors affecting gene expression. Therefore, the authors, from various research institutes in India and from Novo Nordisk, developed their model using a Deep Denoising Auto-encoder (DDAE) and a Multi-layer Perceptron (MLP) to quantitatively measure genetic and epigenetic alterations. These can then be correlated with the directionality of gene expression in liver hepatocellular carcinoma.
The team trained the DDAE to extract significant features from the multi-omic data inputs before applying the MLP for back-propagation learning for the task of regression and tumour classification. The integration model, upon evaluation for disease classification capability, had an accuracy rate of 95.1%.
“The proposed predictive model filters signals from noise contributed via both these genomic and epigenomic platforms, understands the non-linear relationships among the input features, and finally captures the influence of these relationships to extract information encoded in mRNA expression for paired sets of patient samples.”
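To make the two-stage idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the data are synthetic, every name (`train_net`, `encode`, `predict`) is hypothetical, and each stage uses a single hidden layer rather than the deep architecture described in the study. It only illustrates the general pattern of a denoising auto-encoder compressing methylation and copy-number features, followed by a small perceptron trained by back-propagation to regress expression from the learned codes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_net(X, Y, n_hidden, noise_std=0.0, lr=0.05, epochs=300):
    """Train a one-hidden-layer network by back-propagation.

    With noise_std > 0 and Y = X this acts as a denoising auto-encoder:
    the inputs are corrupted, but the loss is measured against the clean target.
    """
    n, d_in = X.shape
    d_out = Y.shape[1]
    W1 = rng.normal(0.0, 0.1, (d_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(epochs):
        X_in = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the inputs
        H = sigmoid(X_in @ W1 + b1)                     # hidden activations
        err = H @ W2 + b2 - Y                           # linear output minus target
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        grad_out = 2.0 * err / n                        # d(loss)/d(output)
        grad_hidden = (grad_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ grad_out)
        b2 -= lr * grad_out.sum(axis=0)
        W1 -= lr * (X_in.T @ grad_hidden)
        b1 -= lr * grad_hidden.sum(axis=0)
    return (W1, b1, W2, b2), losses

def encode(X, params):
    """Return the hidden-layer codes of a trained network."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)

def predict(X, params):
    """Return the network output for inputs X."""
    W1, b1, W2, b2 = params
    return sigmoid(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in for paired multi-omic samples (assumption, not real data).
n = 200
z = rng.normal(size=(n, 2))  # shared latent factors driving all measurements
methylation = sigmoid(z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(n, 5)))
copy_number = 0.3 * z @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(n, 5))
X = np.hstack([methylation, copy_number])
expression = 1.0 + z[:, :1] - 0.5 * z[:, 1:] + 0.1 * rng.normal(size=(n, 1))

# Stage 1: denoising auto-encoder learns compact codes from noisy inputs.
dae_params, dae_losses = train_net(X, X, n_hidden=6, noise_std=0.1)
codes = encode(X, dae_params)

# Stage 2: perceptron regresses expression from the learned codes.
reg_params, reg_losses = train_net(codes, expression, n_hidden=6)
predicted = predict(codes, reg_params)
```

Reusing one training routine for both stages keeps the sketch short: the auto-encoder is simply the case where the target equals the clean input and noise is added on the way in. A classification head, as used in the study for tumour classification, would replace the linear output and squared-error loss with a suitable classification loss.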
In addition to predicting expression patterns, the authors say the model serves as a promising platform for multi-omics integration. In future work, they intend to extend this study to include clinical samples from tissue biobanks, where estimated gene expression could be compared directly with the true gene expression data of paired samples.
Original study published in Genomics: Estimating gene expression from DNA methylation and copy number variation: A deep learning regression model for multi-omics integration. D.B. Seal, V. Das, S. Goswami, R.K. De