Researchers from the University of Virginia School of Medicine have developed a new database of transcriptional regulators, which will help scientists to understand dysregulations in gene expression that result in cancer and its progression, enabling the development of better cancer treatments and prevention methods.
Dysregulation in gene expression
Cancer and its progression arise from dysregulations in gene expression. The identification of transcriptional regulators that control oncogenic gene expression in each cancer type is a focus of cancer research, since these regulators could be important targets for novel therapies.
The Cancer Genome Atlas (TCGA) is one of the largest resources for gene expression profiles in human cancers, containing more than 10,000 RNA-seq datasets from tumour and normal samples across over 30 cancer types. Knowing which transcriptional regulators drive differential expression between tumour and normal samples can help researchers to better understand functional gene regulatory networks in each cancer type. However, because most primary tumours yield only a limited number of cells, the TCGA did not produce ChIP-seq data. Computational prediction of functional transcriptional regulators from TCGA differential gene expression data is therefore useful for understanding transcriptional regulation.
Developing an algorithm for inferring functional transcriptional regulators
The team behind this study previously developed Binding Analysis for Regulation of Transcription (BART), a novel computational method for inferring functional transcriptional regulators from a target gene set. To build its inference model, the BART algorithm leverages a large collection of over 7,000 human and 5,000 mouse transcriptional regulator binding profiles from the Cistrome Data Browser database. When a gene list is used as the input, BART first uses collected ChIP-seq profiles of H3K27ac, an active enhancer histone mark, to predict a cis-regulatory profile that regulates the gene set.
Each transcriptional regulator ChIP-seq dataset from the compendium is then mapped to the cis-regulatory profile and scored based on whether the transcriptional regulator binding sites overlap with high-ranking regions in the cis-regulatory profile. This is quantified using a receiver-operating characteristic (ROC) measure. The algorithm then integrates the scores for all ChIP-seq datasets for the same transcriptional regulator to generate an overall score for each regulator. Using BART, researchers are able to identify not only regulators that promote tumorigenesis, but also regulators that act as tumour suppressors.
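To make the scoring step concrete, the following is a minimal Python sketch of the idea, not the BART implementation itself. It assumes each candidate region already has a cis-regulatory score and that each ChIP-seq dataset has been reduced to a true/false flag per region indicating whether a binding site overlaps it. The function names, the toy data and the use of a simple mean to combine datasets for the same regulator are all illustrative assumptions; the published method may integrate datasets differently.

```python
from collections import defaultdict
from statistics import mean


def roc_auc(region_scores, bound):
    """Area under the ROC curve, computed via the rank-sum identity.

    region_scores: cis-regulatory score per region (higher = more regulatory).
    bound: True where the regulator's binding sites overlap that region.
    """
    n_pos = sum(1 for b in bound if b)
    n_neg = len(bound) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one bound and one unbound region")

    # Rank regions by score (1-based), giving tied scores their average rank.
    order = sorted(range(len(region_scores)), key=lambda i: region_scores[i])
    ranks = [0.0] * len(region_scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and region_scores[order[j + 1]] == region_scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Mann-Whitney U statistic for the bound regions, normalised to [0, 1].
    rank_sum_pos = sum(r for r, b in zip(ranks, bound) if b)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def score_regulators(region_scores, datasets):
    """Combine per-dataset AUCs into one score per regulator.

    datasets: iterable of (regulator_name, bound_flags) pairs.
    Assumption: datasets for the same regulator are combined with a plain mean.
    """
    per_regulator = defaultdict(list)
    for name, bound in datasets:
        per_regulator[name].append(roc_auc(region_scores, bound))
    return {name: mean(aucs) for name, aucs in per_regulator.items()}


# Toy example: six regions and three hypothetical ChIP-seq datasets.
region_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
datasets = [
    ("TR_A", [True, True, False, False, False, False]),
    ("TR_A", [True, False, True, False, False, False]),
    ("TR_B", [False, False, False, False, True, True]),
]
regulator_scores = score_regulators(region_scores, datasets)
```

In this toy example TR_A, whose binding sites sit in the highest-scoring regions, receives a score close to 1, while TR_B, which binds only low-scoring regions, receives a score of 0. An AUC near 0.5 would indicate binding that is unrelated to the predicted cis-regulatory profile.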
The BART cancer database of transcriptional regulators
The researchers at the University of Virginia used BART to map the folding patterns of our chromosomes in three dimensions using the data contained in TCGA. In this study, the team worked to develop a successful approach that is able to link the dynamic folding pattern of genes to the control of gene activities. They hoped that by providing an approach to better understand the transcriptional regulators, they could help to unravel the genetic cause of cancer and other diseases.
They termed their method BART3D, which works by comparing the available three-dimensional configuration data from one region of a chromosome with its neighbours. BART3D is then able to use the BART algorithm to extrapolate from this comparison to fill in blanks in the sequence of genetic material. This produces a map that offers insight into how our genes interact with the transcriptional regulators that control their activity. Researchers can then identify which regulators turn particular genes on and off in certain diseases.
The team also built the BART Cancer database to advance research into 15 different types of cancer, including breast, lung and prostate cancer (figure 1). Researchers can use this database to search for regulators that are more active and those that are less active in each cancer type. The lead researcher behind this study, Professor Chongzhi Zang, said that the database can be used to screen for potential drug targets, and that the team hopes the resource will benefit the whole biomedical research community by accelerating scientific discoveries and future therapeutic development.

Summary
This study outlined a novel algorithm – BART – that can be used to infer functional transcriptional regulators from differentially expressed genes. The researchers used the algorithm to develop the BART cancer database of transcriptional regulators, which other researchers can use to screen potential drug targets and analyse their own genetic data. The team hopes that the database will be used to accelerate scientific discoveries and future drug development for cancer and other diseases.
Image credit: kjpargeter – FreePik