
WELM-SURF: Predicting Drug-Target Interactions

Predicting novel drug-target interactions plays an important role in identifying new drug candidates and finding new proteins to target. An article published in BioData Mining outlined a computational method, WELM-SURF, that identifies drug-target interactions from drug fingerprints and protein evolutionary information.

Drug-target interactions (DTIs)

Developing a new chemistry-based drug usually costs billions of dollars, and it takes around a decade to bring a new drug to market. Despite this, only a few drug candidates are approved for marketing by the FDA. A major reason for this is the lack of knowledge of DTIs, which results in unacceptable toxicity of some drug candidates. Additionally, identifying interactions between a drug and a target can help in detecting new potential targets for existing drugs. Identifying all the potential targets of a drug could enable a better understanding of its toxicity and of its potential for treating other diseases.

In recent years, numerous experimental methods have been developed for identifying associations between drugs and target proteins. However, many of these existing methods are expensive and time-consuming.

Due to the inherent disadvantages of existing experimental methods for predicting DTIs, it has become paramount to develop efficient computational approaches.

Existing computational methods for predicting DTIs

As the focus has shifted to developing computational approaches for predicting drug-target interactions, many related databases have been established. These databases contain information on the relationships between drugs and their targets. Examples include the Kyoto Encyclopedia of Genes and Genomes (KEGG), DrugBank, and the Therapeutic Target Database (TTD). The data they contain serve as experimental material with which researchers can develop new computational methods for detecting DTIs on a large scale.

There are some existing computational methods for detecting DTIs. These can be classified into two categories: ligand-based virtual screening and docking simulation.

Ligand-based virtual screening compares the chemical structure of a candidate ligand with those of the known ligands of a target protein, using a classical structure-activity relationship (SAR) framework, to predict DTIs. However, a disadvantage of this method is that it does not use protein domain information.

The second method, docking simulation, is a useful tool for molecular modelling. It detects interactions between drug molecules and proteins by dynamically simulating their binding. However, docking simulation can only be applied to proteins with a known 3D structure. Because this requirement is often difficult to meet, the method is hard to apply when predicting DTIs.

Therefore, this study aimed to develop an efficient computational approach to improve the effectiveness and accuracy of predicting DTIs.

Developing WELM-SURF

The researchers proposed a novel computational method, WELM-SURF, based on drug fingerprints and protein evolutionary information for identifying DTIs. Speeded-Up Robust Features (SURF) is a feature detector and descriptor that can be used for object recognition, image registration, classification, or 3D reconstruction. In this case, the researchers employed it to extract key features of protein sequences from the Position-Specific Scoring Matrix (PSSM). The PSSM contains the positional information of a protein sequence, together with evolutionary information that reflects the conserved function of the protein. It is therefore well suited to extracting evolutionary information from a protein sequence.
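To make the idea of a PSSM more concrete, the following is a minimal sketch of how a position-by-residue scoring matrix can be derived from a set of aligned homologous sequences. It is an illustration only: in practice PSSMs are typically produced by tools such as PSI-BLAST, and the function name `toy_pssm`, the uniform background frequency, the pseudocount and the example sequences are all assumptions made for this sketch, not details from the paper. The SURF step itself is not shown.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def toy_pssm(aligned_seqs, pseudocount=1.0):
    """Build an L x 20 log-odds matrix from equal-length aligned sequences.

    Hypothetical helper: assumes a uniform background frequency of 1/20
    per residue, which is a simplification of real PSSM construction.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    # Start every cell at the pseudocount so unseen residues get a finite score
    counts = np.full((length, 20), pseudocount, dtype=float)
    for seq in aligned_seqs:
        for pos, aa in enumerate(seq):
            if aa in AA_INDEX:  # skip gaps and non-standard symbols
                counts[pos, AA_INDEX[aa]] += 1.0

    freqs = counts / counts.sum(axis=1, keepdims=True)
    background = 1.0 / 20.0
    return np.log2(freqs / background)


# Made-up aligned homologs, one row per sequence
homologs = ["MKVLA", "MKVLG", "MRVLA", "MKILA"]
pssm = toy_pssm(homologs)
print(pssm.shape)
```

Each row of the matrix corresponds to a sequence position and each column to an amino acid. A strongly conserved position, such as the leading M in this toy alignment, receives a high score for the conserved residue and negative scores for the others. This is the sense in which the matrix carries both positional and evolutionary information.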

Drugs were represented as feature vectors using molecular substructure fingerprints, which encode the chemical structure of each drug. The Weighted Extreme Learning Machine (WELM) was then used to classify the extracted features and predict DTIs. WELM was chosen because it has a short training time, good generalisation ability, and the ability to execute classification efficiently by optimising a loss function (a function that maps the values of one or more variables onto a single number). The overall prediction flowchart of WELM-SURF is shown in Figure 1. This image was taken directly from the paper by An et al.
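The classification step can be sketched as follows. This is a minimal NumPy illustration of a weighted extreme learning machine in its commonly described form: a random, untrained hidden layer followed by a closed-form, class-weighted ridge solution for the output weights. It is not the authors' implementation. The function names, the sigmoid activation, the hidden-layer size, the regularisation constant `C`, the weighting scheme (one over class size) and the synthetic data are all assumptions for illustration.

```python
import numpy as np


def _hidden(X, W_in, b):
    """Sigmoid activations of the random hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))


def train_welm(X, y, n_hidden=50, C=100.0, seed=0):
    """Fit a weighted ELM for binary integer labels y in {0, 1}."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W_in, b)

    # Per-sample weight = 1 / (size of the sample's class), so a rare class
    # contributes as much to the loss in total as a common one
    class_counts = np.bincount(y, minlength=2)
    sample_w = 1.0 / class_counts[y]

    T = np.where(y == 1, 1.0, -1.0)  # regression targets in {-1, +1}
    HtW = H.T * sample_w             # H^T W, with W a diagonal weight matrix
    # Closed-form solution of the weighted, ridge-regularised least squares
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return W_in, b, beta


def predict_welm(model, X):
    W_in, b, beta = model
    return (_hidden(X, W_in, b) @ beta > 0).astype(int)


# Synthetic, imbalanced stand-in for concatenated drug and protein features
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(90, 4)),
               rng.normal(+2.0, 1.0, size=(10, 4))])
y = np.array([0] * 90 + [1] * 10)

model = train_welm(X, y)
pred = predict_welm(model, X)
print("training accuracy:", (pred == y).mean())
```

Because only the output weights are solved for, and in closed form, training involves a single linear solve rather than iterative optimisation, which is consistent with the short training time noted above. The per-class weighting is what distinguishes the weighted variant from a plain ELM, and it matters when known interacting pairs are far rarer than non-interacting ones.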

Figure 1. Flowchart showing how WELM-SURF predicts DTIs. Taken directly from the article.

Effectiveness of WELM-SURF in predicting DTIs

The researchers evaluated the performance of the WELM-SURF model on enzyme, ion channel, G protein-coupled receptor (GPCR) and nuclear receptor datasets using five-fold cross-validation. Overall, WELM-SURF achieved average accuracies of 93.54%, 90.58%, 85.43% and 77.45% on the enzyme, ion channel, GPCR and nuclear receptor datasets, respectively. The researchers also compared the performance of WELM-SURF with that of the Extreme Learning Machine (ELM) and a state-of-the-art Support Vector Machine (SVM) on the enzyme and ion channel datasets. ELM and SVM are existing methods of predicting DTIs, and the researchers hoped to demonstrate that their method could provide greater accuracy. They displayed the results of their comparison as receiver operating characteristic (ROC) curves.
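The five-fold cross-validation protocol described above can be sketched as follows: the samples are shuffled and split into five disjoint folds, each fold is held out once as the test set while the model is trained on the other four, and the five accuracies are averaged. This is a generic sketch, not the authors' evaluation code. The function names are hypothetical, the data are synthetic, and a simple nearest-centroid classifier stands in for WELM-SURF.

```python
import numpy as np


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into n_folds disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)


def cross_validated_accuracy(X, y, fit, predict, n_folds=5, seed=0):
    """Return the mean accuracy over the folds and the per-fold accuracies."""
    folds = k_fold_indices(len(y), n_folds, seed)
    scores = []
    for k, test_idx in enumerate(folds):
        # Train on every fold except the k-th, then test on the k-th
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        model = fit(X[train_idx], y[train_idx])
        scores.append(float(np.mean(predict(model, X[test_idx]) == y[test_idx])))
    return float(np.mean(scores)), scores


# Placeholder classifier (assumption): assign each sample to the nearest class mean
def fit_centroids(X, y):
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}


def predict_centroids(model, X):
    labels = np.array(list(model))
    dists = np.stack([np.linalg.norm(X - model[l], axis=1) for l in labels], axis=1)
    return labels[dists.argmin(axis=1)]


# Synthetic two-class data standing in for drug-target pair features
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 4)),
               rng.normal(+2.0, 1.0, size=(50, 4))])
y = np.array([0] * 50 + [1] * 50)

mean_acc, fold_accs = cross_validated_accuracy(X, y, fit_centroids, predict_centroids)
print("per-fold accuracy:", fold_accs)
print("average accuracy:", mean_acc)
```

Averaging over folds means every sample is used for testing exactly once, so the reported figure does not depend on a single lucky or unlucky train/test split.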

On the enzyme dataset, the ELM and SVM classifiers obtained average accuracies of 90.38% and 87.07%, respectively, whereas WELM-SURF achieved 93.54%. Similarly, on the ion channel dataset, the ELM and SVM classifiers obtained average accuracies of 87.76% and 83.30%, while WELM-SURF achieved 90.58%. This experimental analysis indicates that WELM-SURF has a better prediction capacity than the existing ELM and SVM classifiers. These results may be due to the short training time and good generalisation ability of WELM.


This paper demonstrated that the WELM-SURF model is able to predict DTIs with higher accuracy and robustness than existing methods. The researchers hope that their model will be a useful computational tool to facilitate future studies in the prediction of DTIs, and ultimately play an important role in identifying new drug candidates and finding new proteins to target.

Image credit: mdjaff – FreePik
