Date of Award
2018-01-01
Degree Name
Master of Science
Department
Computer Science
Advisor(s)
Olac Fuentes
Second Advisor
Suman Sirimulla
Abstract
In recent years, the cheminformatics community has seen increasing success with machine learning-based scoring functions for estimating binding affinities. The prediction of protein-ligand binding affinities is crucial for drug discovery research. Many physics-based scoring functions have been developed over the years, and machine learning approaches have more recently been shown to boost the performance of these traditional scoring functions. In this study, two scoring functions were developed: one is based on a convolutional neural network (CNN), and the other, called DLSCORE, is based on an ensemble of fully connected neural networks. Both models were trained on the refined PDBbind (v.2016) dataset using different types of features. The results obtained from the CNN model were analyzed to show that nearest neighbor features perform better than distributed features. Moreover, canonically oriented molecular structures were shown to perform better than randomly oriented structures. The DLSCORE model, an ensemble of 10 different networks, yielded a Pearson correlation coefficient of 0.82, a Spearman rho coefficient of 0.90, a Kendall tau coefficient of 0.74, an RMSE of 1.15 kcal/mol, and an MAE of 0.86 kcal/mol on the test set, outperforming two very popular scoring functions.
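The five evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the thesis code. It assumes that the ensemble prediction is the simple mean of the member networks' outputs, which the abstract does not state, and every function name here is hypothetical. The Kendall tau shown is the tau-a variant without tie correction; the variant used in the thesis is not specified.

```python
import math
from itertools import combinations


def ensemble_mean(member_predictions):
    """Average per-sample predictions across ensemble members.

    Assumption: the ensemble output is the plain mean of its members.
    member_predictions is a list with one list of predictions per member.
    """
    n_members = len(member_predictions)
    return [sum(column) / n_members for column in zip(*member_predictions)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


def rmse(predicted, actual):
    """Root mean squared error, in the units of the inputs."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def mae(predicted, actual):
    """Mean absolute error, in the units of the inputs."""
    n = len(predicted)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n
```

In use, one would pass the per-network predictions for the test set to `ensemble_mean`, then compare the averaged predictions against the experimental binding affinities with each metric. RMSE and MAE carry the units of the inputs (kcal/mol in the abstract), while the three correlation coefficients are unitless.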
Language
en
Provenance
Received from ProQuest
Copyright Date
2018-08
File Size
72 pages
File Format
application/pdf
Rights Holder
Md Mahmudulla Hassan
Recommended Citation
Hassan, Md Mahmudulla, "Deep Learning Models For Scoring Protein-Ligand Interaction Energies" (2018). Open Access Theses & Dissertations. 1447.
https://scholarworks.utep.edu/open_etd/1447