Date of Award
2020-01-01
Degree Name
Doctor of Philosophy
Department
Computational Science
Advisor(s)
Suman Sirimulla
Abstract
Machine learning (ML) techniques have been widely applied in areas ranging from pattern recognition, natural language processing, and computer games to self-driving cars, clinical diagnostics, and molecular structure prediction, easing people's day-to-day lives. Drug discovery is an expensive, complex, and time-consuming process, and the pharmaceutical industry is now hoping to leverage ML methods to expedite it. Molecular property prediction is one of the most important tasks in drug discovery. Because developing a new drug relies on a proper understanding of molecular properties, there has been great interest in the potential of ML models to predict them. In this dissertation, I benchmark several ML algorithms on a variety of property prediction tasks related to drug discovery. More specifically, I compare several widely used ML algorithms with more advanced ones, such as the Directed Message Passing Neural Network (D-MPNN), across various molecular property prediction models. The traditional ML models are trained on computed molecular fingerprints, whereas D-MPNN is a graph-based neural network that learns by operating directly on the graph structure of the molecule. The work presented in this dissertation is available as free, user-friendly computational tools that cover a wide range of biochemical tasks, such as binding affinity calculation, drug-target activity prediction, and prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. The tools can be accessed online via web servers (https://drugdiscovery.utep.edu and https://drugcentral.org/). The source code and datasets are available on GitHub (https://github.com/sirimullalab) for interested computational scientists to further validate the work and benchmark new algorithms.
Language
en
Provenance
Received from ProQuest
Copyright Date
2020-12
File Size
152 pages
File Format
application/pdf
Rights Holder
Govinda Bahadur KC
Recommended Citation
KC, Govinda Bahadur, "Benchmarking Machine Learning Methods For Molecular Property Prediction" (2020). Open Access Theses & Dissertations. 3167.
https://scholarworks.utep.edu/open_etd/3167