Computational Methods of Hidden Markov Models With Respect To CpG Island Prediction in DNA Sequences
Date of Award
2011-01-01
Degree Name
Master of Science
Department
Statistics
Advisor(s)
Ming-Ying Leung
Abstract
Hidden Markov models (HMMs) are a class of Markov models in which, unlike ordinary Markov chains, the observer cannot see which state the model is in when a symbol is observed. As in a Markov chain, the next state of an HMM depends only on its current state. The parameters of an HMM are its transition and emission probabilities: a transition probability is the probability of moving from one state to another, and an emission probability is the probability of observing a symbol given that it was emitted from a specific state.
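As a minimal sketch of these parameters, the Python below defines a toy two-state HMM over the DNA alphabet and samples from it. The state names, the probability values, and the helper names `draw` and `sample` are illustrative assumptions, not the model or estimates used in the thesis.

```python
import random

# Hypothetical two-state model; all probabilities are made-up illustrations.
INITIAL = {"island": 0.5, "background": 0.5}

# TRANSITION[r][s] = P(next state is s | current state is r)
TRANSITION = {
    "island":     {"island": 0.90, "background": 0.10},
    "background": {"island": 0.05, "background": 0.95},
}

# EMISSION[s][x] = P(observing symbol x | current state is s)
EMISSION = {
    "island":     {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "background": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
}


def draw(dist, rng):
    """Draw one key from a {key: probability} mapping."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys], k=1)[0]


def sample(length, seed=0):
    """Return (hidden_states, observed_sequence), each of the given length."""
    rng = random.Random(seed)
    states, symbols = [], []
    state = draw(INITIAL, rng)
    for _ in range(length):
        states.append(state)
        symbols.append(draw(EMISSION[state], rng))
        # Markov property: the next state depends only on the current state.
        state = draw(TRANSITION[state], rng)
    return states, "".join(symbols)


if __name__ == "__main__":
    hidden, observed = sample(30)
    print(observed)   # what the observer sees
    print(hidden)     # what the observer does not see
```

Only the string of nucleotides would be visible to an observer; the list of states is the hidden part that the algorithms named in this abstract try to recover or parameterize.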
DNA sequences are made up of the nucleotides adenine (A), cytosine (C), guanine (G), and thymine (T). CpG islands are regions of a DNA sequence in which the CG dinucleotide occurs more frequently than in the rest of the sequence.
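To make "higher occurrence of the CG dinucleotide" concrete, the following sketch computes the fraction of dinucleotide positions occupied by CG in each sliding window of a sequence. The function name `cg_fraction` and the window and step parameters are assumptions chosen for illustration; the abstract does not state how the thesis quantifies CG frequency.

```python
def cg_fraction(seq, window, step=1):
    """Return (start, fraction) pairs, one per window.

    fraction = (number of CG dinucleotides in the window) / (window - 1),
    where window - 1 is the number of dinucleotide positions in the window.
    """
    if window < 2:
        raise ValueError("window must contain at least one dinucleotide")
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # "CG" cannot overlap itself, so str.count finds every occurrence.
        result.append((start, chunk.count("CG") / (window - 1)))
    return result


if __name__ == "__main__":
    for start, frac in cg_fraction("ATATATCGCGCGCGATATAT", window=8, step=4):
        print(start, round(frac, 3))
```

Windows that fall inside a CG-rich stretch report a larger fraction than windows in the flanking sequence.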
The HMM algorithms used to analyze the DNA sequences were the Viterbi, Baum-Welch, and Viterbi training algorithms. The Viterbi algorithm determines the state sequence most likely to have produced the observed sequence under a given model. The Baum-Welch and Viterbi training algorithms estimate the parameters of an HMM.
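A minimal log-space sketch of the Viterbi algorithm is given below for a toy two-state model. The state labels "+" and "-", all probability values, and the function names are assumptions made for illustration and are not taken from the thesis, whose actual model is not specified in this abstract.

```python
import math


def to_log(table):
    """Convert a nested {row: {col: probability}} table to log-probabilities."""
    return {r: {c: math.log(p) for c, p in row.items()} for r, row in table.items()}


def viterbi(obs, states, init, trans, emit):
    """Return (most likely state path, its log-probability) for obs."""
    log_init = {s: math.log(init[s]) for s in states}
    log_trans, log_emit = to_log(trans), to_log(emit)

    # score[s] = best log-probability of any state path ending in s so far
    score = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    backpointers = []
    for symbol in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            ptr[s] = best
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][symbol]
        backpointers.append(ptr)

    # Trace back from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


# Toy parameters: "+" stands for island, "-" for background (illustrative only).
STATES = ("+", "-")
INIT = {"+": 0.5, "-": 0.5}
TRANS = {"+": {"+": 0.9, "-": 0.1}, "-": {"+": 0.1, "-": 0.9}}
EMIT = {
    "+": {"A": 0.01, "C": 0.49, "G": 0.49, "T": 0.01},
    "-": {"A": 0.49, "C": 0.01, "G": 0.01, "T": 0.49},
}

if __name__ == "__main__":
    path, log_prob = viterbi("AATTCGCGCGAATT", STATES, INIT, TRANS, EMIT)
    print("".join(path), log_prob)
```

Working with log-probabilities avoids numerical underflow on long sequences, which matters for inputs the size of real DNA sequences. With these exaggerated toy parameters the C/G-rich middle of the example sequence is labeled "+" and the A/T flanks are labeled "-".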
Specifically, we assessed the accuracy of the Viterbi algorithm at predicting the location of CpG islands within DNA sequences and determined how well the parameter estimation algorithms recover the model parameters.
Language
en
Provenance
Received from ProQuest
Copyright Date
2011
File Size
166 pages
File Format
application/pdf
Rights Holder
Roberto Angel Ortega
Recommended Citation
Ortega, Roberto Angel, "Computational Methods of Hidden Markov Models With Respect To CpG Island Prediction in DNA Sequences" (2011). Open Access Theses & Dissertations. 2354.
https://scholarworks.utep.edu/open_etd/2354