Date of Award
2012-01-01
Degree Name
Master of Science
Department
Computer Science
Advisor(s)
Nigel G. Ward
Abstract
Previous studies show that immediate and long-range prosodic context provides beneficial information when applied to a language model. However, some features contribute more information to the prediction task than others, and this should be taken into account. If the information contribution of each feature can be determined, then a well-crafted feature set can be built to improve the performance of a language model. In this study, I measure the contribution of different prosodic features to a baseline trigram model. With this information, it should be possible to build a language model that uses the most informative features and ultimately performs better than a language model that includes prosodic information naively. Guided by these measurements, I build a set of 103 prosodic features from past and future context, computed for both speaker and interlocutor. Principal component analysis is applied to this feature set to build a model that achieves a 25.9% perplexity reduction relative to the trigram model. However, this model falls 1.2% short of the improvement achieved by a similar model built without proper feature selection.
Language
en
Provenance
Received from ProQuest
Copyright Date
2012
File Size
89 pages
File Format
application/pdf
Rights Holder
Alejandro Vega
Recommended Citation
Vega, Alejandro, "On the Selection of Prosodic Features for Language Modeling" (2012). Open Access Theses & Dissertations. 2211.
https://scholarworks.utep.edu/open_etd/2211