Date of Award
2013-01-01
Degree Name
Doctor of Philosophy
Department
Computer Science
Advisor(s)
Nigel G. Ward
Abstract
Research has focused on using prosody as an alternative source of information for language modeling. However, prosody is a surface phenomenon, and to develop deeper models of language production, the underlying mental processes need to be considered. Several cognitive factors, such as dialog states and formulation, have been given attention. However, emotion, as a cognitive factor, has been neglected so far. Speakers' emotional state plays an important role in spoken dialog. Participants seem to infer each other's emotional state from multiple cues and react accordingly. In particular, these states manifest themselves moment by moment in the speaker's voice. This dissertation attempts to model these changes and use them to improve word probability estimates for language modeling. A small set of conversations from the Switchboard corpus was labeled for emotion. Rather than using a class-based approach, I used a dimension-based approach to account for the subtle changes in emotion that occur in spontaneous dialog. I developed several models, using different machine learning techniques, that estimate the emotion value on each dimension independently. Once the speaker's emotional state is recognized, the probability estimates of words that the speaker might say next are refined based on that recognized state. Using emotion for language modeling in this way resulted in a 1.35% reduction in perplexity over the baseline.
Language
en
Provenance
Received from ProQuest
Copyright Date
2013
File Size
112 pages
File Format
application/pdf
Rights Holder
Shreyas Ashok Karkhedkar
Recommended Citation
Karkhedkar, Shreyas Ashok, "Using Emotion As Inferred From Prosody In Language Modeling" (2013). Open Access Theses & Dissertations. 1851.
https://scholarworks.utep.edu/open_etd/1851