Using emotion as inferred from prosody in language modeling

Shreyas Ashok Karkhedkar, University of Texas at El Paso


Research has focused on using prosody as an alternative source of information for language modeling. However, prosody is a surface phenomenon, and to develop deeper models of language production, the underlying mental processes need to be considered. Several cognitive factors, such as dialog states and formulation, have been given attention. However, emotion, as a cognitive factor, has been neglected so far. Speakers' emotional state plays an important role in spoken dialog. Participants seem to infer each other's emotional state from multiple cues and react accordingly. In particular, these states manifest themselves moment by moment in the speaker's voice. This dissertation attempts to model these changes and use them to improve word probability estimates for language modeling. A small set of conversations from the Switchboard corpus was labeled for emotion. Rather than using a class-based approach, I have used a dimension-based approach to account for the subtle changes in emotion that occur in spontaneous dialog. I developed several models using different machine learning techniques that estimate the emotion value on each dimension independently. Once the speaker's emotional state is recognized, the probability estimates of words that the speaker might say next are refined based on that recognized state. Using emotion for language modeling in this way resulted in a 1.35% reduction in perplexity over the baseline.

Subject Area

Psychology|Computer science

Recommended Citation

Karkhedkar, Shreyas Ashok, "Using emotion as inferred from prosody in language modeling" (2013). ETD Collection for University of Texas, El Paso. AAI3566544.