Adding within-utterance emotion decay for more human-like dialog
While spoken dialog systems have been used commercially for several decades, most of them provide only simple information exchange capabilities. Emotion synthesis in spoken dialog systems has recently become an active research area, and the use of emotion-adaptive dialog systems has demonstrated improvements in user experience and rapport. This thesis seeks to improve how dialog systems convey emotion, with the goal of robust emotional support that improves the user experience and models human speech characteristics more accurately than current systems do. Prior work with Gracie (GRAduate Coordinator with Immediate response Emotions), an emotion-adaptive dialog system, enabled system utterances that conveyed emotional qualities based on the perceived emotional state of the user. This feature improved the user experience with the dialog system through increased rapport. However, Gracie was able to convey only a constant emotion during a conversation turn, regardless of the length of the turn. This thesis extends Gracie to modify the emotional qualities of system utterances on a sub-turn level. In the study carried out for this thesis, the emotional coloring was varied on a sub-turn level by linearly attenuating the emotional qualities so that they reached a neutral emotional state at the end of the utterance. Evaluation with 36 subjects showed that they significantly preferred conversing with the version of Gracie that supports sub-turn emotional coloring over the original Gracie. Subjects also tended to rate the sub-turn coloring Gracie system as more human-like than the original system.
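The linear attenuation described in the abstract can be illustrated with a minimal sketch. This is not Gracie's implementation: the abstract does not say how emotions are represented or how an utterance is divided, so the sketch assumes that an emotion is a vector of numeric values, that the neutral state is 0.0 on every dimension, and that the utterance is split into equal segments. The function names `decayed_emotion` and `emotion_per_segment` are hypothetical.

```python
from typing import List, Sequence


def decayed_emotion(initial: Sequence[float], t: float,
                    neutral: float = 0.0) -> List[float]:
    """Linearly interpolate from `initial` (at t=0) to `neutral` (at t=1).

    `t` is the fraction of the utterance already spoken. Representing an
    emotion as a numeric vector with neutral at 0.0 is an assumption made
    for illustration only.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    weight = 1.0 - t  # full emotion at the start, none at the end
    return [neutral + (value - neutral) * weight for value in initial]


def emotion_per_segment(initial: Sequence[float],
                        n_segments: int) -> List[List[float]]:
    """Return one emotion vector per segment of the utterance.

    The first segment carries the full initial emotion and the last one
    is neutral. A single-segment utterance has no room to decay, so it
    keeps the initial emotion (a design choice of this sketch).
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    if n_segments == 1:
        return [list(initial)]
    return [decayed_emotion(initial, i / (n_segments - 1))
            for i in range(n_segments)]


# Example: a two-dimensional emotion decayed over five segments.
for vector in emotion_per_segment([0.8, -0.4], 5):
    print(vector)
```

A constant-emotion turn, as in the original Gracie, would correspond to using the initial vector for every segment; the sketch differs only in scaling that vector down toward neutral as the utterance proceeds.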
Communication|Cognitive psychology|Computer science
Durcholz, Michael Hans, "Adding within-utterance emotion decay for more human-like dialog" (2012). ETD Collection for University of Texas, El Paso. AAI1512564.