In dialog, the proper production of back-channels is an important way for listeners to cooperate with speakers. Quantitative models of this process are needed both for improving spoken dialog systems and for teaching second-language learners. An essential step in developing such models is labeling all back-channels in corpora of human-human dialogs, which is currently done by hand. This report describes a method for automatically identifying back-channels in conversation corpora, using only the patterns of speech and silence by the speaker and the listener in the local context. Tested on Arabic, Spanish, and English, the method identifies most of the actual back-channels, but it also mistakenly identifies many utterances that are not back-channels: across 293 minutes of data in these three languages, the coverage is 70.2% and the accuracy is 40.8%. The method is therefore probably not useful as a fully automatic labeler, but only as a way to reduce the amount of human labor required. The method is parameterized, and slightly better performance can be obtained by replacing the generic, language-independent parameters with language-specific ones.
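
To make the two reported figures concrete, the following minimal sketch shows one way such scores could be computed from a hand-labeled corpus. It assumes, based only on how the terms are used above, that coverage is the fraction of actual back-channels that the method identifies, and that accuracy is the fraction of identified utterances that are actual back-channels. The function name `coverage_and_accuracy` and the utterance IDs are hypothetical and chosen for illustration; they are not data or code from the report.

```python
def coverage_and_accuracy(actual, predicted):
    """Score automatically identified back-channels against hand labels.

    actual:    IDs of utterances hand-labeled as back-channels
    predicted: IDs of utterances the automatic method identified

    Assumed definitions (not confirmed by the report):
      coverage = correctly identified / all actual back-channels
      accuracy = correctly identified / all identified utterances
    """
    actual, predicted = set(actual), set(predicted)
    hits = len(actual & predicted)
    coverage = hits / len(actual) if actual else 0.0
    accuracy = hits / len(predicted) if predicted else 0.0
    return coverage, accuracy


# Toy, made-up utterance IDs for illustration only.
actual = {3, 7, 12, 18, 25}
predicted = {3, 7, 9, 12, 14, 20, 25, 31}

coverage, accuracy = coverage_and_accuracy(actual, predicted)
print(f"coverage={coverage:.1%} accuracy={accuracy:.1%}")
```

Under these assumed definitions, a high coverage paired with a low accuracy means that a human annotator would mostly need to reject false candidates rather than search the whole recording, which is the labor-saving use suggested above.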