Publication Date

9-6-2013

Comments

Technical Report: UTEP-CS-13-58

Abstract

Information retrieval systems rely heavily on models of similarity, but for spoken dialog such models currently use mostly standard textual-content similarity. As part of the MediaEval Benchmarking Initiative, we have created a new corpus to support development of similarity models for spoken dialog. This corpus includes 26 casual dialogs among members of two semi-cohesive groups, totaling about 5 hours, with 1889 labeled regions grouped into 227 sets, each containing regions that annotators judged similar enough to share a tag. This technical report brings together information about this corpus and its intended uses.

Included in

Computer Sciences Commons

Departmental Technical Reports (CS)

Data Collection for the Similar Segments in Social Speech Task
