Departmental Technical Reports (CS)

Dialogs Re-enacted Across Languages, Version 2

Nigel G. Ward, The University of Texas at El Paso
Jonathan E. Avila, The University of Texas at El Paso
Emilia Rivas, The University of Texas at El Paso
Divette Marco, The University of Texas at El Paso

Publication Date

6-1-2023

Comments

Technical Report: UTEP-CS-23-27

Abstract

To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection and its public release, and some observations and musings. This report is intended for:

people using this corpus
people extending this corpus
people designing similar collections of bilingual dialog data.

Change Notes. This version supersedes UTEP-CS-22-108. There is some new information and numerous clarifications, mostly arising from our experiences diversifying our corpus and helping a vendor to use this protocol.

Included in

Computer Sciences Commons, Mathematics Commons
