Australian English Bilingual Corpus: Automatic forced-alignment accuracy in Russian and English

Ksenia Gnevsheva, Simon Gonzalez Ochoa, Robert Fromont

    Research output: Contribution to journal › Article

    Abstract

    This paper introduces the Australian English Bilingual Corpus, a Russian–English spoken corpus, and uses it for a comparison of automatic time alignment between two different languages. Automatic forced alignment is gaining popularity in corpus research as it allows for time-efficient processing of phonetic information. The Language, Brain and Behaviour: Corpus Analysis Tool is one aligner which compares well with others in terms of alignment accuracy. Most of the forced-alignment work has been done with different varieties of English. This paper compares alignment accuracy between Russian and English and discusses aligner settings and data characteristics that affect it. The results suggest higher alignment accuracy for English than Russian. For Russian, alignment accuracy improves with stress specification; that is, when stressed and unstressed vowels are treated as separate categories.
    Original language: English
    Pages (from-to): 182-193
    Journal: Australian Journal of Linguistics
    Volume: 40
    Issue number: 2
    Publication status: Published - 2020
