ParGramBank: The ParGram Parallel Treebank

Sebastian Sulger, Miriam Butt, Tracy Holloway King, Paul Meurer, Tibor Laczko, Gyorgy Rakosi, Cheikh Bamba Dione, Helge Dyvik, Victoria Rosen, Koenraad De Smedt, Agnieszka Patejuk, Ozlem Cetinoglu, I Wayan Arka, Meladel Mistica

    Research output: Contribution to conference (Paper)

    Abstract

    This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena. The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. We thus present a unique, multilayered parallel treebank that represents more languages, and more types of languages, than are available in other treebanks, that represents deep linguistic knowledge, and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.
    Original language: English
    Pages: 550-561
    Publication status: Published - 2013
    Event: Annual Meeting of the Association for Computational Linguistics - Sofia, Bulgaria
    Duration: 1 Jan 2013 → …

