FORINFRA-Nasj.sats. forskn.infrastrukt

Medieval Norwegian Text Corpus

Awarded: NOK 7.0 mill.

Project Number:

195309

Project Period:

2010 - 2013

Menotec (Medieval Norwegian Text Corpus) is a three-year project (2010-2012) which will build a large and balanced corpus of Medieval Norwegian texts, covering the period from around 1150 to around 1550. It will extend the Medieval Nordic Text Archive (http://www.menota.org), which currently contains about 0.5 million words. By the end of 2012, the Menotec corpus will reach a total of 1.5 million transcribed words, of which 1.0 million will have been morphologically annotated and 0.5 million syntactically annotated.

It will become the largest corpus of Medieval Norwegian texts available anywhere, and the amount of linguistic annotation will far exceed that of comparable corpora. It will be an open access archive of national standing, distributed between the two major research centres at the universities of Oslo and Bergen, and it will also be linked to several international projects. Menotec will enable a large number of linguistic and, to some extent, literary projects, and will be instrumental for several ongoing projects in areas such as language history, grammar, lexicography and comparative syntax.

The morphological annotation is based on a Text Encoding Initiative (TEI) compatible scheme developed by Menota and already used for a corpus of 0.2 million words. The syntactic annotation will be based on the PROIEL project (www.hf.uio.no/ifikk/proiel), which has by now built up considerable experience in creating treebanks for ancient languages (Latin, Greek, Gothic, Old Church Slavic). The syntactic annotation scheme used in the PROIEL project is a variant of dependency grammar designed to be compatible with Lexical-Functional Grammar (LFG), and thus compliant with the large-scale LFG-based grammar for Modern Norwegian developed in the NorGram, LOGON and TrePil projects. The morphological annotation will be compatible with the lexicographical work being done by Norsk Ordbok 2014, and it will enable a pan-Nordic meta-dictionary linking the lexicographical resources of Medieval Norwegian, Icelandic, Swedish and Danish.

Publications from Cristin

No publications found

