IS-MOBIL-Mobilitetsprogr.f.utl.Ph.D-stu

Automatic extraction of valency lexicons from Latin corpora

Awarded: NOK 0.13 million

The goal of this project is to create two new linguistic resources for Latin, namely two computational valency lexicons, i.e. dictionaries containing syntactic, semantic, and frequency information on the arguments of verbs. Valency lexicons distinguish transitive verbs from intransitive verbs and record semantic information on the nouns occurring in the subject or object positions. To illustrate their value for traditional linguistics, the lexicons will be used to conduct two studies on a particular aspect of Latin syntax. So far, I have completed the automatic extraction of two valency lexicons, one from the Index Thomisticus Treebank, a syntactically annotated corpus of medieval Latin, and one from the Latin Dependency Treebank, which covers texts of the Classical era. These two lexicons record syntactic information on verbal arguments. My plan is now: 1) to enrich the two lexicons with semantic information on verbs' selectional preferences; 2) to use the lexicons as empirical evidence supporting a linguistic study of Latin syntax in a contrastive perspective, comparing classical Latin and medieval Latin. My stay at the University of Bergen will focus on the methodological aspects of the project and involves adapting advanced computational linguistics and corpus linguistics methods developed for extant languages to data from an extinct and less-resourced language like Latin. It therefore requires knowledge of Latin linguistics as well as of statistical and computational techniques. The environment at the University of Bergen will provide a vital methodological complement to my work. This research will shed light on Latin lexicography and syntax as well as on various areas of Natural Language Processing and computational lexicography, showing the potential benefits of applying computational and statistical methods to Latin linguistics. As such, this project situates itself within the broader field of Language Technologies for Cultural Heritage Data.

Budget purpose:

IS-MOBIL-Mobilitetsprogr.f.utl.Ph.D-stu