FORINFRA-Nasj.sats. forskn.infrastrukt

LIA - Language Infrastructure made Accessible

Awarded: NOK 27.3 mill.

The aim of Language Infrastructure made Accessible (LIA) was to gather as many recordings as possible from the four universities: the University of Oslo, the University of Bergen, the University of Tromsø and the Norwegian University of Science and Technology.

Illustration: a tape with a sound signal above it and a sky above that. Photo: Tekstlaboratoriet.

About the Project

During the last 60 years, speech recordings for various purposes, mainly dialect and onomastic research, have been made across Norway by Norwegian universities, covering many Norwegian and Sami varieties. While some of these have now been digitised and catalogued in a systematic manner, others lie in archive cabinets and drawers. Many of them are in danger of being destroyed.

Objectives

The recordings gathered by the project were digitised by the National Library, and then inventoried and catalogued. The most interesting recordings of good audio quality were further processed by the LIA project: they were transcribed, tagged, parsed and text-sound synchronised. Finally, the audio files and the transcribed recordings were published in the New Glossa, a user-friendly corpus search interface. The recordings and transcriptions are also freely available via a download page.

The recordings in LIA are of two types:

Diachronic data: Dialect recordings from throughout Norway, including recordings in Sami.

Norwegian in America: Recordings from fieldwork in America from 1931 to the present.

The goal of the Language Infrastructure made Accessible (LIA) project is to adapt currently unavailable Norwegian and Sami language data into a user-friendly research infrastructure that will be accessible to researchers, industry and a larger audience alike. Without this kind of infrastructure, Norwegian industry and academia will be delayed in developing urgently needed technology like Norwegian and Sami software for speech and language recognition and production, machine translation, and dialogue systems. Both technological and philological research (for the improvement of dictionaries, grammars, and textbooks) and development depend on efficient available methodology, and the LIA infrastructure will provide a solid step forward to achieve this.

All societal groups of all ages will benefit from better language technology, for example for getting correct information in one's own language on the internet. People with disabilities are a group that will benefit from better dialogue systems that include automatic speech recognition and synthesis.

The project leader is a renowned professor of linguistics and an experienced leader, who is part of the core group of the recently opened Centre of Excellence at the UiO, Center for Multilingualism in Society across the Lifespan (MultiLing), which will offer an excellent international and scientific environment for a collaborative infrastructure project of this scale. The LIA project is further strengthened by its extensive national and international collaborative scheme, being a joint effort of the UiO, UiB, UiT and NTNU, plus Norsk Ordbok 2014, the National Library and UNINETT Sigma, with international partners including Humboldt University, University of Wisconsin and Pennsylvania State University, USA, plus the universities of Odense, Uppsala and Gothenburg. LIA will develop an advanced scientific database with digitised sound files, systematised meta-information on informants, place, linguistic properties, etc.

Funding scheme:

FORINFRA-Nasj.sats. forskn.infrastrukt