FORINFRA-Nasj.sats. forskn.infrastrukt

Common Language Resources and Technology Infrastructure Norway

Awarded: NOK 25.0 mill.

The project will realize the Norwegian part of the CLARIN distributed infrastructure. Given a heterogeneous landscape of digital language data to be explored and reused, the main challenges are preservation, cataloguing, access and analysis. Grid-based technologies, web services, secure storage and language analysis tools have matured to a point where large amounts of data can be made accessible to researchers across systems and borders. Within Norway, a network of centres will be established, each with clear national responsibilities in its area. A-centres will provide basic infrastructure services to connect to national and international CLARIN nodes, and to find, store and catalogue data securely and persistently. B-centres will provide language data services to filter, present, download or upload data, as well as a wide range of language analysis tools with transparent access to high-performance computing (HPC) when needed. C-centres will contribute data, metadata and tools. The project has the necessary expertise, data and tools, but combining all of these into an integrated and interoperable infrastructure will be a task of national proportions. Data and metadata formats will be standardized and made CLARIN-compatible for central harvesting. Platforms for accessing corpora and digital editions will be upgraded from simple search boxes into complete data management workflows with aggregation and user-tailored presentation of results. Tools will be adapted and connected flexibly and intelligently to enable custom analysis of data. The data and metadata themselves will be harvested and preserved in a national archive with persistent identifiers (PIDs) and secure storage.

Publications from Cristin

No publications found

Funding scheme:

FORINFRA-Nasj.sats. forskn.infrastrukt

Funding Sources