Monitoring the state of the environment in the face of rapid global change requires massive amounts of data collected over large spatial scales. Processing these data manually is impossible given cost and manpower constraints, so a wide range of digital tools must be adopted. In other words, unlocking “Big Data” requires substantial technical infrastructure. One element is the development of databases and dataflow procedures that allow data to be uploaded, processed, and stored in secure environments. As part of this work, the Big Picture project is working with three main data storage platforms. “Agouti” is developed and maintained by Wageningen University in the Netherlands as a centralised data storage platform. “Trapper”, developed by the Mammal Research Institute in Poland, is an open-source solution maintained on multiple institutional servers by different users. A third system, “Scandcam”, is maintained by the Norwegian Institute for Nature Research as a closed institutional solution tailored to its own data and specific needs. The Big Picture project is helping all three platforms to improve their functionality in a coordinated manner.
The Big Picture concept is based on the idea that individual institutions will need to adopt solutions that work for their specific needs. However, bringing data together from different systems for a common analysis requires a common data package standard that permits data to be exported in a standard manner, ensuring interoperability. Accordingly, the project has developed and published Camtrap DP as a data standard, which is now integrated as an export format into all three systems. Developments now underway include an interface with the Global Biodiversity Information Facility (GBIF) to enable a metadata hub and permit global data sharing.
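To make the idea of a shared export format concrete, the sketch below shows what a minimal Camtrap DP-style package descriptor could look like. Camtrap DP builds on the general Data Package pattern of a JSON metadata descriptor plus tabular data resources; the specific field values and file names here are illustrative assumptions, not a definitive rendering of the published specification.

```python
import json

# Minimal sketch of a Camtrap DP-style package descriptor.
# The exact required fields are defined by the published Camtrap DP
# profile; the values below are purely illustrative.
descriptor = {
    "profile": "camtrap-dp",
    "name": "example-camera-trap-dataset",
    "created": "2024-01-15T00:00:00Z",
    "contributors": [{"title": "Example Institute", "role": "rightsHolder"}],
    "resources": [
        {"name": "deployments", "path": "deployments.csv", "format": "csv"},
        {"name": "media", "path": "media.csv", "format": "csv"},
        {"name": "observations", "path": "observations.csv", "format": "csv"},
    ],
}

# Serialising the descriptor gives the metadata file an exporting
# platform would ship alongside its tabular data resources.
print(json.dumps(descriptor, indent=2))
```

Because every platform emits the same descriptor structure and resource tables, a downstream analysis can ingest exports from Agouti, Trapper, or Scandcam without platform-specific parsing code.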
The greatest potential for an increase in data-processing efficiency concerns the adoption of Artificial Intelligence. The primary need is to use AI to identify and delete or obscure images of humans, as well as to remove blank images. The secondary need is to identify the species of wild animal in each image. Beyond these come a host of additional options, such as estimating the distance of the animal from the camera, identifying its age and sex, and extracting information from the image background such as snow cover or vegetation greenness. Currently, the first capability is enabled in most data processing and storage platforms. Good AI models for species recognition exist, such as DeepFaune, developed by our French partners, and these are undergoing final modifications to cover the full range of species present across Europe. The final versions will then be integrated into the data processing platforms. At present, their accuracy varies from 85% to 95% depending on the species. The remaining questions are how to deal with this uncertainty when analysing the data, and whether human validation must be kept in the loop to reduce errors further. The additional AI applications are at various stages of development, but none have yet been integrated into the platforms.
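One common way to keep human validation in the loop while still benefiting from automation is confidence-based triage: predictions above a chosen threshold are accepted automatically, while the rest are queued for expert review. The sketch below illustrates this idea; the threshold value, record structure, and file names are assumptions for illustration and do not reflect any specific platform's implementation.

```python
# Sketch of confidence-based triage for AI species predictions.
# A 0.90 threshold is an illustrative choice, not a recommendation.
CONFIDENCE_THRESHOLD = 0.90

# Hypothetical classifier output: one record per image.
predictions = [
    {"image": "img_001.jpg", "species": "red fox", "confidence": 0.97},
    {"image": "img_002.jpg", "species": "roe deer", "confidence": 0.88},
    {"image": "img_003.jpg", "species": "wolf", "confidence": 0.62},
]

def triage(preds, threshold=CONFIDENCE_THRESHOLD):
    """Split predictions into auto-accepted and human-review queues."""
    accepted = [p for p in preds if p["confidence"] >= threshold]
    review = [p for p in preds if p["confidence"] < threshold]
    return accepted, review

accepted, review = triage(predictions)
print(f"auto-accepted: {len(accepted)}, sent to human review: {len(review)}")
```

The threshold can be tuned per species: a species the model identifies with 95% accuracy might warrant a lower review rate than one identified at 85%, concentrating scarce expert time where the model is weakest.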
The lessons learnt so far are that the benefits of digitalisation are real, but they are not low-hanging fruit. Many years of development have gone into producing functional databases and AI models, and just because a technology exists does not mean it is ready for immediate application. However, an international project like Big Picture brings multiple research teams together to exchange experience and combine their efforts more efficiently, producing practical solutions that work.
Finding pathways for human-biodiversity coexistence in Europe requires up-to-date knowledge of species status, distribution, relative abundance, and interactions with humans. Effective conservation requires continental-scale coordination, which in turn requires continental-scale data. This can only be achieved with methods that (1) can target many species at the same time, and (2) can make use of data collected for many different purposes by a diversity of professional and citizen scientists. Digital camera traps are one such tool, the use of which has exploded in recent years. However, data processing tools and data sharing procedures are not yet developed enough to allow efficient classification, storage, and sharing of these data. It is also unclear to what extent data collected under different regimes can be compared. This project proposes a set of four interlinked work packages that will: (1) explore legal, institutional, and social constraints on data sharing, with a view to identifying pathways that facilitate making data as open and available as possible; (2) develop efficient, AI-enabled database structures that facilitate the processing of raw data, the safe storage of those data, and export in formats that conform to emerging data standards, facilitating data sharing and comparative analysis; (3) explore statistical analysis tools and procedures that maximise the integration of data collected under different protocols into common analyses, essentially determining which data, on which species, can support which inferences; and (4) conduct a set of demonstration analyses that reveal the possibilities and added value obtained when data are pooled across projects and countries. These illustrative analyses will cover a range of biodiversity policy areas, including One Health, climate change, invasive species, Natura 2000 site management, and conservation of Habitats Directive listed species.