Visually overviewing biodiversity open data digital collections


After years of mass digitisation initiatives in natural history institutions, large biodiversity collections have emerged on the web as open data. Studies on climate change and nature conservation rely heavily on this data to understand species distribution, presence/absence, changes over time, and interactions, as well as community ecology. For the institutions that hold this data, exploring and verifying the records they produce is critical to supporting new modes of studying, analysing, and accessing biodiversity information. However, data verification is challenging because of the complex relationships within the data, which make it difficult to diagnose the completeness, correctness, and coverage of the domain. To date, it remains unclear to what extent existing visualisation techniques can systematically support data verification. To support research in this area, this paper reviews visualisation solutions, focusing on a function-based visual exploration concept that can be integrated into a data verification pipeline for biodiversity datasets. Beyond reviewing the state of the art, we describe a data verification pipeline that follows this concept for the biodiversity collections of the National Museum/Federal University of Rio de Janeiro, Brazil. The pipeline is targeted at domain experts, supporting strategic decisions on data maintenance, and also has the potential to help general users contextualise the datasets.

