Dataset Search In Biodiversity Research: Do Metadata In Data Repositories Reflect Scholarly Information Needs?

02/27/2020
by   Felicitas Löffler, et al.

The increasing amount of research data provides the opportunity to link and integrate data to create novel hypotheses, to repeat experiments, or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for reuse is a time-consuming task in daily research practice. In this study, we explore what hampers dataset retrieval in biodiversity research, a field that produces a large amount of heterogeneous data. We analyze metadata, the primary source in dataset search, and determine whether they reflect scholarly search interests. We examine whether metadata standards provide elements corresponding to search interests, we inspect whether selected data repositories use metadata standards that represent scholarly interests, and we determine how many fields of the metadata standards in use are actually filled. To determine search interests in biodiversity research, we gathered 169 questions that researchers aimed to answer with the help of retrieved data, identified the biological entities in them, and grouped these into 13 categories. Our findings indicate that environments, materials and chemicals, species, biological and chemical processes, locations, data parameters, and data types are important search interests in biodiversity research. The comparison with existing metadata standards shows that domain-specific standards cover search interests quite well, whereas general standards do not explicitly contain elements that reflect search interests. We also inspect metadata from five large data repositories. Our results confirm that metadata currently reflect search interests in biodiversity research poorly. From these findings, we derive recommendations for researchers and data repositories on how to bridge the gap between search interests and the metadata provided.


