Dataset search: a survey

01/03/2019
by Adriane Chapman, et al.

Generating value from data requires the ability to find, access and make sense of datasets. There are many efforts underway to encourage data sharing and reuse, from scientific publishers asking authors to submit data alongside manuscripts to data marketplaces, open data portals and data communities. Google recently released a beta search service for datasets, which allows users to discover data stored in various online repositories via keyword queries. These developments foreshadow an emerging research field around dataset search or retrieval that broadly encompasses frameworks, methods and tools that help match a user's data need against a collection of datasets. Here, we survey the state of the art of research and commercial systems in dataset retrieval. We identify what makes dataset search a research field in its own right, with unique challenges and methods, and highlight open problems. We look at approaches and implementations from related areas that dataset search draws upon, including information retrieval, databases, and entity-centric and tabular search, in order to identify possible paths to resolve these open problems as well as immediate next steps that will take the field forward.


