A Framework for Evaluating Snippet Generation for Dataset Search

07/02/2019
by Xiaxia Wang, et al.

Reusing existing datasets is of considerable significance to researchers and developers. Dataset search engines help a user find relevant datasets for reuse. They can present a snippet for each retrieved dataset to explain its relevance to the user's data needs. This emerging problem of snippet generation for dataset search has not received much research attention. To provide a basis for future research, we introduce a framework for quantitatively evaluating the quality of a dataset snippet. The proposed metrics assess the extent to which a snippet matches the query intent and covers the main content of the dataset. To establish a baseline, we adapt four state-of-the-art methods from related fields to our problem, and perform an empirical evaluation based on real-world datasets and queries. We also conduct a user study to verify our findings. The results demonstrate the effectiveness of our evaluation framework, and suggest directions for future research.
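To make the two evaluation criteria concrete, here is a minimal sketch of how "matches the query intent" and "covers the main content of the dataset" might each be scored as a coverage ratio. This is not the paper's metric definition, which the abstract does not give. The function names, the whitespace tokenizer, the plain-text view of a dataset, and the `top_k` parameter are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def query_coverage(snippet, query):
    """Fraction of distinct query keywords that also appear in the snippet.

    Hypothetical stand-in for "matches the query intent".
    """
    keywords = set(tokenize(query))
    if not keywords:
        return 0.0
    return len(keywords & set(tokenize(snippet))) / len(keywords)


def content_coverage(snippet, dataset_text, top_k=5):
    """Fraction of the dataset's top_k most frequent terms found in the snippet.

    Hypothetical stand-in for "covers the main content of the dataset";
    treating the dataset as flat text is an assumption of this sketch.
    """
    counts = Counter(tokenize(dataset_text))
    top_terms = {term for term, _ in counts.most_common(top_k)}
    if not top_terms:
        return 0.0
    return len(top_terms & set(tokenize(snippet))) / len(top_terms)
```

Under these assumptions, a snippet that repeats every query keyword but none of the dataset's dominant terms would score high on the first ratio and low on the second, which is the kind of trade-off a two-sided evaluation is meant to expose.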


