Frost: Benchmarking and Exploring Data Matching Results

07/22/2021
by Martin Graf, et al.

"Bad" data has a direct impact on 88% of companies, with the average company losing 12% of its revenue due to it. Duplicates - multiple but different representations of the same real-world entities - are among the main reasons for poor data quality. Therefore, finding and configuring the right deduplication solution is essential. Various data matching benchmarks exist that address this issue. However, many of them focus on the quality of matching results and neglect other important factors, such as business requirements. Additionally, they often do not specify how to explore benchmark results, even though such exploration helps in understanding the behavior of a matching solution. To bridge this gap between the mere counting of record pairs and a comprehensive means of evaluating data matching approaches, we present the benchmark platform Frost. Frost combines existing benchmarks, established quality metrics, a benchmark dimension for soft KPIs, and techniques to systematically explore and understand matching results. Thus, it can be used to compare multiple matching solutions regarding quality, usability, and economic aspects, and also to compare multiple runs of the same matching solution to understand its behavior. Frost is implemented and published in the open-source application Snowman, which includes the visual exploration of matching results.
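To make "the mere counting of record pairs" concrete, the following is a minimal sketch of pair-based precision, recall, and F1 for one matching run against a gold standard. It is not the Frost or Snowman API: the function names (`pairs_from_clusters`, `pair_metrics`), the record IDs, and the example clusters are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pairs_from_clusters(clusters):
    """Expand duplicate clusters into a set of unordered record pairs.

    Hypothetical helper: each cluster is a set of record IDs that a
    matching solution (or the gold standard) considers the same entity.
    """
    pairs = set()
    for cluster in clusters:
        # Sorting makes (a, b) and (b, a) collapse to the same tuple.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pair_metrics(predicted, gold):
    """Count record pairs and derive precision, recall, and F1."""
    tp = len(predicted & gold)   # pairs correctly reported as duplicates
    fp = len(predicted - gold)   # pairs wrongly reported as duplicates
    fn = len(gold - predicted)   # true duplicate pairs that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up gold standard and matching run, for illustration only.
gold = pairs_from_clusters([{"r1", "r2", "r3"}, {"r4", "r5"}])
run_a = pairs_from_clusters([{"r1", "r2"}, {"r4", "r5"}, {"r6", "r7"}])

metrics = pair_metrics(run_a, gold)
missed = sorted(gold - run_a)  # the pairs one would inspect when exploring
print(metrics)
print("missed pairs:", missed)
```

Numbers like these summarize a run but do not explain it. Listing the concrete false positives and false negatives, as `missed` does here, is the kind of exploration the abstract argues a benchmark should support in addition to the metrics themselves.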


