Toward a view-based data cleaning architecture

10/24/2019
by Toshiyuki Shimizu, et al.

Big data analysis has become an active area of study with the growth of machine learning techniques. Proper analysis depends on high-quality data, which makes data cleaning an important research topic as well. Inconsistent values are difficult to detect and correct automatically when the data requires expert knowledge or was created by many contributors, as is the case for data integrated from heterogeneous sources. One example is metadata for scientific datasets, which data managers should verify as they handle the data. To help data managers clean such data efficiently, we propose a data cleaning architecture in which they interactively browse and correct portions of the data through views. In this paper, we explain our view-based data cleaning architecture and discuss some remaining issues.
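The idea of correcting data through a view can be illustrated with a minimal sketch. This is not the architecture proposed in the paper, only one possible reading of "browse and correct portions of data through views", using Python's standard sqlite3 module. The table dataset_metadata, the view temperature_units, the trigger, the helper correct_unit, and the sample rows are all hypothetical names and values chosen for illustration.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a base table and a review view.

    All table, view, and column names here are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dataset_metadata (
            id INTEGER PRIMARY KEY,
            title TEXT,
            unit TEXT,
            source TEXT
        );
        INSERT INTO dataset_metadata (id, title, unit, source) VALUES
            (1, 'Sea surface temperature', 'degC', 'provider_a'),
            (2, 'Air temperature', 'celsius', 'provider_b'),
            (3, 'Precipitation', 'mm', 'provider_a');

        -- The view exposes only the portion a data manager reviews.
        CREATE VIEW temperature_units AS
            SELECT id, title, unit
            FROM dataset_metadata
            WHERE title LIKE '%temperature%';

        -- SQLite views are read-only, so an INSTEAD OF trigger
        -- forwards a correction made on the view to the base table.
        CREATE TRIGGER temperature_units_update
        INSTEAD OF UPDATE OF unit ON temperature_units
        BEGIN
            UPDATE dataset_metadata SET unit = NEW.unit WHERE id = OLD.id;
        END;
    """)
    return conn


def correct_unit(conn, record_id, new_unit):
    """Apply a correction through the view rather than the base table."""
    conn.execute(
        "UPDATE temperature_units SET unit = ? WHERE id = ?",
        (new_unit, record_id),
    )
    conn.commit()


conn = build_demo()
# The data manager notices the inconsistent unit spelling and fixes it.
correct_unit(conn, 2, "degC")
rows = conn.execute(
    "SELECT id, unit FROM dataset_metadata ORDER BY id"
).fetchall()
view_rows = conn.execute(
    "SELECT id, unit FROM temperature_units ORDER BY id"
).fetchall()
```

In this sketch the data manager only ever touches the view, and the trigger decides how a correction maps back to the underlying data. How such corrections should propagate in general is the kind of question a view-based cleaning architecture has to settle.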


