Metadata Systems for Data Lakes: Models and Features

09/20/2019
by Pegdwendé Sawadogo, et al.

Over the past decade, the data lake concept has emerged as an alternative to data warehouses for storing and analyzing big data. A data lake stores data without any predefined schema, so data querying and analysis depend on a metadata system that must be efficient and comprehensive. However, metadata management in data lakes remains an open issue, and criteria for evaluating its effectiveness are largely nonexistent. In this paper, we introduce MEDAL, a generic, graph-based model for metadata management in data lakes. We also propose evaluation criteria for data lake metadata systems in the form of a list of expected features. Finally, we show that our approach is more comprehensive than existing metadata systems.
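
To make the idea of graph-based metadata concrete, here is a minimal sketch of a catalog in which stored objects are nodes carrying descriptive attributes and relationships between them are labeled edges. It is not the MEDAL model itself, whose details the abstract does not give. The class name `MetadataGraph`, its methods, the file names, and the `derived_from` relation are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataGraph:
    """Toy graph-based metadata catalog (illustrative assumption, not MEDAL)."""
    nodes: dict = field(default_factory=dict)  # object id -> descriptive attributes
    edges: dict = field(default_factory=dict)  # object id -> set of (relation, target id)

    def add_object(self, obj_id, **attributes):
        # Register a stored object together with its descriptive metadata.
        self.nodes[obj_id] = dict(attributes)
        self.edges.setdefault(obj_id, set())

    def link(self, source, relation, target):
        # Record a labeled, directed relationship between two registered objects.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both objects must be registered before linking")
        self.edges[source].add((relation, target))

    def find(self, **criteria):
        # Return ids of objects whose attributes match every criterion.
        return sorted(
            obj_id for obj_id, attrs in self.nodes.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        )

    def neighbors(self, obj_id, relation):
        # Follow edges with the given label starting from obj_id.
        return sorted(t for r, t in self.edges.get(obj_id, ()) if r == relation)


# Hypothetical lake contents: the files themselves carry no schema,
# so they can only be located through the metadata recorded here.
catalog = MetadataGraph()
catalog.add_object("sales_2019.csv", format="csv", topic="sales")
catalog.add_object("sales_2019_clean.parquet", format="parquet", topic="sales")
catalog.add_object("reviews.json", format="json", topic="feedback")
catalog.link("sales_2019_clean.parquet", "derived_from", "sales_2019.csv")

print(catalog.find(topic="sales"))
print(catalog.neighbors("sales_2019_clean.parquet", "derived_from"))
```

Because the lake imposes no schema on the files, both lookups go through the metadata alone: `find` filters on node attributes, and `neighbors` follows a labeled edge to recover lineage.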


