ArchaeoDAL: A Data Lake for Archaeological Data Management and Analytics

07/23/2021
by   Pengfei Liu, et al.

With emerging technologies such as satellites and drones, archaeologists collect data over large areas. However, processing such data in a timely manner becomes difficult. Archaeological data also come in many different formats (images, texts, sensor data) and can be structured, semi-structured, or unstructured. Such variety makes data difficult to collect, store, manage, search, and analyze effectively. A few approaches have been proposed, but none of them covers the full data lifecycle or provides an efficient data management system. Hence, we propose the use of a data lake to provide centralized data stores that host heterogeneous data, as well as tools for data quality checking, cleaning, transformation, and analysis. In this paper, we propose a generic, flexible, and complete data lake architecture. Our metadata management system exploits goldMEDAL, which is the most complete metadata model currently available. Finally, we detail the concrete implementation of this architecture dedicated to an archaeological project.

research
05/10/2019

Metadata Management for Textual Documents in Data Lakes

Data lakes have emerged as an alternative to data warehouses for the sto...
research
09/03/2021

Joint Management and Analysis of Textual Documents and Tabular Data within the AUDAL Data Lake

In 2010, the concept of data lake emerged as an alternative to data ware...
research
03/24/2021

Coining goldMEDAL: A New Contribution to Data Lake Generic Metadata Modeling

The rise of big data has revolutionized data exploitation practices and ...
research
11/06/2018

Architecture of Distributed Data Storage for Astroparticle Physics

For the successful development of the astrophysics and, accordingly, for...
research
09/14/2019

Harmonise and integrate heterogeneous areal data with the R package arealDB

Areal data is a common data type to store information such as biodiversi...
research
01/24/2023

FUSEE: A Fully Memory-Disaggregated Key-Value Store (Extended Version)

Distributed in-memory key-value (KV) stores are embracing the disaggrega...
