An Approach to Handle Big Data Warehouse Evolution

09/12/2018
by Darja Solodovnikova, et al.

One of the purposes of Big Data systems is to support analysis of data gathered from heterogeneous data sources. Since data warehouses have been used for several decades to achieve the same goal, they could also be leveraged to analyze data stored in Big Data systems. The problem of adapting data warehouse data and schemata to changes in analysis requirements and in data sources has been studied by many researchers. However, new methods are still needed to support the evolution of data warehouses that are used to analyze data stored in Big Data systems. In this paper, we propose a data warehouse architecture that supports different kinds of analytical tasks, including OLAP-like analysis, on big data loaded from multiple heterogeneous data sources with differing latency, and that is capable of processing changes in data sources as well as evolving analysis requirements. The operation of the architecture relies heavily on metadata, which are outlined in the paper.


