Leveraging Data Preparation, HBase NoSQL Storage, and HiveQL Querying for COVID-19 Big Data Analytics Projects

04/01/2020
by Karim Baïna, et al.

Epidemiologists, scientists, statisticians, historians, data engineers, and data scientists are working to find descriptive models and theories that explain the spread of COVID-19, or to build predictive analytics models that estimate the peak of the curves for confirmed cases, recovered cases, and deaths. In the CRISP-DM life cycle, 75% of the time is consumed by the data preparation phase, which puts considerable pressure and stress on the scientists and data scientists building machine learning models. This paper aims to reduce data preparation effort by presenting detailed schema designs and data preparation scripts for formatting and storing the Johns Hopkins University COVID-19 daily data in the HBase NoSQL data store, and for enabling HiveQL querying of COVID-19 data in a relational, SQL-like style.

