High-Performance Mining of COVID-19 Open Research Datasets for Text Classification and Insights in Cloud Computing Environments

09/16/2020
by Jie Zhao, et al.

The COVID-19 global pandemic is an unprecedented health crisis. Since the outbreak, researchers around the world have produced an extensive body of literature. For the research community and the general public to digest it, the text must be analysed and insights provided in a timely manner, which requires a considerable amount of computational power. Cloud computing has been widely adopted in academia and industry in recent years. In particular, hybrid cloud is gaining popularity because of its two-fold benefits: utilising existing resources to save cost, and using additional cloud service providers to gain access to extra computing resources on demand. In this paper, we developed a system that uses the Aneka PaaS middleware, with its parallel processing and multi-cloud capability, to accelerate the ETL and article categorisation process using machine learning technology on a hybrid cloud. The results are then persisted for further referencing, searching and visualising. Our performance evaluation shows that the system can reduce processing time and achieve linear scalability. Beyond COVID-19, the application might be used directly for broader scholarly article indexing and analysis.


