Subscribing to Big Data at Scale

09/10/2020
by Xikui Wang, et al.

Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, users need either to heavily customize an existing passive Big Data system or to glue multiple systems together. Either choice would require significant effort from users and incur additional overhead. In this paper, we present the BAD (Big Active Data) system, which is designed to preserve the merits of passive Big Data systems and introduce new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system's performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a "glued" system.

02/22/2020

BAD to the Bone: Big Active Data at its Core

Virtually all of today's Big Data systems are passive in nature, respond...
01/06/2021

Bridging BAD Islands: Declarative Data Sharing at Scale

In many Big Data applications today, information needs to be actively sh...
12/19/2017

Passive and Active Observation: Experimental Design Issues in Big Data

Data can be collected in scientific studies via a controlled experiment ...
04/23/2018

Succinct Oblivious RAM

Reducing the database space overhead is critical in big-data processing....
02/21/2019

An IDEA: An Ingestion Framework for Data Enrichment in AsterixDB

Big Data today is being generated at an unprecedented rate from various ...