Approximate Computation for Big Data Analytics

01/02/2019
by Shuai Ma, et al.

Over the past few years, research and development have made significant progress in big data analytics. A fundamental issue for big data analytics is efficiency. If the optimal solution is unattainable, not required, or comes at a price too high to pay, it is reasonable to trade optimality for a 'good' feasible solution that can be computed efficiently. Existing approximation techniques can in general be classified into approximation algorithms, approximate query processing for aggregate SQL queries, and approximate computing for multiple layers of the system stack. In this article, we systematically introduce approximate computation, i.e., query approximation and data approximation, for efficient and effective big data analytics. We first explain the idea and rationale of query approximation, and show that efficiency can be obtained with high effectiveness in practice using three analytic tasks: graph pattern matching, trajectory compression, and dense subgraph computation. We then explain the idea and rationale of data approximation, and show that efficiency can be obtained even without sacrificing effectiveness in practice using three analytic tasks: shortest paths/distances, network anomaly detection, and link prediction.
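To make the efficiency-for-optimality trade-off concrete, here is a minimal sketch of one of the existing technique classes the abstract mentions: sampling-based approximate query processing for an aggregate (SUM) query. It is not taken from the article. The function name `approximate_sum`, its parameters, and the choice of uniform sampling with a normal-approximation 95% interval are all illustrative assumptions.

```python
import math
import random
import statistics


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform random sample (hypothetical helper).

    Returns (estimate, half_width), where half_width is an approximate
    95% confidence half-width based on the normal approximation.
    Requires 2 <= sample_size <= len(values).
    """
    rng = random.Random(seed)
    n = len(values)
    # Uniform sampling without replacement.
    sample = rng.sample(values, sample_size)
    # Scale the sample mean up to the full population size.
    estimate = n * statistics.fmean(sample)
    # Finite population correction: the error shrinks to zero
    # as the sample approaches the whole data set.
    fpc = math.sqrt((n - sample_size) / (n - 1))
    std_err = n * statistics.stdev(sample) / math.sqrt(sample_size) * fpc
    return estimate, 1.96 * std_err


if __name__ == "__main__":
    data_rng = random.Random(42)
    data = [data_rng.uniform(0, 100) for _ in range(200_000)]
    exact = sum(data)
    estimate, half_width = approximate_sum(data, sample_size=2_000)
    print(f"exact    = {exact:.1f}")
    print(f"estimate = {estimate:.1f} +/- {half_width:.1f}")
```

The sketch reads only 1% of the values yet reports an estimate together with an error bound, which is the kind of 'good' feasible answer the abstract has in mind when the exact one is too costly.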

research
04/11/2022

The Principle of Least Sensing: A Privacy-Friendly Sensing Paradigm for Urban Big Data Analytics

With the worldwide emergence of data protection regulations, how to cond...
research
01/15/2022

Towards a Conceptual Approach of Analytical Engineering for Big Data

Analytics corresponds to a relevant and challenging phase of Big Data. T...
research
08/20/2018

Loss Data Analytics

Loss Data Analytics is an interactive, online, freely available text. Th...
research
11/08/2021

"If we didn't solve small data in the past, how can we solve Big Data today?"

Data is a critical aspect of the world we live in. With systems producin...
research
12/17/2021

Reproducible and Portable Big Data Analytics in the Cloud

Cloud computing has become a major approach to help reproduce computatio...
research
07/29/2018

MISS: Finding Optimal Sample Sizes for Approximate Analytics

Nowadays, sampling-based Approximate Query Processing (AQP) is widely re...
research
12/12/2015

Active Sampler: Light-weight Accelerator for Complex Data Analytics at Scale

Recent years have witnessed amazing outcomes from "Big Models" trained b...
