Providing Insights for Queries affected by Failures and Stragglers

02/04/2020
by Bruhathi Sundarmurthy, et al.

Interactive response times are a crucial requirement for users analyzing large amounts of data. Such analytical queries are typically run in a distributed setting, with data sharded across thousands of nodes for high throughput. However, providing real-time analytics remains a major challenge: with data distributed across thousands of nodes, the probability that some of the required nodes are unavailable or very slow during query execution is very high, and such unavailability may result in slow execution or even query failure. The sheer magnitude of data and users increases resource contention, which exacerbates the phenomenon of stragglers and node failures during execution. In this paper, we propose a novel solution to alleviate the straggler/failure problem that exploits existing efficient partitioning properties of the data, in particular co-hash partitioning, and provides approximate answers along with confidence bounds for queries affected by failed or straggler nodes. We consider aggregate queries that involve joins, group-bys, HAVING clauses, and a subclass of nested subqueries. Finally, we validate our approach through extensive experiments on the TPC-H dataset.


