SparkGOR: A unified framework for genomic data analysis

08/31/2020
by Sigmar K. Stefánsson, et al.

Motivation: Our goal was to combine the capabilities of Spark and GOR into a single computing framework for the analysis of large-scale genomic data.

Results: We have created a relational query engine that unites SparkSQL and GORpipe in a single declarative query framework. This is achieved by allowing SQL expressions to be embedded in the high-level relational statement syntax of GOR, and by supporting virtual relations and nested GORpipe expressions within SQL. Furthermore, we have built drivers that enable Spark and GOR to use their preferred file formats, Parquet and GORZ respectively, and introduced APIs that allow GOR to be used with Spark dataframes.

Availability: The SparkGOR version of the GORpipe software is open-source and freely available at https://gorpipe-website.now.sh and https://github.com/gorpipe.
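The idea of a virtual relation, where the output of a nested pipeline expression is referenced by name inside a SQL query, can be illustrated with a small sketch. This is not SparkGOR code and does not use GOR or SparkSQL syntax: Python's built-in sqlite3 module stands in for the SQL engine, and the function variant_pipeline, the helper query_with_virtual_relation, the relation name variants, and the sample rows are all hypothetical, chosen only for illustration.

```python
import sqlite3


def variant_pipeline():
    """Hypothetical stand-in for a nested pipeline expression.

    Produces (chrom, pos, ref, alt) rows and applies one filter step.
    This is plain Python, not GORpipe syntax.
    """
    rows = [
        ("chr1", 40, "T", "C"),
        ("chr1", 100, "A", "G"),
        ("chr1", 250, "C", "T"),
        ("chr2", 150, "G", "A"),
    ]
    # Filter step: keep only variants at position 100 or later.
    return [row for row in rows if row[1] >= 100]


def query_with_virtual_relation(sql, name, rows):
    """Expose `rows` to SQL under the relation `name`, then run `sql`.

    Assumption: materializing into a temporary table is only one possible
    way to realize a virtual relation; it is used here for simplicity.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            f"CREATE TEMP TABLE {name} "
            "(chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
        )
        conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# The SQL text refers to `variants` as if it were an ordinary table,
# although its rows come from the pipeline function above.
counts = query_with_virtual_relation(
    "SELECT chrom, COUNT(*) FROM variants GROUP BY chrom ORDER BY chrom",
    "variants",
    variant_pipeline(),
)
print(counts)
```

The point of the sketch is the separation of concerns: the SQL statement stays declarative and unaware of how the relation is produced, while the pipeline side decides which rows exist.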


