ensemblQueryR: fast, flexible and high-throughput querying of Ensembl LD API endpoints in R

by Aine Fairbrother-Browne, et al.

We present ensemblQueryR, an R package that provides an interface to the linkage disequilibrium (LD) endpoints of the Ensembl REST API. It is designed for flexible, fast and user-friendly querying that integrates readily into R workflows, and it is optimised for high-throughput use. ensemblQueryR achieves this through intuitive functions that are amenable to integration with custom code, familiar R object types as inputs and outputs, code optimisation and optional parallelisation. For each LD endpoint, ensemblQueryR provides two functions, one for single-query and one for multi-query operation. The multi-query functions are optimised for large query sizes and offer optional parallelisation to make use of available computational resources and minimise processing time. We demonstrate that ensemblQueryR improves on analogous software in both speed and random access memory (RAM) usage, delivering a 10-fold speed increase whilst using a third of the RAM. Finally, ensemblQueryR is near-agnostic to operating system and computational architecture through the availability of Docker and Singularity images, making the tool widely accessible to the scientific community.
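To make the single-query and multi-query modes described above concrete, the following is a minimal Python sketch of the general pattern, not the ensemblQueryR interface itself (which is an R package whose function names are not given here). It assumes the Ensembl REST pairwise LD endpoint follows the path layout `/ld/:species/pairwise/:id1/:id2` on `https://rest.ensembl.org`. The names `ld_pairwise_url`, `query_one` and `query_many` are hypothetical, and the HTTP call is passed in as a `fetch` argument so that the sketch stays independent of any particular client library.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote

# Assumed base URL of the public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def ld_pairwise_url(rsid_1, rsid_2, species="homo_sapiens"):
    """Build the URL for one pairwise LD query (assumed path layout)."""
    return (
        f"{ENSEMBL_REST}/ld/{quote(species)}/pairwise/"
        f"{quote(rsid_1)}/{quote(rsid_2)}"
    )


def query_one(rsid_1, rsid_2, fetch):
    """Single-query mode: one variant pair, one request.

    `fetch` is any callable that takes a URL and returns a parsed result.
    """
    return fetch(ld_pairwise_url(rsid_1, rsid_2))


def query_many(pairs, fetch, max_workers=4):
    """Multi-query mode: many variant pairs, requests issued in parallel.

    Results come back in the same order as `pairs`.
    """
    urls = [ld_pairwise_url(a, b) for a, b in pairs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

In practice `fetch` would wrap an HTTP GET that requests JSON and respects the server's rate limits; a thread pool is only one possible way to parallelise, chosen here because the work is network-bound.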




