Notebook-as-a-VRE (NaaVRE): from private notebooks to a collaborative cloud virtual research environment

11/24/2021
by Zhiming Zhao, et al.

Virtual Research Environments (VREs) provide user-centric support throughout the lifecycle of research activities, e.g., discovering and accessing research assets, or composing and executing application workflows. A typical VRE is implemented as an integrated environment that includes a catalog of research assets, a workflow management system, a data management framework, and tools for collaboration among users. Notebook environments, such as Jupyter, allow researchers to rapidly prototype scientific code and share their experiments as notebooks accessible online. Jupyter supports several languages popular among data scientists, such as Python, R, and Julia. However, such notebook environments lack seamless support for running heavy computations on remote infrastructure, or for finding and accessing software code inside notebooks. This paper investigates the gap between a notebook environment and a VRE and proposes an embedded VRE solution for the Jupyter environment called Notebook-as-a-VRE (NaaVRE). NaaVRE provides functional components via a component marketplace and allows users to create a customized VRE on top of the Jupyter environment. From the VRE, a user can search research assets (data, software, and algorithms), compose workflows, manage the lifecycle of an experiment, and share the results with other users in the community. We demonstrate how such a solution can enhance a legacy workflow that uses Light Detection and Ranging (LiDAR) data from country-wide airborne laser scanning surveys to derive geospatial data products of ecosystem structure at high resolution over broad spatial extents. This enables users to scale out the processing of multi-terabyte LiDAR point clouds for ecological applications to more data sources in a distributed cloud environment.

research
03/19/2018

Data provenance tracking as the basis for a biomedical virtual research environment

In complex data analyses it is increasingly important to capture informa...
research
05/04/2021

Architecture of a Flexible and Cost-Effective Remote Code Execution Engine

Oftentimes, there is a need to experiment with different programming lan...
research
03/31/2017

The Eclipse Integrated Computational Environment

Problems in modeling and simulation require significantly different work...
research
12/09/2019

Lightweight Container-based User Environment

Modern operating systems all support multi-users that users could share ...
research
04/11/2017

Toward a new approach for massive LiDAR data processing

Laser scanning (also known as Light Detection And Ranging) has been wide...
research
01/21/2021

Cloud-Based Content Cooperation System to Assist Collaborative Learning Environment

Online educational systems running on smart devices have the advantage o...
research
01/23/2018

CaosDB - Research Data Management for Complex, Changing, and Automated Research Workflows

Here we present CaosDB, a Research Data Management System (RDMS) designe...
