MORF: A Framework for MOOC Predictive Modeling and Replication At Scale

01/16/2018
by Josh Gardner, et al.

The MOOC Replication Framework (MORF) is a novel software system for feature extraction, model training/testing, and evaluation of predictive dropout models in Massive Open Online Courses (MOOCs). MORF makes large-scale replication of complex machine-learned models tractable and accessible for researchers, and enables public research on privacy-protected data. It does so by focusing on the high-level operations of an extract-train-test-evaluate workflow and by enabling researchers to encapsulate their implementations in portable, fully reproducible software containers, which are executed on data with a known schema. MORF's workflow lets researchers analyze data without giving them direct access to the underlying data, preserving privacy and data security. During execution, containers are sandboxed to maintain security and prevent data leakage, and are parallelized for efficiency, allowing researchers to create and test new models rapidly on large-scale, multi-institutional datasets that were previously inaccessible to most researchers. MORF is provided both as a Python API (the MORF Software), for institutions to use on their own MOOC data, and as a platform-as-a-service (PaaS) offering with a web API and a high-performance computing environment (the MORF Platform).

