Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining

06/13/2018
by Josh Gardner, et al.

The use of machine learning techniques has expanded in education research, driven by the rich data from digital learning environments and institutional data warehouses. However, replication of machine-learned models in the learning sciences is particularly challenging due to a confluence of experimental, methodological, and data barriers. We discuss the challenges of end-to-end machine learning replication in this context and present an open-source software toolkit, the MOOC Replication Framework (MORF), to address them. We demonstrate the use of MORF by conducting a replication at scale, and provide a complete executable container, with unique DOIs documenting the configurations of each individual trial, for replication or future extension at https://github.com/educational-technology-collective/fy2015-replication. This work demonstrates an approach to end-to-end machine learning replication that is relevant to any domain with large, complex or multi-format, privacy-protected data with a consistent schema.

