Making Machine Learning Datasets and Models FAIR for HPC: A Methodology and Case Study

11/03/2022
by Pei-Hung Lin, et al.

The FAIR Guiding Principles aim to improve the findability, accessibility, interoperability, and reusability of digital content by making it both human- and machine-actionable. However, these principles have not yet been broadly adopted in the domain of machine learning-based program analyses and optimizations for High-Performance Computing (HPC). In this paper, we design a methodology to make HPC datasets and machine learning models FAIR after investigating existing FAIRness assessment and improvement techniques. Our methodology includes a comprehensive, quantitative assessment of selected data, followed by concrete, actionable suggestions to improve FAIRness with respect to common issues related to persistent identifiers, rich metadata descriptions, and license and provenance information. Moreover, we select a representative training dataset to evaluate our methodology. The experiment shows that the methodology can effectively improve the FAIRness of the dataset and model from an initial score of 19.1
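
To make the assess-then-suggest idea concrete, here is a minimal sketch of a percentage-style FAIRness score paired with improvement hints. It is not the paper's assessment procedure. The `DatasetRecord` fields, the four checks, their equal weighting, and the three-field metadata threshold are all assumptions chosen for illustration, based only on the issue categories named in the abstract.

```python
from dataclasses import dataclass, field


# Hypothetical metadata record; field names are illustrative, not from the paper.
@dataclass
class DatasetRecord:
    persistent_id: str = ""
    metadata: dict = field(default_factory=dict)
    license: str = ""
    provenance: str = ""


# Each check pairs a pass/fail test with a suggestion shown when it fails.
# The checks and the ">= 3 fields" threshold are assumptions for this sketch.
CHECKS = [
    ("persistent identifier", lambda r: bool(r.persistent_id),
     "Register a persistent identifier such as a DOI."),
    ("rich metadata", lambda r: len(r.metadata) >= 3,
     "Describe the data with more metadata fields."),
    ("license", lambda r: bool(r.license),
     "Attach an explicit license."),
    ("provenance", lambda r: bool(r.provenance),
     "Record how the data was produced."),
]


def assess(record):
    """Return (score in percent, list of (failed check, suggestion))."""
    failed = [(name, hint) for name, test, hint in CHECKS if not test(record)]
    score = 100.0 * (len(CHECKS) - len(failed)) / len(CHECKS)
    return score, failed


# A record with one metadata field, then the same record after the hints are applied.
before = DatasetRecord(metadata={"title": "example"})
after = DatasetRecord(
    persistent_id="doi:10.0000/example",  # placeholder identifier
    metadata={"title": "example", "creator": "someone", "format": "csv"},
    license="CC-BY-4.0",
    provenance="generated by a hypothetical script",
)

for label, record in (("before", before), ("after", after)):
    score, failed = assess(record)
    print(label, score, [name for name, _ in failed])
```

Weighting every check equally keeps the score easy to interpret. A real assessment would likely use many more indicators, possibly with different weights.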

research
11/11/2021

Fair AutoML

We present an end-to-end automated machine learning system to find machi...
research
06/26/2023

LM4HPC: Towards Effective Language Model Application in High-Performance Computing

In recent years, language models (LMs), such as GPT-4, have been widely ...
research
07/08/2020

Whither Fair Clustering?

Within the relatively busy area of fair machine learning that has been d...
research
10/21/2021

MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

Scientific communities are increasingly adopting machine learning and de...
research
04/12/2023

Constructing a Searchable Knowledge Repository for FAIR Climate Data

The development of a knowledge repository for climate science data is a ...
research
08/28/2018

Investigating Human + Machine Complementarity for Recidivism Predictions

When might human input help (or not) when assessing risk in fairness-rel...
research
11/20/2019

Towards FAIR protocols and workflows: The OpenPREDICT case study

It is essential for the advancement of science that scientists and resea...
