Exploring Data Pipelines through the Process Lens: a Reference Model for Computer Vision

07/05/2021
by Agathe Balayn, et al.

Researchers have identified datasets used for training computer vision (CV) models as an important source of hazardous outcomes, and continue to examine popular CV datasets to expose their harms. These works tend to treat datasets as objects, or focus on particular steps in data production pipelines. We argue here that we could further systematize our analysis of harms by examining CV data pipelines through a process-oriented lens that captures the creation, evolution, and use of these datasets. As a step towards cultivating a process-oriented lens, we embarked on an empirical study of CV data pipelines informed by the field of method engineering. We present here a preliminary result: a reference model of CV data pipelines. Besides exploring the questions that this endeavor raises, we discuss how the process lens could support researchers in discovering understudied issues, and could help practitioners make their processes more transparent.

