Prespecification of Structure for Optimizing Data Collection and Research Transparency by Leveraging Conditional Independencies

03/24/2022
by Matthew J. Vowels, et al.

Data collection and research methodology represent a critical part of the research pipeline. On the one hand, it is important that we collect data in a way that maximises the validity of what we are measuring, which may involve the use of long scales with many items. On the other hand, collecting a large number of items across multiple scales results in participant fatigue and in expensive, time-consuming data collection. It is therefore important that we use the available resources optimally. In this work, we consider how attention to theory and the associated causal/structural model can help us to streamline data collection by not spending time on variables that are not causally critical for the subsequent analysis. This not only saves time and frees resources for variables that are more important, but also increases research transparency and the reliability of theory testing. To achieve this streamlined data collection, we leverage structural models and the Markov conditional independence structures implicit in them to identify the substructures that are critical for answering a particular research question. We review the relevant concepts and present a number of didactic examples, in the hope that psychologists can use these techniques to streamline their data collection without invalidating the subsequent analysis. We also provide simulation results to demonstrate the limited analytical impact of this streamlining.
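
As a rough illustration of the kind of conditional independence structure the abstract refers to, the sketch below computes the Markov blanket of one variable in a small directed acyclic graph. The graph, its variable names, and the helper `markov_blanket` are hypothetical and chosen only for illustration; this is not the authors' procedure. It relies on one standard fact: given its Markov blanket (parents, children, and the children's other parents), a node is conditionally independent of every other node in the graph.

```python
# Hypothetical DAG for illustration only (not taken from the paper).
# Each key maps to the list of its direct causes (parents).
parents = {
    "stress": [],
    "exercise": [],
    "sleep": ["stress"],
    "mood": ["sleep", "exercise"],
    "productivity": ["mood"],
    "income": ["productivity"],
}


def markov_blanket(node, parents):
    """Return the parents, children, and children's other parents of `node`."""
    children = [v for v, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])  # co-parents of each child
    blanket.discard(node)
    return blanket


target = "mood"
blanket = markov_blanket(target, parents)
# Variables that are conditionally independent of the target given its blanket.
outside_blanket = set(parents) - blanket - {target}

print(sorted(blanket))          # ['exercise', 'productivity', 'sleep']
print(sorted(outside_blanket))  # ['income', 'stress']
```

Under this assumed graph, once `sleep`, `exercise`, and `productivity` are measured, `stress` and `income` carry no further information about `mood`. Which variables can be dropped for a specific causal question depends on the estimand and on the structural model being correct, so this should be read as a sketch of the idea rather than a recipe.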


