A note on 'Collider bias undermines our understanding of COVID-19 disease risk and severity' and how causal Bayesian networks both expose and resolve the problem

05/18/2020
by Norman Fenton, et al.

An important recent preprint by Griffiths et al. highlights how 'collider bias' in studies of COVID-19 undermines our understanding of the disease risk and severity. This bias typically arises because the data are restricted to people who have undergone COVID-19 testing, among whom healthcare workers are overrepresented. For example, collider bias caused by smokers being underrepresented in the dataset can explain empirical results which claim that smoking reduces the risk of COVID-19. We extend the work of Griffiths et al. by making more explicit use of graphical causal models to interpret observed data. We show that their smoking example can be clarified and improved using Bayesian network models with realistic data and assumptions. We also show that there is an even more fundamental problem for a risk factor like 'stress', which, unlike smoking, is more rather than less prevalent among healthcare workers. In this case, because of a combination of collider bias from the biased dataset and the fact that 'healthcare worker' is a confounding variable, it is likely that studies will wrongly conclude that stress reduces rather than increases the risk of COVID-19. To avoid such erroneous conclusions, any analysis of observational data must take account of the underlying causal structure, including colliders and confounders. If analysts fail to do this explicitly, then any conclusions they draw about the effect of specific risk factors on COVID-19 are likely to be flawed if they are based only on data from people who have been tested.
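
The mechanism described in the abstract can be made concrete with a very small discrete Bayesian network. The sketch below is not the authors' model: the graph (healthcare worker H influences smoking S, infection C, and testing T; smoking influences C; infection influences T) is one reading of the abstract, and every probability in it is a made-up number chosen only for illustration. It computes COVID-19 risk exactly by enumeration, first in the whole population, then among tested people only, and finally among tested people stratified by healthcare-worker status.

```python
from itertools import product

# Every probability below is hypothetical and chosen only for illustration;
# none of these numbers comes from the paper or from real data.
P_H = 0.05                                      # P(healthcare worker)
P_S_GIVEN_H = {1: 0.05, 0: 0.20}                # P(smoker | H)
P_C_GIVEN_HS = {(0, 0): 0.020, (0, 1): 0.024,   # P(COVID-19 | H, S): smoking
                (1, 0): 0.050, (1, 1): 0.060}   # raises risk by 20% in each group
P_T_GIVEN_HC = {(0, 0): 0.02, (0, 1): 0.04,     # P(tested | H, C): testing is a
                (1, 0): 0.50, (1, 1): 0.90}     # collider on the path H -> T <- C


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1.0 - p


def joint(h, s, c, t):
    """Joint probability P(H=h, S=s, C=c, T=t) from the network factorisation."""
    return (bernoulli(P_H, h)
            * bernoulli(P_S_GIVEN_H[h], s)
            * bernoulli(P_C_GIVEN_HS[(h, s)], c)
            * bernoulli(P_T_GIVEN_HC[(h, c)], t))


def covid_risk(**given):
    """P(C = 1 | given), summing the joint over all states consistent with `given`."""
    num = den = 0.0
    for h, s, c, t in product((0, 1), repeat=4):
        state = {"h": h, "s": s, "c": c, "t": t}
        if all(state[name] == value for name, value in given.items()):
            p = joint(h, s, c, t)
            den += p
            num += p * c
    return num / den


# Crude risk by smoking status in the whole population.
population = {s: covid_risk(s=s) for s in (0, 1)}
# Crude risk by smoking status among tested people only (conditioning on T = 1).
tested = {s: covid_risk(s=s, t=1) for s in (0, 1)}
# Risk among tested people, stratified by healthcare-worker status.
tested_by_h = {(h, s): covid_risk(h=h, s=s, t=1)
               for h in (0, 1) for s in (0, 1)}

for label, table in (("population", population), ("tested only", tested)):
    print(f"{label:12s} non-smokers {table[0]:.4f}  smokers {table[1]:.4f}")
for h in (0, 1):
    print(f"tested, H={h}  non-smokers {tested_by_h[(h, 0)]:.4f}  "
          f"smokers {tested_by_h[(h, 1)]:.4f}")
```

With these illustrative numbers, smokers have the higher risk in the whole population, yet among tested people the crude comparison reverses and smoking looks protective. Stratifying the tested data by healthcare-worker status restores the correct direction in both strata, which is the sense in which the causal structure has to be taken into account. Different parameter choices can produce a different crude result, so the sketch shows that the reversal is possible, not that it must occur.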

research
07/16/2020

The role of collider bias in understanding statistics on racially biased policing

Contradictory conclusions have been made about whether unarmed blacks ar...
research
06/03/2021

Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

Causal models are notoriously difficult to validate because they make un...
research
04/18/2023

On clustering levels of a hierarchical categorical risk factor

Handling nominal covariates with a large number of categories is challen...
research
05/10/2021

An introduction to causal reasoning in health analytics

A data science task can be deemed as making sense of the data and/or tes...
research
06/04/2021

Odds Ratios are far from "portable": A call to use realistic models for effect variation in meta-analysis

Objective: Recently Doi et al. argued that risk ratios should be replace...
research
02/20/2021

Designing Experiments Informed by Observational Studies

The increasing availability of passively observed data has yielded a gro...
research
05/16/2018

Magnitude of selection bias in road safety epidemiology, a primer

In the field of road safety epidemiology, it is common to use responsibi...
